Skip to content

Digram Serialization

F. Conrads edited this page Sep 14, 2019 · 27 revisions

digram-ser.png

To serialize several Digrams we simply add the serialized Digrams to each other.

Complement of Edge Label Index 1

To set the start of a digram, we simply use the complement of the first edge label index. Hence to seperate two digrams, we can search for the next Integer with the leading bit. This is then the next ELI1 and the start of the next digram, thus setting the boundaries of the internals.

IAS

IAS is the Internal Array Size Flag which represents if the Digram has two or one internals.

  • 0 for 1 Internal Node
  • 1 for 2 Internal Nodes

Size Flag

The size flag represents the Number Class of the internals, if the internals are all smaller than the Maximum amount possible using a short, it would be a waste of space to save them as Integers. So the internals will be saved in the lowest number class all of them can be contained in.

  • 00 for byte
  • 01 for short
  • 10 for integer
  • 11 for long

External Index

An external index has 2 bits, as we exploit the fact that we only have 1 or 2 external indexes The indexes are one of [1,2,3,4] which can be represented as:

  • 1: 00
  • 2: 01
  • 3: 10
  • 4: 11

In the case that we only have one external Index, external Index 2 will be set to the same as external 1.

compression rate

In the following graph the compression ratio of this method in comparison to saving each Digram Occurences naively as a 24 byte object (N1 E1 N2 N3 E2 N4) is shown: this_approach/naive

digram-ratio.png

Example

A digram where the first node is an external and the last node, with the edges named :p1 and :p2 and their indexes 1 and 2 has the following occurences: :n1 :p1 :n2 ; :p2 :n3, :n4 :p1 :n5 ; :p2 :n6 will be saved as signed values (using | for seperation, spaces are only for representation of bytes wont be saved )

-1|2|-125|2|5

or as bits

11111111 11111111 11111111 11111111|00000000 00000000 00000000 00000010|10000011|00000010|00000101

or as hex

FF FF FF FF 00 00 00 02 83 02 05

Clone this wiki locally