Finalizing the UData serialization #326
-
Some initial feedback.
-
I'm still kind of confused... Okay, so based on the "Compact UData for a block message" section, the hash indeed doesn't have to be repeated. What's the point of explaining the non-compact udata serialization format anyway? When a block is mined, nodes exchange compact udata for the block. When an unconfirmed tx is produced, compact udata for the tx could be transmitted.
I mean, we're building a new system, so why not just make it right? For example, an amount can't be negative, right? And it's not related to consensus, so it might as well use a uint.
That's what I thought, so I was surprised it's not reflected anywhere in this serialization. Perhaps you depend on the regular Bitcoin Core message ordering in that case?
Yeah, I guess I see where you're coming from. This is not critical anyway. I just thought it was a typo, but yeah, it's not.
-
I see now, yeah. Overall, I have no more comments, and I hope this discussion was of some help. It would be useful to revisit this once I've caught up on the utreexo context :)
-
I don't think we need the unconfirmed marker, as nodes can check if something is unconfirmed by looking it up in their mempool. You compute an input skip list for a transaction by checking which TxIns come from the mempool. That way you can tell which leaf data belongs to which input, allowing you to only send the required/confirmed leaves. Orphans are detected if the size of the skip list plus the number of provided leaves does not match the total number of TxIns.
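A minimal sketch of that skip-list idea in Go, to make the matching rule concrete. Everything here is hypothetical: `OutPoint`, `SkipList`, and `AssignLeaves` are illustration names, the mempool is modeled as a plain set of txids, and leaf data is treated as opaque bytes.

```go
package main

import "fmt"

// OutPoint is a simplified stand-in for a Bitcoin outpoint (hypothetical type).
type OutPoint struct {
	TxID  string
	Index uint32
}

// SkipList returns the positions of the inputs whose parent transaction is
// in the local mempool, i.e. the inputs that need no leaf data.
func SkipList(inputs []OutPoint, mempool map[string]bool) []int {
	var skip []int
	for i, in := range inputs {
		if mempool[in.TxID] {
			skip = append(skip, i)
		}
	}
	return skip
}

// AssignLeaves pairs the provided leaves, in order, with the inputs that are
// not in the skip list. It reports false when the counts do not add up,
// which is the orphan condition described above.
func AssignLeaves(nInputs int, skip []int, leaves [][]byte) (map[int][]byte, bool) {
	if len(skip)+len(leaves) != nInputs {
		return nil, false
	}
	skipped := make(map[int]bool, len(skip))
	for _, i := range skip {
		skipped[i] = true
	}
	byInput := make(map[int][]byte, len(leaves))
	next := 0
	for i := 0; i < nInputs; i++ {
		if skipped[i] {
			continue
		}
		if next >= len(leaves) {
			return nil, false // malformed skip list
		}
		byInput[i] = leaves[next]
		next++
	}
	return byInput, true
}

func main() {
	inputs := []OutPoint{{"a", 0}, {"b", 1}, {"c", 0}}
	mempool := map[string]bool{"b": true} // "b" is unconfirmed
	skip := SkipList(inputs, mempool)
	byInput, ok := AssignLeaves(len(inputs), skip, [][]byte{{0x01}, {0x02}})
	fmt.Println(skip, len(byInput), ok)
}
```

With this approach the sender transmits only the confirmed leaves, and the receiver derives the mapping from its own mempool instead of from a per-input marker.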
-
Just make the length of the accumulator proof 0 for unconfirmed spends. It cannot be non-zero anyway.
-
This is a first attempt at finalizing the serialization for UData to work towards
a spec for Utreexo. Looking for reviews.
UData Serialization
UData just stands for 'Utreexo Data' and includes all the information
that is needed by a verifying node to verify a Bitcoin block with only
the headers and the Utreexo accumulator roots.
The data that is needed is essentially just three things:
1. Accumulator proof for the UTXOs that each TxIn in the block is referencing.
2. Leaf data to verify bitcoin spending conditions.
3. TXO time-to-live values for caching.
The serialized format is:
All together, the udata serialization looks like so:
Each of these elements follows its own serialization format, which is defined below.
Accumulator Proof Serialization
The accumulator proof is called BatchProof in package accumulator, and its serialization format is:
The batchproof serialization looks like so:
Leaf Data Serialization
Leaf data is essentially the revXXXXX.dat block data with the exclusion of
same-block spends (saves ~20%). The BlockHash and outpoint are included on top of
the revXXXXX.dat block data, as those also must be included in the hash
commitment that is added to the accumulator.
The serialization format is:
The outpoint serialized format is:
The serialized header code format is:
bit 0 - containing transaction is a coinbase
bits 1-x - height of the block that contains the spent txout
It's calculated with:
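As a hedged illustration of the bit layout listed above (bit 0 is the coinbase flag, the remaining bits are the height), here is a small Go sketch. It assumes the same packing Bitcoin Core uses for its undo data, height * 2 + coinbase; the function names are hypothetical and are not taken from the utreexo codebase.

```go
package main

import "fmt"

// HeaderCode packs the block height and the coinbase flag into one integer:
// bit 0 holds the coinbase flag, the remaining bits hold the height.
// The height must fit in 31 bits for the shift not to overflow.
func HeaderCode(height uint32, isCoinbase bool) uint32 {
	code := height << 1
	if isCoinbase {
		code |= 1
	}
	return code
}

// ParseHeaderCode is the inverse of HeaderCode.
func ParseHeaderCode(code uint32) (height uint32, isCoinbase bool) {
	return code >> 1, code&1 == 1
}

func main() {
	code := HeaderCode(650000, true)
	height, isCoinbase := ParseHeaderCode(code)
	fmt.Println(code, height, isCoinbase)
}
```

This sketch uses an unsigned height, so it does not model a sentinel such as height = -1; an encoder that relies on that sentinel would have to handle it before packing.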
All together, the serialization looks like so:
TXO Time-To-Live Value Serialization
The txo time-to-live values are how long each txo lasts until it is spent. This
information is needed for caching and saves a massive amount of bandwidth when
a peer uses the ttl values for caching.
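A rough sketch of how a peer might use these values, under two assumptions that are not stated above: the ttl is measured in blocks (spend height minus creation height), and the caching policy is a simple threshold. `TTL`, `ShouldCache`, and the `lookahead` parameter are hypothetical names for illustration only.

```go
package main

import "fmt"

// TTL returns the number of blocks between a txo's creation and its spend.
// Measuring the ttl in blocks is an assumption of this sketch.
func TTL(createHeight, spendHeight uint32) (uint32, error) {
	if spendHeight < createHeight {
		return 0, fmt.Errorf("spend height %d is below creation height %d",
			spendHeight, createHeight)
	}
	return spendHeight - createHeight, nil
}

// ShouldCache is a hypothetical policy: keep a txo in the local cache when
// it will be spent within the next `lookahead` blocks, so that no proof has
// to be downloaded for it later.
func ShouldCache(ttl, lookahead uint32) bool {
	return ttl <= lookahead
}

func main() {
	ttl, err := TTL(700000, 700005)
	if err != nil {
		panic(err)
	}
	fmt.Println(ttl, ShouldCache(ttl, 10))
}
```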
The serialization format is:
The serialization looks like so:
Compact UData Serialization
Compact UData serialization includes only the data that a utreexo node is
missing to verify a block or a tx with only the utreexo roots. The
compact serialization leaves out data that a node can fetch locally,
saving bandwidth and storage space.
Note that compact UData serialization differs for a block message and for a
transaction message. This is because transaction messages may reference TXOs
that are not yet included in a block. If a transaction is not included in a block,
there is no accumulator proof for it.
Because of this, each serialization differs to optimize bandwidth savings.
Compact UData Serialization for a block message.
The compact UData serialization for a block is the same as a normal udata
serialization except for the fact that it uses the compact leaf data serialization.
The serialized format for a block is:
Serialization looks like so:
Note that this information is essentially the same as what's included in the
revXXXXX.dat block (except for the removal of same block spends).
Compact UData Serialization for a transaction message.
Transaction messages may reference inputs that are not yet included in a block (e.g. CPFP txs).
This results in some inputs not needing any UData. For these, we just replace the UData with a
single-byte unconfirmed marker.
The serialized format for a transaction is:
All other fields, with the exception of the 'unconfirmed marker', are the same as
in the serialization for a block. The unconfirmed marker is represented in
the struct as height = -1.
We need this unconfirmed marker because without it the receiver of the UData
won't know which accumulator proof/leaf data is for which TxIn. For example, if
we have a transaction with 3 inputs, one of which references a transaction not
yet confirmed in a block, then we will only have 2 proofs/leaf datas.
However, a Compact State Node won't know which of the 3 TxIns is unconfirmed.
We don't include the outpoint in the compact leaf data serialization, so there's
no way to tell. So the solution is to force the number of accumulator
proof/leaf data entries to equal the number of TxIns and to require that they be
sent in the same order as the TxIns. If a UTXO being referenced is unconfirmed,
then it will have an unconfirmed marker of 0x1. If the UTXO being referenced is
confirmed, then it will have an unconfirmed marker of 0x0 with the actual data
following it.
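To make the marker scheme concrete, here is a minimal Go sketch of the per-input layout: one marker byte per TxIn, in TxIn order, with data following only for confirmed inputs. The marker values 0x0 and 0x1 come from the description above. Everything else is an assumption for illustration: the type and function names are hypothetical, the leaf data is treated as opaque bytes, and the 1-byte length prefix exists only so the sketch can be decoded without modeling the real compact leaf data format.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	markerConfirmed   byte = 0x0 // actual data follows
	markerUnconfirmed byte = 0x1 // nothing follows
)

// InputUData is a hypothetical per-TxIn record.
type InputUData struct {
	Unconfirmed bool
	Leaf        []byte // opaque compact leaf data (real format not modeled)
}

// EncodeInputs writes one entry per TxIn, in TxIn order.
func EncodeInputs(ins []InputUData) ([]byte, error) {
	var out []byte
	for _, in := range ins {
		if in.Unconfirmed {
			out = append(out, markerUnconfirmed)
			continue
		}
		if len(in.Leaf) > 255 {
			return nil, errors.New("leaf too long for this sketch")
		}
		// The length prefix is an assumption of this sketch only.
		out = append(out, markerConfirmed, byte(len(in.Leaf)))
		out = append(out, in.Leaf...)
	}
	return out, nil
}

// DecodeInputs reads exactly txInCount entries and rejects leftovers.
func DecodeInputs(buf []byte, txInCount int) ([]InputUData, error) {
	ins := make([]InputUData, 0, txInCount)
	for i := 0; i < txInCount; i++ {
		if len(buf) < 1 {
			return nil, errors.New("missing marker")
		}
		marker := buf[0]
		buf = buf[1:]
		switch marker {
		case markerUnconfirmed:
			ins = append(ins, InputUData{Unconfirmed: true})
		case markerConfirmed:
			if len(buf) < 1 || len(buf) < 1+int(buf[0]) {
				return nil, errors.New("truncated leaf data")
			}
			n := int(buf[0])
			leaf := append([]byte(nil), buf[1:1+n]...)
			ins = append(ins, InputUData{Leaf: leaf})
			buf = buf[1+n:]
		default:
			return nil, fmt.Errorf("unknown marker 0x%x", marker)
		}
	}
	if len(buf) != 0 {
		return nil, errors.New("trailing bytes after last TxIn")
	}
	return ins, nil
}

func main() {
	// The 3-input example from the text: the second input is unconfirmed.
	ins := []InputUData{{Leaf: []byte{0xaa, 0xbb}}, {Unconfirmed: true}, {Leaf: []byte{0xcc}}}
	buf, err := EncodeInputs(ins)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", buf)
}
```

Because every TxIn gets an entry, the receiver can pair entries with inputs by position alone, at a cost of one byte per input.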