-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XDM with MMR proofs: 1 #2508
XDM with MMR proofs: 1 #2508
Conversation
2f2ace6
to
5ee10ec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the idea here, but I do not believe host functions are necessary. Left some other minor comments and didn't review last 3 commits at all.
crates/pallet-domains/src/staking.rs
Outdated
@@ -1231,9 +1233,10 @@ where | |||
|
|||
#[cfg(test)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not about this PR, but tests should generally be in their own module. It is a better dev experience (library doesn't recompile when tests change) and cleaner overall.
opaque_leaf: EncodableOpaqueLeaf, | ||
proof: Proof<mmr::Hash>, | ||
) -> Option<Hash> { | ||
let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it is written works and is correct, but I find placing types where they come from rather than making compiler infer them backwards easier to read, looks more linear:
let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?; | |
let leaf = opaque_leaf.into_opaque_leaf().try_decode::<mmr::Leaf>()?; |
There are also other places like this in the code.
This is similarly to let x = X::from(y)
being more readable than let x: X = y.into()
, just try to read it out loud, it is harder to do smoothly with .into()
. In most cases : Type
is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm that is fair. For me this approach always made a lot of sense from reading the perspective since I know what is the final type that is being assigned to the variable instead of going through till the end to know what this type is.
This is similarly to let x = X::from(y) being more readable than let x: X = y.into(),
This I agree since the position of the final type is same in both the scenarios and readability wise, its much more cleaner to read.
Having said that, I dont have a strong opinion but rather a personal preference
let leaf: MmrLeaf<ConsensusBlockNumber, ConsensusBlockHash> = | ||
opaque_leaf.into_opaque_leaf().try_decode()?; | ||
let state_root = leaf.state_root(); | ||
verify_mmr_proof(vec![EncodableOpaqueLeaf::from_leaf(&leaf)], proof.encode()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using Mmr::verify_leaves
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is domain runtime and they do not have access to Mmr::verify_leaves
unlike Consensus chain runtime. Hence, we have the host functions to verify such proofs for domains
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not have right now, but what prevents us from including it as a dependency in domain runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely follow you here. I have explained above in other comment why we I chose to go this route of using host function to verify the proof
StorageKeyRequest::ConfirmedDomainBlockStorageKey(domain_id) => runtime_api | ||
.confirmed_domain_block_storage_key(best_hash, domain_id) | ||
.map(Some), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand this host function either, comment is the same as for previous hash functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This host function dates back some time when we had a discussion on assumed storage keys where we were defining the mocked storages and deriving their storage keys to verify the storage proof.
Here in messenger, when a message comes from Domain_a to Domain_b, previously Domain_b was deriving the storage key from the its own storage. This made to have the pallet messenger definition same across all the different runtimes.
What this host function does here is, Domain_b uses the the Domain_a runtime from the consensus chain and then use its Stateless API to get the storage key. This approach does not hold us back with such strict assumptions where pallets are defined exactly the same across runtimes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the benefits of decoupling, but I'm confused about what this has to do with domains, this seems to call consensus client runtime API 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the domain_proof
is a storage proof derived from LatestConfirmedDomainBlock
on consensus runtime. We are using consensus runtime api to get that storage key while verifying the domain_proof
and the use src domain runtime to fetch the Outbox
and InboxResponse
storage keys to complete the XDM message verification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not believe host functions are necessary.
I tried to explain why the host functions are required over the comments itself but we can do a Sync on this PR specifically if it helps you
@@ -56,4 +63,16 @@ where | |||
extrinsics_root: H256::from_slice(header.extrinsics_root().as_ref()), | |||
}) | |||
} | |||
|
|||
fn verify_mmr_proof(&self, leaves: Vec<EncodableOpaqueLeaf>, encoded_proof: Vec<u8>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why host function is needed for proof generation, but why is it needed for proof verification and why for proof verification you're calling runtime API anyway?
Okay I see the confusion here. I have decided to re-use this implementation of extension for Domain host function. See this definition DomainMmrRuntimeInterface
. The proof verification for consensus chain is directly done on the consensus chain itself. But the proof verification for Domains on the other hand would require host function that uses consensus chain runtime_api to do such verification.
@@ -56,4 +63,16 @@ where | |||
extrinsics_root: H256::from_slice(header.extrinsics_root().as_ref()), | |||
}) | |||
} | |||
|
|||
fn verify_mmr_proof(&self, leaves: Vec<EncodableOpaqueLeaf>, encoded_proof: Vec<u8>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, you're requesting something about best block, which may not even belong to the same fork as the block that made host function call.
For MMR, the proof is strictly self contained in the sense we do not need to use the same consensus chain hash that generated the proof. The reason for such is the proof hold the total leaf count at the time of proof generation. So while verifying, the mmr root is calculated with that leaf count since it is strictly appending. Also, the leaves before a block is finalized will be stored in fork aware fashion and runtime picks what it assumes to be the canonical fork while verifying. So proof will either be valid or invalid.
opaque_leaf: EncodableOpaqueLeaf, | ||
proof: Proof<mmr::Hash>, | ||
) -> Option<Hash> { | ||
let leaf: mmr::Leaf = opaque_leaf.into_opaque_leaf().try_decode()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm that is fair. For me this approach always made a lot of sense from reading the perspective since I know what is the final type that is being assigned to the variable instead of going through till the end to know what this type is.
This is similarly to let x = X::from(y) being more readable than let x: X = y.into(),
This I agree since the position of the final type is same in both the scenarios and readability wise, its much more cleaner to read.
Having said that, I dont have a strong opinion but rather a personal preference
StorageKeyRequest::ConfirmedDomainBlockStorageKey(domain_id) => runtime_api | ||
.confirmed_domain_block_storage_key(best_hash, domain_id) | ||
.map(Some), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This host function dates back some time when we had a discussion on assumed storage keys where we were defining the mocked storages and deriving their storage keys to verify the storage proof.
Here in messenger, when a message comes from Domain_a to Domain_b, previously Domain_b was deriving the storage key from the its own storage. This made to have the pallet messenger definition same across all the different runtimes.
What this host function does here is, Domain_b uses the the Domain_a runtime from the consensus chain and then use its Stateless API to get the storage key. This approach does not hold us back with such strict assumptions where pallets are defined exactly the same across runtimes
let leaf: MmrLeaf<ConsensusBlockNumber, ConsensusBlockHash> = | ||
opaque_leaf.into_opaque_leaf().try_decode()?; | ||
let state_root = leaf.state_root(); | ||
verify_mmr_proof(vec![EncodableOpaqueLeaf::from_leaf(&leaf)], proof.encode()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is domain runtime and they do not have access to Mmr::verify_leaves
unlike Consensus chain runtime. Hence, we have the host functions to verify such proofs for domains
Seems like this PR might have broken the operator tests though I'm not sure if that is the case yet. For now cancelling these CI runs until I can figure out what is what |
@nazar-pc I was just verifying how the
We decided to pick block #10 on gemini-3h and as expected, the Now coming to the crust of the PR and our discussion earlier, we can make the mmr verification stateless by storing the mmr roots on the consensus runtime and prune them on a rolling window, like you pointed out there are some edge cases when it comes to domains where they would need to access such mmr roots to verify these proofs which might not be available when they are required if for example domain halted for long enough period while consensus chain continued to progress. Overall, looks like we would need some form of time limit for XDM by which they should included but if they are not, then that particular channel is completely halted. There will be way to unhalt it by submitting some extension proofs but have not discussed or spec'ed this out yet. I'll try to put some more points again looking at other options but at the moment, I feel the usage of host functions to fetch the offchain data for verification seems more viable. |
Hm... this is very problematic then 😞 We'll need to fix it. And this also explains a lot, thanks!
This was not intentional, but there is no API in Substrate to deliberately prune things we want to prune, rather than everything at a fixed distance from the tip of the chain.
I think time limit is necessary either way due to the fact that we are pruning MMR data in offchain storage anyway and we will prune state once we can as well. |
I have spent some time to understand how MMR proof generation and verification work in more detail and catch up with the above discussion. I think host function is inevitable for verifying XDM on the domain, because the dest domain has to verify the XDM comes from a confirmed block of the src domain, and only the consensus chain have knowledge about whether a domain block (or ER) is confirmed thus domain have to query from the consensus chain when verifying the XDM.
I'm not sure the whole idea of storing multiple MMR roots on the consensus runtime but it feels like we can archive the same by using an old consensus hash (instead of the best hash) to call the consensus runtime API and get the expected MMR root. |
You are making an assumption here that there exists the state for such old hash for every consensus node that is currently importing / doing state transition. This is not required to be true and current
True but we can make this change specifically for MMR offchain rather than relying on state to be available since this would be self-contained unlike dependency on state to be available |
These todos are cleared in the next coming commits
This will be used by the domains to verify MMR proof
… messenger xdm verification.
…ead rely on on Confirmed domain blocks Also fixed some WASM sp-std and missing exclusion of default features
…aud proof verifier to use the extensions
@nazar-pc while tracking MMR roots might be one possible solution, doing such introduces bunch of edge cases, some we already know and some we might not have known yet, and also defeats the purpose of introducing MMR since if we are tracking those roots, then we would rather just track state_roots instead and completely eliminate MMR for XDM. Personally, not super excited about this though. As for the expiration time for XDM, it would still remain the same and we just need to prune the offchain leaves. I initially thought it would be invasive but it actually quite straightforward to prune a leaf and this is self contained would not need to relay on state or block body for any verification. I have updated the PR with review feedback earlier. PTAL and let me know your thoughts ? |
I believe MMR is useful beyond immediate need of XDM. While I also think storing roots in the runtime is preferred comparing to the client (might allow for some interesting proofs in the future), it is not a blocker. Please don't let this PR be blocked by me, @NingLin-P can you take a look here, please? |
Conflicts are not a blocker for PR review. Will fix conflicts when in a moment. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense overall, but the way force pushes were done after initial review with zero explanation (especially the first one) makes it really hard to understand what has actually changed since last time I reviewed earlier version of this PR.
domains/service/Cargo.toml
Outdated
sp-transaction-pool = { version = "4.0.0-dev", git = "https://github.com/subspace/polkadot-sdk", rev = "d6b500960579d73c43fc4ef550b703acfa61c4c8" } | ||
subspace-core-primitives = { version = "0.1.0", path = "../../crates/subspace-core-primitives" } | ||
subspace-runtime-primitives = { version = "0.1.0", path = "../../crates/subspace-runtime-primitives" } | ||
subspace-service = { version = "0.1.0", path = "../../crates/subspace-service" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like this change. This adds a huge subspace-service
to domains-service
significantly reducing compilation parallelism and likely increasing compilation time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is fair. If not this, then I would have to introduce another crate just for Domain specific host functions and extension factory. Did not feel justified enough but if that is preferrable, I can do that instead 👍🏼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the way force pushes were done after initial review with zero explanation (especially the first one) makes it really hard to understand what has actually changed since last time I reviewed earlier version of this PR
Apologies for no explanation. I picked out some commits, specifically first one, in a seperate PR that was already landed in main. Github's diff was all messed up when I initially went with a merge so I ended rebasing it for a cleaner diff while taking the opportunity to address the feedback provided earlier.
domains/service/Cargo.toml
Outdated
sp-transaction-pool = { version = "4.0.0-dev", git = "https://github.com/subspace/polkadot-sdk", rev = "d6b500960579d73c43fc4ef550b703acfa61c4c8" } | ||
subspace-core-primitives = { version = "0.1.0", path = "../../crates/subspace-core-primitives" } | ||
subspace-runtime-primitives = { version = "0.1.0", path = "../../crates/subspace-runtime-primitives" } | ||
subspace-service = { version = "0.1.0", path = "../../crates/subspace-service" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is fair. If not this, then I would have to introduce another crate just for Domain specific host functions and extension factory. Did not feel justified enough but if that is preferrable, I can do that instead 👍🏼
It was typically done in the service file, but I see that you included it in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes make sense, but I'd like to avoid re-exports both in subspace-service
and in domains-service
.
I'm not sure why that is preferable here since this is mono-repo crate and would not have any issues with semvar bumps unlike the external crate but I don't have a strong stance on this. Should be updated now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why that is preferable here since this is mono-repo crate and would not have any issues with semvar bumps unlike the external crate but I don't have a strong stance on this. Should be updated now
That way it is clearer what are explicit dependencies, also public re-exports sometimes result in dependencies that remain even when no longer used simply due to public re-export. IMO it is a good default, the only exception I'm aware of in our repo is the whole libp2p
re-exported from subspace-networking
, but arguably it makes a lot of sense there.
BlockInfo { | ||
block_number: number, | ||
block_hash: hash, | ||
// TODO: Derive correct domain proof |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there tracking for this TODO? please put a link if so. if not - well there should be :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is todo is already resolved as part of the next PR
more here - https://github.com/subspace/subspace/pull/2562/files#diff-f6f9310595b3fedfb829530fe9971b90398313a38566a75f08479f8518db178fL384
LGTM on the surface but we will review XDM once the implementation is stable. |
This is the first of multiple PRs that updates XDM protocol to use MMR proofs for verification. This PR brings more if not all the type changes required for complete migration to MMR proofs. The goal was to introduce new types and add necessary infra to verify and generate the XDM proofs while the Domains are not yet enabled..
This PR also handles the verification part of the XDM but I did not update the tests yet since plan was to do focus on the infra.
Some notes:
sp-messenger
due to cyclic depenedencies and changing that would be too invasive which I'm not quiet excited about yet. Once we have initial changes in, we can think of refactoring if need arises.Overall, recommend going commit by commit for a better reviewing expereince
Removed an existing now redundant storage
Code contributor checklist: