Add APIs for block pruning manually #1570
Comments
This would be great. I think it is a good UX improvement to allow full nodes to run with all 3 pruning types set, so that they occupy a more or less constant amount of space on disk.
We have a downstream issue at Subspace (autonomys/subspace#2114) and might be able to sponsor work on this if someone is interested.
I'm specifically interested in an API where clients can issue a 'safe-to-prune' command over RPC to advance which blocks are pruned. This would allow reclaiming disk space after indexing while still guaranteeing the ability to perform indexing. If this issue is meant to focus on header pruning or a time-based prune (as opposed to the current block-count-based prune), I'd be happy to open a new issue, but this sounds close enough that it may be best to simply tag in here. (I hadn't opened an issue before because it seemed easy enough that I'd just write an impl myself at some point. I'm commenting now because this sounds like a prerequisite that I assumed was already available but, judging by this issue's existence, isn't.)
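To make the 'safe-to-prune' idea concrete, here is a minimal sketch of the bookkeeping such a command might need. Everything in it is hypothetical: `SafeToPrune`, `advance` and `prunable_up_to` are illustrative names, not polkadot-sdk APIs, and the rule that pruning must stay strictly behind the finalized block is an assumption.

```rust
/// Hypothetical tracker for a client-driven "safe-to-prune" height.
/// None of these names exist in polkadot-sdk; they are illustrative only.
#[derive(Debug, Default)]
pub struct SafeToPrune {
    /// Highest block number the client has declared safe to prune.
    acknowledged: u64,
}

impl SafeToPrune {
    /// Record a client request. The watermark only moves forward, so a
    /// stale or repeated request is ignored. Returns whether it advanced.
    pub fn advance(&mut self, up_to: u64) -> bool {
        if up_to > self.acknowledged {
            self.acknowledged = up_to;
            true
        } else {
            false
        }
    }

    /// Highest block number the node may prune right now: never beyond what
    /// the client acknowledged, and (by assumption) never at or past the
    /// finalized block.
    pub fn prunable_up_to(&self, finalized: u64) -> u64 {
        self.acknowledged.min(finalized.saturating_sub(1))
    }
}
```

An indexer would call `advance` after it has processed a block range; the node would then prune up to whatever `prunable_up_to` reports for the current finalized height.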
FYI @shamil-gadelshin on the Subspace team will be working on this once he is back after the holidays.
My current plan is to introduce a CLI flag similar to the ones for state and block pruning. I'm going to remove data from the db and from the in-memory header and header metadata caches. The current fork calculation and pruning depend on the block headers of already pruned blocks, so I'm going to add a temporary in-memory cache for "block headers marked for pruning" and prune them with some delay.
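A rough sketch of what a cache of "block headers marked for pruning" with delayed removal could look like. This is not the proposed implementation: `MarkedHeaders`, the `u64` stand-in for block hashes, and the `delay` parameter are all assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory queue of headers marked for pruning.
/// Block hashes are plain u64 values here to keep the sketch self-contained.
#[derive(Debug, Default)]
pub struct MarkedHeaders {
    /// (block number, block hash), in the order they were marked.
    queue: VecDeque<(u64, u64)>,
}

impl MarkedHeaders {
    /// Mark a header for pruning; it stays readable until it is drained.
    pub fn mark(&mut self, number: u64, hash: u64) {
        self.queue.push_back((number, hash));
    }

    /// Whether a header is still held back (for example for fork calculation).
    pub fn contains(&self, hash: u64) -> bool {
        self.queue.iter().any(|&(_, h)| h == hash)
    }

    /// Remove and return every marked header that is at least `delay`
    /// blocks behind `finalized`. Assumes headers were marked in
    /// ascending block-number order.
    pub fn drain_ready(&mut self, finalized: u64, delay: u64) -> Vec<(u64, u64)> {
        let mut ready = Vec::new();
        while let Some(&(number, hash)) = self.queue.front() {
            if number.saturating_add(delay) > finalized {
                break;
            }
            self.queue.pop_front();
            ready.push((number, hash));
        }
        ready
    }
}
```

Because the queue lives only in memory, its contents are lost when the process exits, so a real design would have to decide what happens to marked headers across a restart.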
What if the node restarts?
There are several non-exclusive options here:
Which code are you talking about?
Both of these options are a no-go.
Shamil starts with implementation of |
I mean, I know what the issue is about :P I meant specifically the stuff about determining the fork and requiring pruned blocks.
polkadot-sdk/substrate/client/api/src/leaves.rs Lines 157 to 160 in 61be78c
Ahh, this code :P Yeah, I think we should either not delete/prune anything in this 1 block behind the last finalized block, or remove this requirement. But coming up with some new data structure to keep these headers around sounds weird to me.
Guys, please, have a look when you have time: #3033
A friendly ping: #3033
This issue has been mentioned on Polkadot Forum. There might be relevant details there: https://forum.polkadot.network/t/block-header-pruning/7198/1
This PR changes the fork calculation and pruning algorithm to enable future block header pruning. It's required because the previous algorithm relied on the block header persistence. It follows the [related discussion](#1570).

The previous code contained this comment describing the situation:

```
/// Note a block height finalized, displacing all leaves with number less than the finalized
/// block's.
///
/// Although it would be more technically correct to also prune out leaves at the
/// same number as the finalized block, but with different hashes, the current behavior
/// is simpler and our assumptions about how finalization works means that those leaves
/// will be pruned soon afterwards anyway.
pub fn finalize_height(&mut self, number: N) -> FinalizationOutcome<H, N> {
```

The previous algorithm relied on the existing block headers to prune forks later. To enable block header pruning, we need to clear all obsolete forks right after block finalization so that we no longer depend on the related block headers in the future.

---------

Co-authored-by: Bastian Köcher <[email protected]>
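The distinction drawn in the quoted comment can be illustrated with a simplified leaf set. This is a sketch under assumptions, not the code from the PR: `Leaves` and `finalize_height_strict` are hypothetical, hashes are plain `u64` values, and stale leaves above the finalized height (which need ancestry checks) are not handled.

```rust
use std::collections::BTreeMap;

/// Simplified leaf set: block number -> hashes of the leaves at that height.
/// Hashes are u64 values here; this is not the real `LeafSet` type.
#[derive(Debug, Default)]
pub struct Leaves {
    by_number: BTreeMap<u64, Vec<u64>>,
}

/// Flatten a number -> hashes map into (number, hash) pairs, ascending by number.
fn flatten(map: BTreeMap<u64, Vec<u64>>) -> Vec<(u64, u64)> {
    map.into_iter()
        .flat_map(|(number, hashes)| hashes.into_iter().map(move |h| (number, h)))
        .collect()
}

impl Leaves {
    pub fn insert(&mut self, number: u64, hash: u64) {
        self.by_number.entry(number).or_default().push(hash);
    }

    /// Behaviour described by the quoted comment: displace only the leaves
    /// strictly below the finalized height.
    pub fn finalize_height(&mut self, number: u64) -> Vec<(u64, u64)> {
        // `split_off` keeps keys < number in `self.by_number` and returns the rest.
        let kept = self.by_number.split_off(&number);
        let displaced = std::mem::replace(&mut self.by_number, kept);
        flatten(displaced)
    }

    /// Stricter variant: additionally displace leaves at the finalized
    /// height whose hash differs from the finalized block's hash.
    pub fn finalize_height_strict(&mut self, number: u64, finalized_hash: u64) -> Vec<(u64, u64)> {
        let mut displaced = self.finalize_height(number);
        let mut emptied = false;
        if let Some(hashes) = self.by_number.get_mut(&number) {
            hashes.retain(|&h| {
                let keep = h == finalized_hash;
                if !keep {
                    displaced.push((number, h));
                }
                keep
            });
            emptied = hashes.is_empty();
        }
        if emptied {
            self.by_number.remove(&number);
        }
        displaced
    }
}
```

With the strict variant, a competing leaf at the finalized height is dropped immediately instead of lingering until a later finalization, which is the property header pruning relies on.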
As discussed in paritytech/substrate#14758, there are currently APIs to finalize blocks (`Finalizer::apply_finality()` and `Finalizer::finalize_block()`), and block pruning happens at a fixed offset from that, but we have a use case in Subspace where we want that offset to be dynamic.

Essentially we need an API to prune blocks manually and decouple it from block finalization (finalized blocks will be much newer than pruned blocks). The same for state.

While at it, an API to prune headers and a `--headers-pruning` CLI argument would both be nice to have as well, since headers are currently not pruned, which prevents a node from occupying a bounded amount of space.
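For illustration only, here is a toy model of pruning that is decoupled from finalization. `ToyStore`, `PruneError`, `prune_bodies` and `prune_headers` are made-up names rather than existing client APIs, and the check that pruning stays strictly below the finalized block is an assumption.

```rust
use std::collections::BTreeMap;

/// Error a hypothetical manual-pruning call could return.
#[derive(Debug, PartialEq, Eq)]
pub enum PruneError {
    /// The requested height is not strictly below the finalized block.
    NotFinalized,
}

/// Toy block store keyed by block number. Bodies and headers are kept
/// separately so each can be pruned on its own schedule.
#[derive(Debug, Default)]
pub struct ToyStore {
    finalized: u64,
    bodies: BTreeMap<u64, Vec<u8>>,
    headers: BTreeMap<u64, Vec<u8>>,
}

/// Remove all entries with number <= `up_to`; returns how many were removed.
fn remove_up_to(map: &mut BTreeMap<u64, Vec<u8>>, up_to: u64) -> usize {
    let before = map.len();
    map.retain(|&number, _| number > up_to);
    before - map.len()
}

impl ToyStore {
    pub fn import(&mut self, number: u64, header: Vec<u8>, body: Vec<u8>) {
        self.headers.insert(number, header);
        self.bodies.insert(number, body);
    }

    /// Finalization only moves the finalized marker; it prunes nothing.
    pub fn finalize(&mut self, number: u64) {
        self.finalized = self.finalized.max(number);
    }

    /// Manually prune block bodies up to and including `up_to`.
    pub fn prune_bodies(&mut self, up_to: u64) -> Result<usize, PruneError> {
        self.check(up_to)?;
        Ok(remove_up_to(&mut self.bodies, up_to))
    }

    /// Manually prune block headers up to and including `up_to`.
    pub fn prune_headers(&mut self, up_to: u64) -> Result<usize, PruneError> {
        self.check(up_to)?;
        Ok(remove_up_to(&mut self.headers, up_to))
    }

    fn check(&self, up_to: u64) -> Result<(), PruneError> {
        if up_to < self.finalized {
            Ok(())
        } else {
            Err(PruneError::NotFinalized)
        }
    }
}
```

The point of the sketch is that `finalize` and the `prune_*` calls are independent, so the distance between the finalized block and the oldest retained block can vary at runtime instead of being a fixed offset.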