
Clarify gossip limits #4045
Merged 16 commits into ethereum:dev on Jan 9, 2025
Conversation

@arnetheduck (Contributor) commented Dec 12, 2024

In the gossip specification, the GOSSIP_MAX_SIZE constant is specified for the uncompressed payload size in the gossipsub message. MAX_CHUNK_SIZE serves the same purpose for req/resp and this PR unifies the two constants under MAX_PAYLOAD_SIZE to simplify maintenance.

This PR clarifies how this limit applies to the various fields of the gossipsub message and provides additional limits derived from it that allow clients to more aggressively discard messages.

In particular, clients are allowed to impose stricter limits on topics such as attestations and aggregates - an Attestation, for example, takes no more than ~485 bytes (to be verified!), far below the 10MB limit. Implicitly, clients should already see these limits imposed as rejections by their SSZ decoder - this clarification mainly highlights the possibility of performing this check earlier in the process.
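For illustration only (a sketch with placeholder limits, not spec constants): a client that knows a type's maximum SSZ-serialized size can discard an oversized message on a given topic before invoking the SSZ decoder at all.

```python
# Hypothetical per-topic uncompressed-size caps, far below MAX_PAYLOAD_SIZE.
# The byte values are placeholders for the "to be verified" bounds above.
TOPIC_MAX_UNCOMPRESSED = {
    "beacon_attestation": 485,           # rough Attestation bound discussed above
    "beacon_aggregate_and_proof": 1024,  # placeholder value
}

def can_reject_early(topic: str, uncompressed_len: int) -> bool:
    """True if the message can be discarded before SSZ decoding starts."""
    limit = TOPIC_MAX_UNCOMPRESSED.get(topic)
    return limit is not None and uncompressed_len > limit
```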

@jtraglia (Member) left a comment:

Everything here seems reasonable to me 👍

@AgeManning (Contributor)

I understand that from a purely spec perspective the GOSSIP_MAX_SIZE definition is clear for readers and may help pose limits on derived consensus blocks.

However, for implementers, it might be nicer to specify GOSSIP_MAX_SIZE more intuitively, as the maximum size of the protobuf message we can receive on gossipsub. Because protobuf initially gives us the varint of the payload size, implementers can then just use this constant to instantly filter gossipsub messages.

With the current definition, which applies only to the data field, implementers have to work out the total payload size in order to do this initial filtering check (which is arguably the most important one), i.e. they have to take this max data size, add the expected size of the other protobuf fields, and also add the protobuf encoding overhead. I imagine this is highly prone to error and will leave clients implementing different total sizes.

I suggest instead that we define GOSSIP_MAX_SIZE as the total protobuf message size each implementation accepts. Then, if we like, we can derive from that the max data field size, the max consensus block size, etc.

This way, all implementations will have the same total upper bound of message size being transmitted in gossipsub messages. It is easily checked by the first few bytes of the message being sent (the protobuf varint).

@arnetheduck (Contributor, Author)

> This way, all implementations will have the same total upper bound of message size being transmitted in gossipsub messages.

I kind of agree that it would be nice and simple to go with the total protobuf size, but it has a few problems:

  • we need to have the same limit on Req/Resp for the payload (so we can send or not send payloads of the same size in both protocols) - the encoding overhead here is different.
  • we need a fixed upper bound on the uncompressed message size to limit the potential for a decompression bomb
  • the protobuf is actually not just the payload but part of the full gossipsub envelope, which might hold control messages as well - this overhead might change with libp2p versions, so it feels more relevant to specify the limit in terms of what application payload sizes we want to support (i.e. how big the blocks can be).

I see two strategies we can take:

  • derive a precise upper bound based on the maximum theoretical size of the minimal encoding of each field, for a gossipsub message that holds only this large payload and no other control stuff - this PR starts this journey by specifying an upper limit on the data field
  • wing it and say maxlen(data) + 1kb or something like that - certainly easier to specify.

@AgeManning (Contributor) commented Dec 16, 2024

Yeah. Agreed.

Perhaps to save everyone time, and to make sure all clients implement the same thing and we are not prone to mistakes due to wild calculations, we specify a few constants:

  • Max Uncompressed Block Size
  • Max Compressed Block Size
  • Max gossipsub protobuf message size (per topic; implementations without per-topic limits can take the max of the list)
  • Max RPC Chunk Size

Then each implementer doesn't have to make up their own calculations and fudge factors, and we don't get a weird mismatch of message transmissions when we approach these limits.

I don't mind adding some fudge factor/buffer (like the 1kb you suggested) to handle the max sizes; it's easy and everyone can implement it.

@rolfyone (Contributor)

I like the idea of having all of these well defined so they're consistent.

Comment on the spec excerpt:

> Each gossipsub [message](https://github.com/libp2p/go-libp2p-pubsub/blob/master/pb/rpc.proto#L17-L24) has a maximum size of `GOSSIP_MAX_SIZE`.
> Clients MUST reject (fail validation) messages that are over this size limit.
> Likewise, clients MUST NOT emit or propagate messages larger than this limit.
> The uncompressed payload in the [`data`](https://github.com/libp2p/go-libp2p-pubsub/blob/c06df2f9a38e9382e644b241adf0e96e5ca00955/pb/rpc.proto#L19)
A contributor commented:

isn't this the same as the current scenario for many clients? The uncompressed limit is 10MB (in Lodestar, for example), and this means that with 60M gas an uncompressed block full of zero-calldata transactions may reach >=10MB.

we need to bump to 15MB, no?

@arnetheduck (Author) replied:

> we need to bump to 15MB, no?

The current idea is to work on a better long-term solution instead and rely on the Pectra repricing in the shorter term - the 15MB limit has several downsides.

@arnetheduck (Contributor, Author)

> Perhaps to save everyone time,

022bb22 adds functions for computing these derived limits as well as the suggested 1kb fudge factor, in anticipation of coming up with a better way of limiting the maximum block size.

Something to consider for the future is that the topic takes space from the fudge factor - we must be careful not to create massive topics or we'll exceed this fudge factor. We don't have a common fixed upper limit on topic lengths - instead, clients are expected to reject unknown topics, which effectively limits their upper-bound size.

> Max RPC Chunk Size

RPC already has strict fixed upper limits on everything - we can be a lot more strict there because we use chunked snappy encoding, which limits the size of any one "RPC frame" to 64kb + some small constant, and there is no additional framing to consider (i.e. no mixing with control messages and whatnot).

We also already had in place the max compressed and max uncompressed sizes based on the same "worst case snappy" encoding - I've updated the spec to use the same functions and constants for both, for clarity.
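For reference, snappy's documented worst case for block compression of an n-byte input is 32 + n + n/6 bytes; a plain-Python sketch of the bound these shared helpers build on (the pyspec version wraps it in `uint64`):

```python
def max_compressed_len(n: int) -> int:
    # Snappy's worst-case output size for an n-byte input
    # (MaxCompressedLength in the snappy sources): 32 + n + n / 6.
    return 32 + n + n // 6
```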

@arnetheduck (Contributor, Author)

linter failures look unrelated - help! cc @jtraglia @hwwhww

@jtraglia (Member)

> linter failures look unrelated - help! cc @jtraglia @hwwhww

Hey @arnetheduck 👋 I think you just need to pull in changes from dev.

There should be a button for this near the bottom of the page (screenshot omitted).

I would do it myself, but this PR has the "maintainers allowed to change things" option unchecked.

@jtraglia (Member)

Hmm it's still failing. I'll look into it. Sorry about this.

Could you allow me to edit the PR branch? There's a checkbox for this on the PR (screenshot omitted).

@arnetheduck (Contributor, Author)

would if I could ;) Invited you for write access instead.

@jtraglia (Member)

It should be good now 👍 I initially thought it was a caching issue, but it turns out that the new functions (`max_compressed_len` and `max_message_size`) are included in the pyspec and had some linting issues. I fixed their indentation & type annotations. And then I ran into a caching issue and had to bump the cache key anyway 😅

Comment on lines +253 to +255:

```python
def max_message_size() -> uint64:
    # Allow 1024 bytes for framing and encoding overhead but at least 1MiB in case MAX_PAYLOAD_SIZE is small.
    return max(max_compressed_len(MAX_PAYLOAD_SIZE) + 1024, 1024 * 1024)
```
A member commented:

As this constant is used to limit the gossipsub RPC message size only, maybe it makes sense to give it a more explicit name?
Also, maybe we could reserve more space (e.g. 1MB instead of 1KB) to accommodate all possible gossip control messages that could potentially be piggybacked onto the publish message. That would still maintain a reasonable limit but prevent gossip implementations from hitting corner cases around message size.

@arnetheduck (Author) replied:

> Also, maybe we could reserve more space

I don't see why, really - just send the control message in a separate frame, which is trivial. Basically, if you want to piggyback control messages, you can add real messages and control messages in a loop until you hit the limit and break off there - this gracefully deals with any kind of packing. You need a bounded strategy like this anyway, and the minimum max size ensures that we don't send too many small messages either.
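A minimal sketch of that packing loop, with `encoded_size` as a hypothetical helper for an item's on-the-wire size (including protobuf field overhead):

```python
def pack_rpc(items, max_message_size, encoded_size):
    """Greedily fill one RPC envelope without exceeding the size cap.

    `items` mixes real messages and control messages; whatever doesn't fit
    is returned as leftovers for the next RPC frame.
    """
    packed, total = [], 0
    for i, item in enumerate(items):
        size = encoded_size(item)
        if total + size > max_message_size:
            return packed, items[i:]  # break off here; send the rest later
        packed.append(item)
        total += size
    return packed, []
```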

The member replied:

Yes, you are right. It would help to avoid edge cases with a single message of max size, but we would hit the same problem when there are, e.g., two messages (or more) whose cumulative size is around `max_message_size()`.

@ppopth (Member) left a comment:

Looks better to me.

@ppopth (Member) commented Dec 19, 2024

The Bellatrix spec at specs/bellatrix/p2p-interface.md also mentions GOSSIP_MAX_SIZE. Should that be edited as well?

@arnetheduck (Contributor, Author)

> The Bellatrix spec

44cecd2

@arnetheduck (Contributor, Author)

@jtraglia thanks for the help - I think we're gtg here!

@jtraglia mentioned this pull request on Jan 8, 2025.
Comment on lines 1724 to 1725:

> The limit is set on the uncompressed payload size in particular to protect against decompression bombs - although
A member commented:

This is an incomplete sentence. Was there something else you wanted to say here?

@jtraglia (Member) commented Jan 8, 2025:

I've decided to remove this sentence since the next section dives into this. Also, I don't believe decompression bombs are practical with snappy, as there is a known, relatively low maximum compression ratio.

@arnetheduck (Author) replied:

tbh, not sure what I wanted to say, but removing "- although" would be enough, I think.

From what I remember, decompression bombs are still possible - "max compression" is based on an honest compressor (and is used mainly to allocate a compression buffer in "normal" snappy usage) - at least, that's how I remember the reasoning at the time.
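To make the concern concrete (a sketch, not spec text): raw snappy blocks, as used for gossip payloads, are prefixed with a varint of the claimed uncompressed length, so a reader can cap its allocation before decompressing - but the claim comes from the sender, so the actual output length must still be validated against it.

```python
def claimed_uncompressed_len(block: bytes) -> int:
    """Read the uncompressed-length varint prefixing a raw snappy block.

    The value is attacker-controlled: it lets us bound allocation up front,
    but the real decompressed length must be re-checked afterwards.
    """
    result, shift = 0, 0
    for byte in block:
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result
        shift += 7
    raise ValueError("truncated varint")

def safe_to_decompress(block: bytes, max_payload_size: int) -> bool:
    return claimed_uncompressed_len(block) <= max_payload_size
```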

@jtraglia (Member) replied:

Got it. Added back the remark without "- although" 👍

@jtraglia (Member) left a comment:

@arnetheduck, please take a look at my recent changes. If they're fine with you, this PR is ready to be merged.

@jtraglia merged commit da17461 into ethereum:dev on Jan 9, 2025. 23 checks passed. @jtraglia deleted the max-gossip-limit branch on January 9, 2025 at 20:07.