Clarify gossip limits #4045

Merged
merged 16 commits into from
Jan 9, 2025
Changes from 10 commits
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -35,13 +35,13 @@ commands:
description: "Restore the cache with pyspec keys"
steps:
- restore_cached_venv:
venv_name: v30-pyspec
venv_name: v32-pyspec
reqs_checksum: cache-{{ checksum "setup.py" }}-{{ checksum "requirements_preinstallation.txt" }}
save_pyspec_cached_venv:
description: Save a venv into a cache with pyspec keys"
steps:
- save_cached_venv:
venv_name: v30-pyspec
venv_name: v32-pyspec
reqs_checksum: cache-{{ checksum "setup.py" }}-{{ checksum "requirements_preinstallation.txt" }}
venv_path: ./venv
jobs:
76 changes: 63 additions & 13 deletions specs/phase0/p2p-interface.md
@@ -16,6 +16,9 @@
- [Constants](#constants)
- [Configuration](#configuration)
- [MetaData](#metadata)
- [Maximum message sizes](#maximum-message-sizes)
- [`max_compressed_len`](#max_compressed_len)
- [`max_message_size`](#max_message_size)
- [The gossip domain: gossipsub](#the-gossip-domain-gossipsub)
- [Topics and messages](#topics-and-messages)
- [Global topics](#global-topics)
@@ -28,6 +31,7 @@
- [`beacon_attestation_{subnet_id}`](#beacon_attestation_subnet_id)
- [Attestations and Aggregation](#attestations-and-aggregation)
- [Encodings](#encodings)
- [Gossipsub size limits](#gossipsub-size-limits)
- [The Req/Resp domain](#the-reqresp-domain)
- [Protocol identification](#protocol-identification)
- [Req/Resp interaction](#reqresp-interaction)
@@ -102,6 +106,8 @@
- [Why are we using Snappy for compression?](#why-are-we-using-snappy-for-compression)
- [Can I get access to unencrypted bytes on the wire for debugging purposes?](#can-i-get-access-to-unencrypted-bytes-on-the-wire-for-debugging-purposes)
- [What are SSZ type size bounds?](#what-are-ssz-type-size-bounds)
- [Why is the message size defined in terms of application payload?](#why-is-the-message-size-defined-in-terms-of-application-payload)
- [Why is there a limit on message sizes at all?](#why-is-there-a-limit-on-message-sizes-at-all)
- [libp2p implementations matrix](#libp2p-implementations-matrix)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
@@ -193,11 +199,10 @@ This section outlines configurations that are used in this spec.

| Name | Value | Description |
|---|---|---|
| `GOSSIP_MAX_SIZE` | `10 * 2**20` (= 10485760, 10 MiB) | The maximum allowed size of uncompressed gossip messages. |
| `MAX_PAYLOAD_SIZE` | `10 * 2**20` (= 10485760, 10 MiB) | The maximum allowed size of uncompressed payload in gossipsub messages / RPC chunks. |
| `MAX_REQUEST_BLOCKS` | `2**10` (= 1024) | Maximum number of blocks in a single request |
| `EPOCHS_PER_SUBNET_SUBSCRIPTION` | `2**8` (= 256) | Number of epochs on a subnet subscription (~27 hours) |
| `MIN_EPOCHS_FOR_BLOCK_REQUESTS` | `MIN_VALIDATOR_WITHDRAWABILITY_DELAY + CHURN_LIMIT_QUOTIENT // 2` (= 33024, ~5 months) | The minimum epoch range over which a node must serve blocks |
| `MAX_CHUNK_SIZE` | `10 * 2**20` (=10485760, 10 MiB) | The maximum allowed size of uncompressed req/resp chunked responses. |
| `ATTESTATION_PROPAGATION_SLOT_RANGE` | `32` | The maximum number of slots during which an attestation can be propagated. |
| `MAXIMUM_GOSSIP_CLOCK_DISPARITY` | `500` | The maximum **milliseconds** of clock disparity assumed between honest nodes. |
| `MESSAGE_DOMAIN_INVALID_SNAPPY` | `DomainType('0x00000000')` | 4-byte domain for gossip message-id isolation of *invalid* snappy messages |
@@ -229,6 +234,27 @@ Where
is entirely independent of the ENR sequence number,
and will in most cases be out of sync with the ENR sequence number.

### Maximum message sizes

Maximum message sizes are derived from the maximum payload size that the network can carry according to the following functions:

#### `max_compressed_len`

```python
def max_compressed_len(n: uint64) -> uint64:
    # Worst-case compressed length for a given payload of size n when using snappy
    # https://github.com/google/snappy/blob/32ded457c0b1fe78ceb8397632c416568d6714a0/snappy.cc#L218C1-L218C47
    return uint64(32 + n + n / 6)
```

#### `max_message_size`

```python
def max_message_size() -> uint64:
    # Allow 1024 bytes for framing and encoding overhead but at least 1MiB in case MAX_PAYLOAD_SIZE is small.
    return max(max_compressed_len(MAX_PAYLOAD_SIZE) + 1024, 1024 * 1024)
```

Comment on lines +253 to +255

Member

As this constant is used to limit gossip gRPC message size only, may be it makes sense to give it a more explicit name?
Also maybe we could reserve more space (e.g. 1Mb instead of 1Kb) to accommodate all possible Gossip control message which could potentially be piggybacked to the publish message. That would still maintain a reasonable limit but prevent gossip implementation from corner message size cases

Contributor Author

> Also maybe we could reserve more space

I don't see why, really - just send the control message in a separate frame which is trivial - basically, if you want to piggyback control messages, you can add real messages and control messages in a loop until you hit the limit and break off there - this gracefully deals with any kind of packing. You need a bounded strategy like this anyway, and the minimum max size ensures that we don't send too many small messages either).

Member

Yes, you are right. It would help to avoid edge cases with a single message of max size, but we would hit the same problem when there are e.g. 2 message (or more) which have the cumulative size around max_message_size()
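
As a rough sanity check, the two functions in this section can be evaluated for the `MAX_PAYLOAD_SIZE` given in the configuration table. This is only a sketch: it uses plain Python `int` instead of the spec's `uint64` type and writes the division as `//` to make the integer rounding explicit, both of which are assumptions made for illustration.

```python
MAX_PAYLOAD_SIZE = 10 * 2**20  # 10 MiB, from the configuration table


def max_compressed_len(n: int) -> int:
    # Same formula as in the spec, with explicit integer division (assumption: plain ints)
    return 32 + n + n // 6


def max_message_size() -> int:
    # 1024 bytes of overhead on top of the worst-case compressed payload, but at least 1 MiB
    return max(max_compressed_len(MAX_PAYLOAD_SIZE) + 1024, 1024 * 1024)


print(max_compressed_len(MAX_PAYLOAD_SIZE))  # 12233418
print(max_message_size())                    # 12234442
```

With a 10 MiB payload limit the 1 MiB floor in `max_message_size()` is not reached; it only matters when `MAX_PAYLOAD_SIZE` is configured to be small.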

### The gossip domain: gossipsub

Clients MUST support the [gossipsub v1](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md) libp2p Protocol
@@ -268,13 +294,11 @@ This defines both the type of data being sent on the topic and how the data fiel
- `Encoding` - the encoding strategy describes a specific representation of bytes that will be transmitted over the wire.
See the [Encodings](#Encodings) section for further details.

Clients MUST reject messages with unknown topic.

*Note*: `ForkDigestValue` is composed of values that are not known until the genesis block/state are available.
Due to this, clients SHOULD NOT subscribe to gossipsub topics until these genesis values are known.

Each gossipsub [message](https://github.com/libp2p/go-libp2p-pubsub/blob/master/pb/rpc.proto#L17-L24) has a maximum size of `GOSSIP_MAX_SIZE`.
Clients MUST reject (fail validation) messages that are over this size limit.
Likewise, clients MUST NOT emit or propagate messages larger than this limit.

The optional `from` (1), `seqno` (3), `signature` (5) and `key` (6) protobuf fields are omitted from the message,
since messages are identified by content, anonymous, and signed where necessary in the application layer.
Starting from Gossipsub v1.1, clients MUST enforce this by applying the `StrictNoSign`
@@ -288,6 +312,8 @@ The `message-id` of a gossipsub message MUST be the following 20 byte value comp
the concatenation of `MESSAGE_DOMAIN_INVALID_SNAPPY` with the raw message data,
i.e. `SHA256(MESSAGE_DOMAIN_INVALID_SNAPPY + message.data)[:20]`.

Where relevant, clients MUST reject messages with `message-id` sizes other than 20 bytes.

*Note*: The above logic handles two exceptional cases:
(1) multiple snappy `data` can decompress to the same value,
and (2) some message `data` can fail to snappy decompress altogether.
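
A minimal sketch of the invalid-snappy branch quoted above, using only `hashlib` from the Python standard library. The function names are hypothetical, and the domain value is taken from the configuration table. The branch for data that decompresses successfully is not part of this excerpt and is therefore left out.

```python
import hashlib

# 4-byte domain from the configuration table
MESSAGE_DOMAIN_INVALID_SNAPPY = bytes.fromhex("00000000")


def invalid_snappy_message_id(message_data: bytes) -> bytes:
    # SHA256(MESSAGE_DOMAIN_INVALID_SNAPPY + message.data)[:20]
    digest = hashlib.sha256(MESSAGE_DOMAIN_INVALID_SNAPPY + message_data).digest()
    return digest[:20]


def has_valid_message_id_size(message_id: bytes) -> bool:
    # Message ids of any other length are to be rejected
    return len(message_id) == 20
```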
@@ -502,6 +528,16 @@ so [basic snappy block compression](https://github.com/google/snappy/blob/master
Implementations MUST use a single encoding for gossip.
Changing an encoding will require coordination between participating implementations.

#### Gossipsub size limits

Size limits are placed both on the [`RPCMsg`](https://github.com/libp2p/specs/blob/b5f7fce29b32d4c7d0efe37b019936a11e5db872/pubsub/README.md#the-rpc) frame and on the encoded payload in each [`Message`](https://github.com/libp2p/specs/blob/b5f7fce29b32d4c7d0efe37b019936a11e5db872/pubsub/README.md#the-message).

Clients MUST reject and MUST NOT emit or propagate messages whose size exceeds the following limits:

* the size of the encoded `RPCMsg`, including control messages, framing, topics, etc., must not exceed `max_message_size()`.
* the size of the compressed payload in the `Message.data` field must not exceed `max_compressed_len(MAX_PAYLOAD_SIZE)`.
* the size of the uncompressed payload must not exceed `MAX_PAYLOAD_SIZE` or the [type-specific SSZ bound](#what-are-ssz-type-size-bounds), whichever is lower.
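
The three limits can be sketched as independent checks. This is an illustrative outline rather than a client implementation: the helper names are hypothetical, plain `int` stands in for `uint64`, and the caller is assumed to already know the encoded, compressed and uncompressed sizes.

```python
from typing import Optional

MAX_PAYLOAD_SIZE = 10 * 2**20  # from the configuration table


def max_compressed_len(n: int) -> int:
    return 32 + n + n // 6


def max_message_size() -> int:
    return max(max_compressed_len(MAX_PAYLOAD_SIZE) + 1024, 1024 * 1024)


def rpc_frame_within_limit(encoded_rpc_len: int) -> bool:
    # Whole encoded RPCMsg, including control messages, framing and topics
    return encoded_rpc_len <= max_message_size()


def compressed_payload_within_limit(compressed_len: int) -> bool:
    # Size of the Message.data field as received
    return compressed_len <= max_compressed_len(MAX_PAYLOAD_SIZE)


def uncompressed_payload_within_limit(
    uncompressed_len: int, ssz_bound: Optional[int] = None
) -> bool:
    # The lower of MAX_PAYLOAD_SIZE and the type-specific SSZ bound applies
    limit = MAX_PAYLOAD_SIZE if ssz_bound is None else min(MAX_PAYLOAD_SIZE, ssz_bound)
    return uncompressed_len <= limit
```

Checking the frame and compressed sizes before decompressing means an oversized message can be rejected without allocating a decompression buffer.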

### The Req/Resp domain

#### Protocol identification
@@ -551,7 +587,7 @@ All other response types (non-Lists) send a single `response_chunk`.
For both `request`s and `response`s, the `encoding-dependent-header` MUST be valid,
and the `encoded-payload` must be valid within the constraints of the `encoding-dependent-header`.
This includes type-specific bounds on payload size for some encoding strategies.
Regardless of these type specific bounds, a global maximum uncompressed byte size of `MAX_CHUNK_SIZE` MUST be applied to all method response chunks.
Regardless of these type specific bounds, a global maximum uncompressed byte size of `MAX_PAYLOAD_SIZE` MUST be applied to all method response chunks.

Clients MUST ensure that lengths are within these bounds; if not, they SHOULD reset the stream immediately.
Clients tracking peer reputation MAY decrement the score of the misbehaving peer under this circumstance.
@@ -665,15 +701,13 @@ When snappy is applied, it can be passed through a buffered Snappy reader to dec

Before reading the payload, the header MUST be validated:
- The unsigned protobuf varint used for the length-prefix MUST NOT be longer than 10 bytes, which is sufficient for any `uint64`.
- The length-prefix is within the expected [size bounds derived from the payload SSZ type](#what-are-ssz-type-size-bounds).
- The length-prefix is within the expected [size bounds derived from the payload SSZ type](#what-are-ssz-type-size-bounds) or `MAX_PAYLOAD_SIZE`, whichever is smaller.

After reading a valid header, the payload MAY be read, while maintaining the size constraints from the header.

A reader SHOULD NOT read more than `max_encoded_len(n)` bytes after reading the SSZ length-prefix `n` from the header.
- For `ssz_snappy` this is: `32 + n + n // 6`.
This is considered the [worst-case compression result](https://github.com/google/snappy/blob/537f4ad6240e586970fe554614542e9717df7902/snappy.cc#L98) by Snappy.
A reader MUST NOT read more than `max_compressed_len(n)` bytes after reading the SSZ length-prefix `n` from the header.

A reader SHOULD consider the following cases as invalid input:
A reader MUST consider the following cases as invalid input:
- Any remaining bytes, after having read the `n` SSZ bytes. An EOF is expected if more bytes are read than required.
- An early EOF, before fully reading the declared length-prefix worth of SSZ bytes.
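
The header checks above can be sketched as follows. The function names and the `ssz_min_size` / `ssz_max_size` parameters are hypothetical, plain `int` stands in for `uint64`, and the sketch works on an in-memory buffer rather than a stream.

```python
from typing import Tuple

MAX_PAYLOAD_SIZE = 10 * 2**20  # from the configuration table


def max_compressed_len(n: int) -> int:
    return 32 + n + n // 6


def read_length_prefix(buf: bytes) -> Tuple[int, int]:
    """Decode an unsigned protobuf varint, returning (value, bytes_consumed)."""
    value = 0
    # Never look at more than 10 bytes of varint
    for i, byte in enumerate(buf[:10]):
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:  # continuation bit clear: last byte of the varint
            return value, i + 1
    raise ValueError("length-prefix varint is truncated or longer than 10 bytes")


def validate_header(buf: bytes, ssz_min_size: int, ssz_max_size: int) -> Tuple[int, int, int]:
    """Return (n, header_len, read_limit) or raise ValueError on an invalid header."""
    n, header_len = read_length_prefix(buf)
    upper = min(ssz_max_size, MAX_PAYLOAD_SIZE)
    if n < ssz_min_size or n > upper:
        raise ValueError("length-prefix is outside the allowed size bounds")
    # The reader must not consume more than this many bytes after the header
    return n, header_len, max_compressed_len(n)
```

For example, the two bytes `0xAC 0x02` decode to `n = 300`, which gives a read limit of `32 + 300 + 50 = 382` bytes.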

@@ -1430,7 +1464,7 @@ Nevertheless, in the case of `ssz_snappy`, messages are still length-prefixed wi
* Alignment with protocols like gRPC over HTTP/2 that prefix with length
* Sanity checking of message length, and enabling much stricter message length limiting based on SSZ type information,
to provide even more DOS protection than the global message length already does.
E.g. a small `Status` message does not nearly require `MAX_CHUNK_SIZE` bytes.
E.g. a small `Status` message does not nearly require `MAX_PAYLOAD_SIZE` bytes.

[Protobuf varint](https://developers.google.com/protocol-buffers/docs/encoding#varints) is an efficient technique to encode variable-length (unsigned here) ints.
Instead of reserving a fixed-size field of as many bytes as necessary to convey the maximum possible value, this field is elastic in exchange for 1-bit overhead per byte.
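
To illustrate the elastic encoding, here is a small unsigned varint encoder. The function name `encode_varint` is hypothetical, and this is a sketch of the general technique rather than code from any client.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned protobuf varint."""
    if value < 0:
        raise ValueError("only unsigned values can be encoded")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low 7 bits carry data
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


print(encode_varint(1).hex())           # 01
print(encode_varint(300).hex())         # ac02
print(len(encode_varint(10 * 2**20)))   # 4
print(len(encode_varint(2**64 - 1)))    # 10
```

A length equal to `MAX_PAYLOAD_SIZE` fits in 4 bytes, while the largest `uint64` needs the full 10 bytes.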
@@ -1679,6 +1713,22 @@ Other types are static, they have a fixed size: no dynamic-length content is inv
For reference, the type bounds can be computed ahead of time, [as per this example](https://gist.github.com/protolambda/db75c7faa1e94f2464787a480e5d613e).
It is advisable to derive these lengths from the SSZ type definitions in use, to ensure that version changes do not cause out-of-sync type bounds.

#### Why is the message size defined in terms of application payload?

When transmitting messages over gossipsub and / or req/resp, we want to ensure that the same payload sizes are supported no matter the underlying transport, decoupling the consensus layer from libp2p-induced overhead and the particular transmission strategy.

To derive "encoded size limits" from desired application sizes we take into account snappy compression and framing overhead.

In the case of gossipsub, the protocol supports sending multiple application payloads, as well as mixing application data with control messages, in each gossipsub frame. The limit is set such that at least one max-sized application-level message, together with a small amount (1 KiB) of gossipsub overhead, is allowed. Implementations are free to pack multiple smaller application messages into a single gossipsub frame, and / or combine them with control messages, as they see fit.
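
One possible bounded packing strategy is a greedy loop that fills a frame until the next item would exceed the limit and then starts a new one. The sketch below is an assumption about how an implementation might do this, not something the spec mandates. The name `pack_into_frames` is hypothetical, and each size is assumed to already include that item's own encoding overhead.

```python
from typing import List


def pack_into_frames(item_sizes: List[int], frame_limit: int) -> List[List[int]]:
    """Greedily group encoded item sizes into frames no larger than frame_limit."""
    frames: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in item_sizes:
        if size > frame_limit:
            raise ValueError("a single item exceeds the frame limit")
        # Start a new frame when the next item would not fit in the current one
        if current and current_size + size > frame_limit:
            frames.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        frames.append(current)
    return frames


print(pack_into_frames([4, 4, 4, 10, 1], frame_limit=10))
# [[4, 4], [4], [10], [1]]
```

In practice `frame_limit` would be `max_message_size()`, and the items would be a mix of application messages and control messages.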

The limit is set on the uncompressed payload size in particular to protect against decompression bombs - although

Member

This is an incomplete sentence. Was there something else you wanted to say here?

Member (@jtraglia, Jan 8, 2025)

I've decided to remove this sentence since the next section dives into this. Also, I don't believe decompression bombs are practical with snappy, as there is a known, relatively low maximum compression ratio.

Contributor Author

tbh, not sure what I wanted to say, but removing -although would be enough I think.

from what I remember decompression bombs are still possible - "max compression" is based on an honest compressor (and used mainly to allocate a compression buffer, in "normal" snappy usage) - at least, that's how I remember the reasoning at the time.

Member

Got it. Added back the remark without "- although" 👍

#### Why is there a limit on message sizes at all?

The message size limit protects against several forms of DoS and network-based amplification attacks and provides upper bounds for resource (network, memory) usage in the client. These bounds follow from the protocol requirements to decode, buffer, cache, store and re-transmit messages, which in turn translate into performance and protection tradeoffs, ensuring capacity to handle worst cases during recovery from network instability.

In particular, blocks, which at the time of writing are the only message type without a practical SSZ-derived upper bound on size, cannot be fully verified synchronously as part of gossipsub validity checks, meaning that there exist cases where invalid messages signed by a validator may be amplified by the network.

## libp2p implementations matrix

This section will soon contain a matrix showing the maturity/state of the libp2p features required