
Optimize the bytes encoding at node #606

Merged (5 commits, Jun 17, 2024)

Conversation

@jianoaix (Contributor) commented Jun 15, 2024

Why are these changes needed?

When the workload is high, the bytes encoding can be quite expensive (about 50% of the actual DB write latency).
This PR avoids unnecessary memory movement when dealing with multiple chunks (on large operators). In addition, it re-implements the encoding in a more efficient way.
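
A minimal sketch of the single-allocation approach, assuming an 8-byte length prefix per chunk as in the snippet discussed below; the function name encodeChunks and the little-endian byte order are illustrative assumptions rather than the repo's exact implementation:

package main

import (
    "encoding/binary"
    "fmt"
)

// encodeChunks is an illustrative sketch: it sizes the output buffer once,
// then writes each chunk as an 8-byte length prefix followed by the chunk
// bytes, avoiding the incremental buffer growth of the old implementation.
func encodeChunks(chunks [][]byte) []byte {
    totalSize := 0
    for _, chunk := range chunks {
        totalSize += len(chunk) + 8 // 8 bytes for the uint64 length prefix
    }
    buf := make([]byte, totalSize)
    offset := 0
    for _, chunk := range chunks {
        binary.LittleEndian.PutUint64(buf[offset:], uint64(len(chunk)))
        offset += 8
        offset += copy(buf[offset:], chunk)
    }
    return buf
}

func main() {
    encoded := encodeChunks([][]byte{[]byte("chunk-a"), []byte("chunk-bb")})
    fmt.Println(len(encoded)) // 7 + 8 + 8 + 8 = 31 bytes
}

The key point is that the output is sized once and each chunk is copied exactly once into its final position, instead of being moved again as an intermediate buffer grows.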

The benchmark results show nearly a 4x performance improvement (BenchmarkEncodeChunksOld is the implementation before this PR; BenchmarkEncodeChunks is this PR):

goos: linux
goarch: amd64
pkg: github.com/Layr-Labs/eigenda/node
cpu: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
BenchmarkEncodeChunksOld-8   	   53523	     21934 ns/op
BenchmarkEncodeChunksOld-8   	   52783	     21848 ns/op
BenchmarkEncodeChunksOld-8   	   52980	     22338 ns/op
BenchmarkEncodeChunksOld-8   	   53157	     22530 ns/op
BenchmarkEncodeChunks-8      	  200438	      5840 ns/op
BenchmarkEncodeChunks-8      	  204920	      5691 ns/op
BenchmarkEncodeChunks-8      	  191730	      5836 ns/op
BenchmarkEncodeChunks-8      	  196705	      5789 ns/op

PASS
ok github.com/Layr-Labs/eigenda/node 10.581s
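
For reference, a self-contained benchmark of roughly this shape reproduces the old-versus-new comparison; the fixture (64 chunks of 256 bytes) and both helper functions are illustrative stand-ins, not the repo's actual code:

package sketch

import (
    "bytes"
    "encoding/binary"
    "testing"
)

// makeChunks builds an illustrative fixture: 64 chunks of 256 bytes each
// (256 bytes matches the chunk size discussed in the review below).
func makeChunks() [][]byte {
    chunks := make([][]byte, 64)
    for i := range chunks {
        chunks[i] = make([]byte, 256)
    }
    return chunks
}

// encodeChunksOld mimics the pre-PR pattern: grow a bytes.Buffer as we go.
func encodeChunksOld(chunks [][]byte) []byte {
    buf := bytes.NewBuffer(make([]byte, 0))
    for _, c := range chunks {
        _ = binary.Write(buf, binary.LittleEndian, uint64(len(c)))
        buf.Write(c)
    }
    return buf.Bytes()
}

// encodeChunksNew mimics the post-PR pattern: size the output once up front.
func encodeChunksNew(chunks [][]byte) []byte {
    total := 0
    for _, c := range chunks {
        total += len(c) + 8
    }
    out := make([]byte, total)
    off := 0
    for _, c := range chunks {
        binary.LittleEndian.PutUint64(out[off:], uint64(len(c)))
        off += 8
        off += copy(out[off:], c)
    }
    return out
}

func BenchmarkEncodeChunksOld(b *testing.B) {
    chunks := makeChunks()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _ = encodeChunksOld(chunks)
    }
}

func BenchmarkEncodeChunks(b *testing.B) {
    chunks := makeChunks()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _ = encodeChunksNew(chunks)
    }
}

Running go test -bench=. on such a file reports per-iteration ns/op figures like those above.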

Checks

  • I've made sure the lint is passing in this PR.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; in that case, please comment that they are not relevant.
  • Testing Strategy
    • Unit tests
    • Integration tests
    • This PR is not tested :(

@jianoaix requested review from bxue-l2 and ian-shim on June 15, 2024 00:50
buf := bytes.NewBuffer(make([]byte, 0))
totalSize := 0
for _, chunk := range chunks {
    totalSize += len(chunk) + 8 // Add size of uint64 for length
Contributor
For every blob, all chunks have uniform length, which is dictated by the KZG encoder. For a blob of 128 KiB, based on the current stake distribution, it will create 4096 chunks. Suppose the coding ratio is 8; then every chunk has size 2^17*8/4096 = 256 bytes, so if we add 8 bytes to store the length, we are wasting about 3% of storage. It becomes less of a problem when the blob length is larger.

Contributor

btw, uint32 is sufficient in most cases, because it is going to take a long time to reach 4 GiB chunks

Contributor Author

I'm aware of this, but changing the encoding should be handled in a compatible way.
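
As a purely hypothetical illustration of what a compatible change could look like (not what this PR or the repo does): new writes could carry a one-byte format tag before switching to a shorter uint32 length prefix, with untagged legacy values identified out of band, e.g. by a store-level schema version:

package sketch

import "encoding/binary"

// Hypothetical format tag; legacy data has no tag and would have to be
// identified out of band (for example, via a store-level schema version).
const formatUint32Len byte = 1

// encodeChunksTagged is a hypothetical sketch: one format byte, then each
// chunk as a 4-byte little-endian length prefix followed by its bytes.
func encodeChunksTagged(chunks [][]byte) []byte {
    total := 1 // format byte
    for _, c := range chunks {
        total += len(c) + 4
    }
    out := make([]byte, total)
    out[0] = formatUint32Len
    off := 1
    for _, c := range chunks {
        binary.LittleEndian.PutUint32(out[off:], uint32(len(c)))
        off += 4
        off += copy(out[off:], c)
    }
    return out
}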

Contributor

@bxue-l2 left a comment

PR itself looks good to me

@jianoaix merged commit 733ebbf into Layr-Labs:master on Jun 17, 2024
6 checks passed