Optimize the bytes encoding at node #606
Conversation
```go
buf := bytes.NewBuffer(make([]byte, 0))
totalSize := 0
for _, chunk := range chunks {
	totalSize += len(chunk) + 8 // Add size of uint64 for length
```
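For context, here is a minimal sketch of the single-allocation approach the quoted snippet points at: compute the total size first so the buffer never has to grow or copy. The function name, endianness, and exact buffer handling are assumptions for illustration, not necessarily the PR's implementation.

```go
package node

import (
	"bytes"
	"encoding/binary"
)

// encodeChunksSketch serializes chunks as [8-byte length][chunk bytes]...,
// computing the total size up front so the backing array is allocated exactly
// once and no intermediate copies are made. Sketch only; names and layout are
// assumptions, not the repository's actual code.
func encodeChunksSketch(chunks [][]byte) []byte {
	totalSize := 0
	for _, chunk := range chunks {
		totalSize += len(chunk) + 8 // 8 bytes for the uint64 length prefix
	}
	buf := bytes.NewBuffer(make([]byte, 0, totalSize))
	for _, chunk := range chunks {
		var length [8]byte
		binary.LittleEndian.PutUint64(length[:], uint64(len(chunk)))
		buf.Write(length[:])
		buf.Write(chunk)
	}
	return buf.Bytes()
}
```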
For every blob, all chunks have a uniform length, which is dictated by the KZG encoder. For a 128 KiB blob, based on the current stake distribution, it will create 4096 chunks. Suppose the coding ratio is 8; then every chunk has size 2^17 * 8 / 4096 = 256 bytes. If we add 8 bytes to store the length, we are wasting about 3% of storage. It becomes less of a problem as the blob length grows.
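A back-of-the-envelope check of that figure, with purely illustrative numbers (nothing here comes from the repository):

```go
package main

import "fmt"

func main() {
	// Illustrative numbers mirroring the 128 KiB example above.
	blobSize := 128 * 1024 // 128 KiB blob
	codingRatio := 8       // assumed coding ratio
	numChunks := 4096      // chunks per blob under the current stake distribution
	chunkSize := blobSize * codingRatio / numChunks // 2^17 * 8 / 4096 = 256 bytes
	overhead := 8.0 / float64(chunkSize)            // 8-byte length prefix per 256-byte chunk
	fmt.Printf("chunk size: %d bytes, length-prefix overhead: %.2f%%\n", chunkSize, overhead*100)
}
```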
BTW, uint32 is sufficient in most cases, because it is going to take a long time to reach 4 GiB chunks.
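As an illustration of that suggestion only (not the repository's API), the length prefix in the sketch above could be written as a uint32 instead:

```go
package node

import (
	"bytes"
	"encoding/binary"
)

// writeChunkU32 is an illustrative variant of the earlier sketch: a 4-byte
// (uint32) length prefix halves the per-chunk overhead in the 256-byte-chunk
// example, and 4 GiB is far beyond any realistic chunk size.
func writeChunkU32(buf *bytes.Buffer, chunk []byte) {
	var length [4]byte
	binary.LittleEndian.PutUint32(length[:], uint32(len(chunk)))
	buf.Write(length[:])
	buf.Write(chunk)
}
```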
I'm aware of this, but changing the encoding should be handled in a compatible way.
The PR itself looks good to me.
Why are these changes needed?
When the workload is high, the bytes encoding can be quite expensive (about 50% of the actual DB write latency).
This PR avoids unnecessary memory movement when dealing with multiple chunks (on large operators). In addition, it re-implements the encoding in a more efficient way.
Benchmark results show a nearly 4x performance improvement (BenchmarkEncodeChunksOld is the code before this PR; BenchmarkEncodeChunks is this PR):
PASS
ok github.com/Layr-Labs/eigenda/node 10.581s
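The raw benchmark numbers are not reproduced above. For readers who want to run a similar comparison, here is a hedged sketch of a Go benchmark over the encodeChunksSketch function from the earlier sketch; makeTestChunks and the 4096 × 256-byte chunk shape are illustrative assumptions, while the PR's own benchmarks are the BenchmarkEncodeChunksOld / BenchmarkEncodeChunks pair named in the description.

```go
package node

import "testing"

// makeTestChunks builds n uniform chunks of the given size; the shape mirrors
// the 4096 chunks of 256 bytes discussed earlier in this thread.
func makeTestChunks(n, size int) [][]byte {
	chunks := make([][]byte, n)
	for i := range chunks {
		chunks[i] = make([]byte, size)
	}
	return chunks
}

// BenchmarkEncodeChunksSketch benchmarks the illustrative encoder above; the
// PR's actual benchmarks compare its real before/after implementations.
func BenchmarkEncodeChunksSketch(b *testing.B) {
	chunks := makeTestChunks(4096, 256)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = encodeChunksSketch(chunks)
	}
}
```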