Add new framing chunk types without checksums #155

Open
wants to merge 1 commit into
base: main
derekbruening

Adds two new chunk types to the Snappy framing format: compressed data
without a checksum, and uncompressed data without a checksum. These
types are identical to their existing counterparts except they do not
contain a CRC-32C checksum. Essentially, this makes including
checksums for each data chunk optional rather than required.
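
To illustrate the difference in chunk layout, here is a minimal Python sketch, not the code in this PR. It frames the same uncompressed payload with and without the masked CRC-32C. The ID `0x01` and the masking formula follow the existing framing format; the ID `0x03` for the checksum-free variant and all helper names are assumptions made for this example only.

```python
import struct

# 0x01 is the existing "uncompressed data" chunk type.
# 0x03 is a HYPOTHETICAL placeholder ID for the proposed
# "uncompressed data without a checksum" type (not taken from this PR).
UNCOMPRESSED = 0x01
UNCOMPRESSED_NO_CRC = 0x03


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data: bytes) -> int:
    """Apply the framing format's mask: rotate right by 15, add a constant."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def chunk_header(chunk_type: int, length: int) -> bytes:
    """One type byte followed by a 3-byte little-endian length."""
    return bytes([chunk_type]) + length.to_bytes(3, "little")


def chunk_with_crc(data: bytes) -> bytes:
    """Existing layout: header, 4-byte masked CRC-32C, then the data."""
    body = struct.pack("<I", masked_crc32c(data)) + data
    return chunk_header(UNCOMPRESSED, len(body)) + body


def chunk_without_crc(data: bytes) -> bytes:
    """Proposed layout (sketch): header, then the data, no checksum."""
    return chunk_header(UNCOMPRESSED_NO_CRC, len(data)) + data
```

In this sketch the checksum-free chunk is exactly four bytes shorter, and the writer skips the `crc32c` pass over the payload entirely, which is where the savings described below come from.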

In some use cases, computing the CRC-32C checksums for the data chunks
in the Snappy framing format ends up dominating execution time.
Eliminating the checksums yields a 2.5x performance improvement in our
use of Snappy to compress address trace data before writing it to
disk.

Existing readers of the Snappy framing format would be expected to
fail up front on an unknown chunk type when encountering the new
types, until updated to handle them, which should be a simple coding
change.
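
As a rough sketch of that reader behavior (illustrative Python, not code from this PR): the framing format treats types `0x02`-`0x7f` as reserved unskippable and `0x80`-`0xfd` as reserved skippable, so a reader that does not know a new unskippable type stops immediately. The ID `0x03` and the names `iter_chunks` and `UnknownChunkError` are hypothetical, and payload decoding and checksum verification are left out.

```python
class UnknownChunkError(ValueError):
    """Raised when a reader meets an unskippable chunk type it does not know."""


def iter_chunks(stream: bytes, known_types: set):
    """Yield (chunk_type, payload) pairs, failing on unknown unskippable types."""
    pos = 0
    while pos < len(stream):
        if pos + 4 > len(stream):
            raise ValueError("truncated chunk header")
        chunk_type = stream[pos]
        length = int.from_bytes(stream[pos + 1:pos + 4], "little")
        payload = stream[pos + 4:pos + 4 + length]
        if len(payload) != length:
            raise ValueError("truncated chunk body")
        pos += 4 + length
        if chunk_type in known_types:
            yield chunk_type, payload
        elif 0x80 <= chunk_type <= 0xFE:
            continue  # reserved skippable chunks and padding (0xfe)
        else:
            raise UnknownChunkError(f"unknown chunk type 0x{chunk_type:02x}")


# Stream identifier chunk followed by one chunk of the HYPOTHETICAL type 0x03.
stream = b"\xff\x06\x00\x00sNaPpY" + b"\x03\x03\x00\x00abc"

old_reader_types = {0x00, 0x01, 0xFF}        # types an existing reader knows
new_reader_types = old_reader_types | {0x03}  # the "simple coding change"

try:
    list(iter_chunks(stream, old_reader_types))
    old_reader_failed = False
except UnknownChunkError:
    old_reader_failed = True

chunks = list(iter_chunks(stream, new_reader_types))
```

Under these assumptions, the old reader raises on the first new chunk instead of silently misreading it, while the updated reader only needs the new type added to its dispatch set (plus a decode path that skips checksum verification).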

@derekbruening
Author

@pwnall PTAL
