-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: teach ALPArray to store validity only in the encoded array #1951
Draft
danking
wants to merge
1
commit into
develop
Choose a base branch
from
dk/alp-validity-in-encoded-only
base: develop
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+164
−55
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The patches are now always non-nullable. This required PrimitiveArray::patch to gracefully handle non-nullable patches when the array is nullable. I modified the benchmarks to include patch manipulation time, but notice that the test data has no patches. The benchmarks measure the overhead of `is_valid`. If we had test data where the invalid positions contained exceptional values, I would expect a modest improvement in both decompression and compression time. As discussed [in slack](https://spiraldb.slack.com/archives/C07BV3GKAJ2/p1736894376100079), the `is_valid` is expensive for two reasons: (a) creation of a Scalar and (b) conversion to an Arrow BooleanBuffer. @gatesn is working on a FilterMask change that should subsume all our boolean compression strategies into a new "BoolMask". With that in place, we can require Validity::Array to hold a "BoolMask" and thus check validity without either of the issues above. This PR ------- ``` Timer precision: 41 ns alp_compress fastest │ slowest │ median │ mean │ samples │ iters ├─ compress_alp │ │ │ │ │ │ ├─ f32 │ │ │ │ │ │ │ ├─ (100000, 0.25) 170.2 µs │ 442.1 µs │ 170.7 µs │ 173.6 µs │ 100 │ 100 │ │ ├─ (100000, 0.95) 169.3 µs │ 202.2 µs │ 170.4 µs │ 171.7 µs │ 100 │ 100 │ │ ├─ (100000, 1.0) 136.3 µs │ 147.1 µs │ 136.9 µs │ 137.2 µs │ 100 │ 100 │ │ ├─ (10000000, 0.25) 13.6 ms │ 16.46 ms │ 14.39 ms │ 14.23 ms │ 100 │ 100 │ │ ├─ (10000000, 0.95) 13.64 ms │ 17.7 ms │ 14.44 ms │ 14.38 ms │ 100 │ 100 │ │ ╰─ (10000000, 1.0) 13.55 ms │ 14.97 ms │ 14.35 ms │ 14.23 ms │ 100 │ 100 │ ╰─ f64 │ │ │ │ │ │ ├─ (100000, 0.25) 240.7 µs │ 385.7 µs │ 247.8 µs │ 249.1 µs │ 100 │ 100 │ ├─ (100000, 0.95) 240.2 µs │ 253.2 µs │ 243.5 µs │ 244.4 µs │ 100 │ 100 │ ├─ (100000, 1.0) 172.6 µs │ 184.4 µs │ 175.2 µs │ 175.5 µs │ 100 │ 100 │ ├─ (10000000, 0.25) 15.95 ms │ 24.24 ms │ 16.61 ms │ 17.25 ms │ 100 │ 100 │ ├─ (10000000, 0.95) 15.95 ms │ 21.34 ms │ 16.39 ms │ 16.85 ms │ 100 │ 100 │ ╰─ (10000000, 1.0) 15.92 ms │ 23.41 ms │ 16.46 ms │ 17.04 ms │ 100 │ 100 ╰─ decompress_alp │ │ │ │ │ ├─ f32 │ │ │ │ │ │ ├─ (100000, 0.25) 12.2 µs │ 34.7 µs │ 12.29 µs │ 12.52 µs │ 100 │ 100 │ ├─ (100000, 0.95) 12.12 µs │ 12.74 µs │ 12.35 µs │ 12.37 µs │ 100 │ 100 │ ├─ (100000, 1.0) 12.16 µs │ 12.95 µs │ 12.37 µs │ 12.4 µs │ 100 │ 100 │ ├─ (10000000, 0.25) 2.117 ms │ 4.544 ms │ 2.637 ms │ 2.674 ms │ 100 │ 100 │ ├─ (10000000, 0.95) 2.085 ms │ 4.458 ms │ 2.362 ms │ 2.504 ms │ 100 │ 100 │ ╰─ (10000000, 1.0) 2.097 ms │ 3.875 ms │ 2.229 ms │ 2.338 ms │ 100 │ 100 ╰─ f64 │ │ │ │ │ ├─ (100000, 0.25) 23.41 µs │ 25.16 µs │ 23.66 µs │ 23.68 µs │ 100 │ 100 ├─ (100000, 0.95) 23.2 µs │ 24.7 µs │ 24.06 µs │ 24.05 µs │ 100 │ 100 ├─ (100000, 1.0) 22.79 µs │ 26.08 µs │ 22.91 µs │ 22.95 µs │ 100 │ 100 ├─ (10000000, 0.25) 4.216 ms │ 6.862 ms │ 4.416 ms │ 4.568 ms │ 100 │ 100 ├─ (10000000, 0.95) 4.242 ms │ 7.647 ms │ 4.59 ms │ 4.827 ms │ 100 │ 100 ╰─ (10000000, 1.0) 4.236 ms │ 8.129 ms │ 4.377 ms │ 4.507 ms │ 100 │ 100 ``` Develop ------- I patched develop with this PR's benchmark code ``` Timer precision: 41 ns alp_compress fastest │ slowest │ median │ mean │ samples │ iters ├─ compress_alp │ │ │ │ │ │ ├─ f32 │ │ │ │ │ │ │ ├─ (100000, 0.25) 136 µs │ 279.7 µs │ 136.8 µs │ 138.5 µs │ 100 │ 100 │ │ ├─ (100000, 0.95) 136.2 µs │ 149.7 µs │ 136.9 µs │ 137.6 µs │ 100 │ 100 │ │ ├─ (100000, 1.0) 136 µs │ 148.5 µs │ 136.6 µs │ 137.1 µs │ 100 │ 100 │ │ ├─ (10000000, 0.25) 13.68 ms │ 15.06 ms │ 14.07 ms │ 14.14 ms │ 100 │ 100 │ │ ├─ (10000000, 0.95) 13.67 ms │ 19.4 ms │ 14.05 ms │ 14.18 ms │ 100 │ 100 │ │ ╰─ (10000000, 1.0) 13.66 ms │ 14.73 ms │ 13.87 ms │ 14.04 ms │ 100 │ 100 │ ╰─ f64 │ │ │ │ │ │ ├─ (100000, 0.25) 167.7 µs │ 301.4 µs │ 172.7 µs │ 173.2 µs │ 100 │ 100 │ ├─ (100000, 0.95) 167.7 µs │ 187.7 µs │ 170.7 µs │ 171.1 µs │ 100 │ 100 │ ├─ (100000, 1.0) 167.6 µs │ 183.7 µs │ 170.9 µs │ 171.1 µs │ 100 │ 100 │ ├─ (10000000, 0.25) 15.54 ms │ 21.9 ms │ 15.92 ms │ 16.05 ms │ 100 │ 100 │ ├─ (10000000, 0.95) 15.61 ms │ 16.74 ms │ 15.97 ms │ 16.03 ms │ 100 │ 100 │ ╰─ (10000000, 1.0) 15.64 ms │ 18.85 ms │ 16.1 ms │ 16.23 ms │ 100 │ 100 ╰─ decompress_alp │ │ │ │ │ ├─ f32 │ │ │ │ │ │ ├─ (100000, 0.25) 12.37 µs │ 85.49 µs │ 12.49 µs │ 13.22 µs │ 100 │ 100 │ ├─ (100000, 0.95) 12.29 µs │ 12.74 µs │ 12.43 µs │ 12.44 µs │ 100 │ 100 │ ├─ (100000, 1.0) 11.7 µs │ 11.95 µs │ 11.83 µs │ 11.81 µs │ 100 │ 100 │ ├─ (10000000, 0.25) 2.081 ms │ 3.003 ms │ 2.175 ms │ 2.263 ms │ 100 │ 100 │ ├─ (10000000, 0.95) 2.082 ms │ 2.228 ms │ 2.109 ms │ 2.124 ms │ 100 │ 100 │ ╰─ (10000000, 1.0) 2.082 ms │ 2.904 ms │ 2.13 ms │ 2.202 ms │ 100 │ 100 ╰─ f64 │ │ │ │ │ ├─ (100000, 0.25) 23.66 µs │ 25.66 µs │ 23.83 µs │ 23.86 µs │ 100 │ 100 ├─ (100000, 0.95) 23.16 µs │ 24.62 µs │ 24.04 µs │ 23.89 µs │ 100 │ 100 ├─ (100000, 1.0) 22.87 µs │ 23.49 µs │ 23.04 µs │ 23.03 µs │ 100 │ 100 ├─ (10000000, 0.25) 4.221 ms │ 5.59 ms │ 4.326 ms │ 4.469 ms │ 100 │ 100 ├─ (10000000, 0.95) 4.242 ms │ 5.319 ms │ 4.545 ms │ 4.536 ms │ 100 │ 100 ╰─ (10000000, 1.0) 4.228 ms │ 5.652 ms │ 4.342 ms │ 4.519 ms │ 100 │ 100 ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The patches are now always non-nullable.
This required PrimitiveArray::patch to gracefully handle non-nullable patches when the array is nullable.
I modified the benchmarks to include patch manipulation time, but notice that the test data has no patches. The benchmarks measure the overhead of
is_valid
. If we had test data where the invalid positions contained exceptional values, I would expect a modest improvement in both decompression and compression time.As discussed in slack, the
is_valid
is expensive for two reasons: (a) creation of a Scalar and (b) conversion to an Arrow BooleanBuffer. @gatesn is working on a FilterMask change that should subsume all our boolean compression strategies into a new "BoolMask". With that in place, we can require Validity::Array to hold a "BoolMask" and thus check validity without either of the issues above.This PR
The parameters are: (number_of_array_elements, fraction_valid).
Develop
I patched develop with this PR's benchmark code