fix: support non-Primitive encodings for views #1123

Merged: 2 commits, Oct 23, 2024

Contributor

@a10y a10y commented Oct 23, 2024

Fixes an issue that occurred when canonicalizing a null ConstantArray into VarBinView. We assumed that the views array could always be sliced immediately. Now we also handle the case where the views array is compressed (in this case, as a ConstantArray(0u8)).

I want to measure the impact on TPC-H, but I don't expect it to be significant.

Comment on lines -281 to -286
slice::from_raw_parts(
    PrimitiveArray::try_from(self.views())
        .vortex_expect("Views must be a primitive array")
        .maybe_null_slice::<u8>()
        .as_ptr() as _,
    self.views().len() / VIEW_SIZE_BYTES,
Contributor Author

This is no longer safe, since the views can be a ConstantArray. This method was mainly used:

  1. As an iterator for ArrayAccessor or chunked array packing
  2. As a point lookup for bytes_at

Instead, we split it into two methods: one that canonicalizes and wraps the Buffer in an iterator, and another that slices the views buffer to access individual elements.
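
A minimal sketch of that split, assuming a simplified stand-in for the views child. The `Views` enum and the `canonical_views` and `view_at` names are hypothetical and are not Vortex's actual API, and `VIEW_SIZE_BYTES` is assumed to be 16 purely for illustration:

```rust
/// Size of one view in bytes (assumed to be 16 for this sketch).
const VIEW_SIZE_BYTES: usize = 16;

/// Hypothetical stand-in for the views child array, which may or may not
/// be stored as a flat primitive buffer.
enum Views {
    /// Flat bytes: a slice can be taken directly.
    Primitive(Vec<u8>),
    /// A single repeated byte, e.g. all zeros for an all-null array.
    Constant { byte: u8, len: usize },
}

impl Views {
    /// Total length in bytes, regardless of encoding.
    fn len(&self) -> usize {
        match self {
            Views::Primitive(bytes) => bytes.len(),
            Views::Constant { len, .. } => *len,
        }
    }

    /// Bulk path: canonicalize once into a flat buffer, then split it
    /// into fixed-size views that can be iterated.
    fn canonical_views(&self) -> Vec<[u8; VIEW_SIZE_BYTES]> {
        let flat: Vec<u8> = match self {
            Views::Primitive(bytes) => bytes.clone(),
            Views::Constant { byte, len } => vec![*byte; *len],
        };
        flat.chunks_exact(VIEW_SIZE_BYTES)
            .map(|chunk| {
                let mut view = [0u8; VIEW_SIZE_BYTES];
                view.copy_from_slice(chunk);
                view
            })
            .collect()
    }

    /// Point-lookup path: produce only the bytes of one view, without
    /// materializing the whole buffer.
    fn view_at(&self, index: usize) -> Option<[u8; VIEW_SIZE_BYTES]> {
        let start = index.checked_mul(VIEW_SIZE_BYTES)?;
        let end = start.checked_add(VIEW_SIZE_BYTES)?;
        if end > self.len() {
            return None;
        }
        let mut view = [0u8; VIEW_SIZE_BYTES];
        match self {
            Views::Primitive(bytes) => view.copy_from_slice(&bytes[start..end]),
            Views::Constant { byte, .. } => view = [*byte; VIEW_SIZE_BYTES],
        }
        Some(view)
    }
}
```

The bulk path pays for one canonicalization up front, while the point lookup avoids it entirely, which is roughly the trade-off between the two use cases listed above.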

@a10y
Contributor Author

a10y commented Oct 23, 2024

I've confirmed this passes compress_noci locally so that is nice.

However, this leads to a pretty substantial perf regression on TPC-H:

image

Wanna fix that before merging

@robert3005
Member

Eh, the compiler really optimizes slices (and iteration over them) so anything that doesn't lower to a slice [...] will be slower
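
To illustrate the point about slices, here is a small sketch contrasting iteration over a contiguous slice with iteration through a type-erased iterator. The function names are hypothetical, and this is an illustration of the general idea rather than a measurement of the code in this PR:

```rust
/// Iterating a contiguous slice: the compiler sees the length and stride
/// up front, which typically lets it unroll or vectorize the loop.
fn sum_slice(values: &[u8]) -> u64 {
    values.iter().map(|&v| u64::from(v)).sum()
}

/// Iterating through a type-erased iterator: every `next()` is a dynamic
/// call, which generally blocks those optimizations.
fn sum_dyn(values: Box<dyn Iterator<Item = u8> + '_>) -> u64 {
    let mut total = 0u64;
    for v in values {
        total += u64::from(v);
    }
    total
}
```

Both functions compute the same result; any difference would show up only in the generated code and in timing.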

@a10y
Contributor Author

a10y commented Oct 23, 2024

I continue to see wild swings in the benchmark results on my laptop. For example, when I run q19 I get anywhere from 230ms to 300ms, though the mode hovers around 230ms, which would not be a regression.

@a10y a10y marked this pull request as ready for review October 23, 2024 18:20
@a10y a10y added the benchmark Run benchmarks on this branch label Oct 23, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Oct 23, 2024
Contributor

@github-actions github-actions bot left a comment

Vortex bytes_at

| Benchmark suite | Current: 4dff5e9 | Previous: 3b1079d | Ratio |
|-|-|-|-|
| bytes_at/array_data | 708.2584310756173 ns (1.3622408435577427) | 717.974097777377 ns (1.2064181420049067) | 0.99 |
| bytes_at/array_view | 491.99578596667857 ns (2.183868599015028) | 196.86009549623378 ns (0.3268893904277803) | 2.50 |

This comment was automatically generated by workflow using github-action-benchmark.
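
The Ratio column is consistent with the current timing divided by the previous one, so a value above 1 means the current commit is slower. A small check against the bytes_at rows above (the `ratio` helper is hypothetical):

```rust
/// Ratio of current to previous timing; above 1.0 means the current
/// commit is slower than the previous one.
fn ratio(current_ns: f64, previous_ns: f64) -> f64 {
    current_ns / previous_ns
}
```

For bytes_at/array_view this gives 491.99 / 196.86, which rounds to the reported 2.50.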

Contributor

@github-actions github-actions bot left a comment

DataFusion

| Benchmark suite | Current: 4dff5e9 | Previous: 3b1079d | Ratio |
|-|-|-|-|
| arrow/planning | 777959.4289029486 ns (2419.9151670023566) | 790221.5591252199 ns (1515.1132800689666) | 0.98 |
| arrow/exec | 1388213.6928882462 ns (6014.305378110963) | 1379330.4080514824 ns (7232.249170157942) | 1.01 |
| vortex-pushdown-compressed/planning | 484633.30248162406 ns (1246.1335084596649) | 490919.0073508655 ns (1464.4233101196005) | 0.99 |
| vortex-pushdown-compressed/exec | 2622337.8010526304 ns (12157.105532894842) | 2590935.5585 ns (15390.568837500177) | 1.01 |
| vortex-pushdown-uncompressed/planning | 486161.28049620317 ns (1047.238953312626) | 488103.04912544315 ns (1212.0697022331588) | 1.00 |
| vortex-pushdown-uncompressed/exec | 2555396.430500001 ns (5724.577356250491) | 2600917.4195000003 ns (5384.988850000082) | 0.98 |
| vortex-nopushdown-compressed/planning | 798827.5596032212 ns (2579.4166713834857) | 803374.997639911 ns (2408.809872436512) | 0.99 |
| vortex-nopushdown-compressed/exec | 3006773.4235294135 ns (24751.86795588187) | 3024607.9147058814 ns (33552.456676470116) | 0.99 |
| vortex-nopushdown-uncompressed/planning | 793480.6676030105 ns (1555.1201903469628) | 808596.3523564313 ns (3737.396927827387) | 0.98 |
| vortex-nopushdown-uncompressed/exec | 6664878.39875 ns (152004.43776562484) | 4742324.754545457 ns (37411.49940909026) | 1.41 |

This comment was automatically generated by workflow using github-action-benchmark.

Contributor

@github-actions github-actions bot left a comment

Random Access

| Benchmark suite | Current: 4dff5e9 | Previous: 3b1079d | Ratio |
|-|-|-|-|
| random-access/vortex-tokio-local-disk | 1022545.1482356378 ns (6535.874711911834) | 997049.2092517357 ns (6893.740383354074) | 1.03 |
| random-access/vortex-local-fs | 1127930.7084399557 ns (8080.158684606082) | 1125890.171536007 ns (7411.739631096018) | 1.00 |
| random-access/parquet-tokio-local-disk | 238319905.1 ns (3166642.7970833033) | 245102986.73333335 ns (5289005.529583305) | 0.97 |

This comment was automatically generated by workflow using github-action-benchmark.

@robert3005
Member

You could get yourself two methods that check array encoding. I think this should be equivalent to what we had before but maybe the compiler is not quite capable

@a10y
Contributor Author

a10y commented Oct 23, 2024

@robert3005 not sure I follow

@robert3005
Member

I’m saying that, on top of the existing method, you could have an unsafe method that does what was happening previously, and the caller would have to check the encoding. I think it really only matters in canonicalization?
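
One way to read this suggestion, as a rough sketch only: keep an unsafe fast path that reinterprets the flat bytes as views, and make the caller check the encoding first. The `Views` enum, `is_primitive`, `view_slice_unchecked`, and `first_bytes` are hypothetical names for illustration, not Vortex's API, and `VIEW_SIZE_BYTES` is assumed to be 16:

```rust
use std::slice;

/// Assumed view width for this sketch.
const VIEW_SIZE_BYTES: usize = 16;

/// Hypothetical stand-in for the views child and its encoding.
enum Views {
    Primitive(Vec<u8>),
    Constant { byte: u8, len: usize },
}

impl Views {
    /// Encoding check the caller performs before taking the fast path.
    fn is_primitive(&self) -> bool {
        matches!(self, Views::Primitive(_))
    }

    /// Fast path: reinterpret the flat bytes as fixed-size views.
    ///
    /// # Safety
    /// The caller must have checked that `self` is `Views::Primitive`.
    unsafe fn view_slice_unchecked(&self) -> &[[u8; VIEW_SIZE_BYTES]] {
        match self {
            // SAFETY: `[u8; N]` has alignment 1 and the length is rounded
            // down to a whole number of views.
            Views::Primitive(bytes) => unsafe {
                slice::from_raw_parts(
                    bytes.as_ptr() as *const [u8; VIEW_SIZE_BYTES],
                    bytes.len() / VIEW_SIZE_BYTES,
                )
            },
            // SAFETY: ruled out by the caller's encoding check.
            Views::Constant { .. } => unsafe { std::hint::unreachable_unchecked() },
        }
    }
}

/// Caller-side pattern: check the encoding, then pick the fast or slow path.
fn first_bytes(views: &Views) -> Vec<u8> {
    if views.is_primitive() {
        // SAFETY: the encoding was checked on the line above.
        let view_slice = unsafe { views.view_slice_unchecked() };
        view_slice.iter().map(|view| view[0]).collect()
    } else {
        match views {
            Views::Constant { byte, len } => vec![*byte; *len / VIEW_SIZE_BYTES],
            Views::Primitive(_) => unreachable!("handled by the fast path"),
        }
    }
}
```

The cost of this design is that correctness now depends on every caller performing the check, which is the kind of specialization the discussion goes on to weigh.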

Contributor

@github-actions github-actions bot left a comment

TPC-H

Benchmark suite Current: 4dff5e9 Previous: e8a2bf5 Ratio
tpch_q1/vortex-in-memory-no-pushdown 566335685.4 ns (2264038.131250024) 496858193.7 ns (2602974.7524999976) 1.14
tpch_q1/vortex-in-memory-pushdown 452734028.8 ns (1431873.0287500024) 512399728.1 ns (3558853.950000018) 0.88
tpch_q1/arrow 550938310.2 ns (1740945.71875) 474851458.9 ns (2075509.6962499917) 1.16
tpch_q1/parquet 700125826.5 ns (3052222.1437500715) 697232133.8 ns (2317283.3712500334) 1.00
tpch_q1/vortex-file-compressed 515667475.7 ns (3311516.2475000024) 572696042 ns (2695527.8499999642) 0.90
tpch_q1/vortex-file-uncompressed 529599273.9 ns (2289386.150000006) 517758970.7 ns (2101478.2962499857) 1.02
tpch_q2/vortex-in-memory-no-pushdown 119266492.87436509 ns (1011342.8644355312) 122931648.83579364 ns (1184321.496663697) 0.97
tpch_q2/vortex-in-memory-pushdown 118327746.9370635 ns (752083.4700139016) 122998920.91892858 ns (2056271.7281250134) 0.96
tpch_q2/arrow 116612715.68043652 ns (741615.0678174719) 121609330.84353176 ns (1179981.5959806591) 0.96
tpch_q2/parquet 149044322.59162697 ns (1229332.109722212) 154167573.30035716 ns (833637.984732151) 0.97
tpch_q2/vortex-file-compressed 164381253.5577381 ns (1186064.9234300554) 181420927.65884918 ns (2230260.9308953434) 0.91
tpch_q2/vortex-file-uncompressed 163401177.88940477 ns (719518.7253125161) 179982923.24134922 ns (2606398.7811785787) 0.91
tpch_q3/vortex-in-memory-no-pushdown 156401426.53134924 ns (1264394.8536408693) 168585669.85849208 ns (1719372.4701914638) 0.93
tpch_q3/vortex-in-memory-pushdown 176417285.19321433 ns (995510.7910982072) 185539050.46666667 ns (1469873.0012500137) 0.95
tpch_q3/arrow 146979453.61079365 ns (716260.7854117006) 153025299.1290079 ns (983390.7308581471) 0.96
tpch_q3/parquet 332358032.3 ns (1373420.8856250048) 339625679.2 ns (2156728.1599999964) 0.98
tpch_q3/vortex-file-compressed 301080970.75 ns (2032361.4281250238) 303119066.15 ns (1839313.4356250167) 0.99
tpch_q3/vortex-file-uncompressed 243165003.5 ns (1397733.0012500137) 258796456.5 ns (2185567.8193749934) 0.94
tpch_q4/vortex-in-memory-no-pushdown 109884710.37480159 ns (544230.6131577417) 110929813.21170636 ns (560840.1128596142) 0.99
tpch_q4/vortex-in-memory-pushdown 129346848.66373017 ns (574563.9436527789) 128851363.83539681 ns (504112.7492380887) 1.00
tpch_q4/arrow 184954381.90000004 ns (1710520.227916658) 99690606.24726191 ns (633642.8505952358) 1.86
tpch_q4/parquet 204518099.1666667 ns (1329817.5229166448) 207030592 ns (1820389.6016666591) 0.99
tpch_q4/vortex-file-compressed 265242753.8 ns (1123142.3318749964) 253929579.25 ns (1429490) 1.04
tpch_q4/vortex-file-uncompressed 208543308.09999996 ns (2266364.5287499875) 202901853 ns (1240695.7595833242) 1.03
tpch_q5/vortex-in-memory-no-pushdown 282926029.7 ns (1238182.4399999976) 296659918.2 ns (1810368.585624963) 0.95
tpch_q5/vortex-in-memory-pushdown 288707952 ns (1973540.071875006) 303712736 ns (2551947.8787499666) 0.95
tpch_q5/arrow 269874597.95 ns (1856923.3837500066) 289773157.1 ns (3269323.2143749893) 0.93
tpch_q5/parquet 443423683.4 ns (2747039.025000006) 449956995.1 ns (2221295.025000006) 0.99
tpch_q5/vortex-file-compressed 371238116.6 ns (3075967.2337499857) 368195834.9 ns (3254630.2931250036) 1.01
tpch_q5/vortex-file-uncompressed 341243024.7 ns (3206195.5056250095) 358040640.4 ns (2901553.1837500036) 0.95
tpch_q6/vortex-in-memory-no-pushdown 34117962.90771164 ns (158795.83238888532) 36962499.711309515 ns (256376.831805557) 0.92
tpch_q6/vortex-in-memory-pushdown 73188521.35555556 ns (127946.31730555743) 73110425.8229762 ns (135212.1767589301) 1.00
tpch_q6/arrow 25929169.588998012 ns (252229.75238591246) 26701400.5934623 ns (242913.27273809724) 0.97
tpch_q6/parquet 137755075.45722222 ns (574591.6049097031) 138952108.97166666 ns (425453.83608332276) 0.99
tpch_q6/vortex-file-compressed 70789062.5315873 ns (304210.8354761824) 70100465.15450397 ns (215777.75718551874) 1.01
tpch_q6/vortex-file-uncompressed 173518362.02666667 ns (621230.6512499899) 170841284.3947619 ns (666149.0226845443) 1.02
tpch_q7/vortex-in-memory-no-pushdown 548430227 ns (4588948.486249983) 557500054.2 ns (7486243.068749964) 0.98
tpch_q7/vortex-in-memory-pushdown 578706992.6 ns (4015224.0250000358) 587219412.1 ns (5801932.550000012) 0.99
tpch_q7/arrow 536306682.6 ns (5481334.5) 544433191.5 ns (4200553.100000024) 0.99
tpch_q7/parquet 670218954.2 ns (6240897.006250024) 719078057.4 ns (6956319.29125005) 0.93
tpch_q7/vortex-file-compressed 693447074.8 ns (3942885.8499999642) 737836528 ns (8747501.86374998) 0.94
tpch_q7/vortex-file-uncompressed 667440043.8 ns (3906551.8124999404) 700322878 ns (13113821.689999998) 0.95
tpch_q8/vortex-in-memory-no-pushdown 220995154.33333334 ns (926146.715416655) 225801623.76666665 ns (1343238.9379166663) 0.98
tpch_q8/vortex-in-memory-pushdown 226591644.26666665 ns (1446436.520416677) 232328521.2 ns (1596291.8474999964) 0.98
tpch_q8/arrow 208925037.46666664 ns (1123084.2237499803) 210408762.70000002 ns (1570284.378333345) 0.99
tpch_q8/parquet 478454751.35 ns (2191826.7099999785) 499348132.7 ns (4146459.8612499833) 0.96
tpch_q8/vortex-file-compressed 307407136.7 ns (3143558.25) 316540657.2 ns (3401772.7462500036) 0.97
tpch_q8/vortex-file-uncompressed 295971658.2 ns (1682711.824999988) 316131361.15 ns (3565087.253125012) 0.94
tpch_q9/vortex-in-memory-no-pushdown 427162099.8 ns (2940204.3668750226) 432429641.7 ns (7629560.506249964) 0.99
tpch_q9/vortex-in-memory-pushdown 422702817.45 ns (5589998.897500008) 428732613.05 ns (4286218.75) 0.99
tpch_q9/arrow 397502078.65 ns (4154333.7612499893) 391596453.25 ns (2311446.815625012) 1.02
tpch_q9/parquet 703387005.5 ns (4805994.850000024) 706688494.1 ns (3243822.160000026) 1.00
tpch_q9/vortex-file-compressed 496545573.35 ns (5226849.657499999) 489020425.45 ns (6327085.983125001) 1.02
tpch_q9/vortex-file-uncompressed 488594729.35 ns (5173464.028124988) 499460872.9 ns (8193462.933750004) 0.98
tpch_q10/vortex-in-memory-no-pushdown 282667443.7 ns (3284445.6037499905) 271384609.45 ns (2677135.490625024) 1.04
tpch_q10/vortex-in-memory-pushdown 319002023.15 ns (3321430.451249987) 300897407.8 ns (1369469.775000006) 1.06
tpch_q10/arrow 275430571.4 ns (3079234.824999988) 262348668.1 ns (3921601.4993750006) 1.05
tpch_q10/parquet 510065306 ns (3406017.824374974) 516009912.9 ns (3761289.8350000083) 0.99
tpch_q10/vortex-file-compressed 425502410.45 ns (4946143.715624988) 491994013.6 ns (22188943.359999985) 0.86
tpch_q10/vortex-file-uncompressed 410803600.8 ns (2289278.2212499976) 456263672.2 ns (8820877.47312498) 0.90
tpch_q11/vortex-in-memory-no-pushdown 178316190.30829364 ns (1893781.964166671) 248410046.16666666 ns (3565511.1591666937) 0.72
tpch_q11/vortex-in-memory-pushdown 178281211.29694444 ns (1101750.8122222275) 245322206.13333336 ns (5092440.3783333) 0.73
tpch_q11/arrow 175703596.81456348 ns (2017177.2883333415) 239046864.2333333 ns (4558627.22708337) 0.74
tpch_q11/parquet 183229202.16182542 ns (1441414.31711708) 267303994.15 ns (7349397.921249986) 0.69
tpch_q11/vortex-file-compressed 259060882.75 ns (2345729.6400000006) 374936081.65 ns (7075873.791249961) 0.69
tpch_q11/vortex-file-uncompressed 262340077.6 ns (3002619.8793750107) 362400842.9 ns (5308243.284374982) 0.72
tpch_q12/vortex-in-memory-no-pushdown 229766000.6666667 ns (773763.4499999881) 234860590 ns (2880912.315416649) 0.98
tpch_q12/vortex-in-memory-pushdown 278801149.25 ns (961674.6993749738) 260164201.55 ns (3160403.574999988) 1.07
tpch_q12/arrow 186392367.53333336 ns (604136.3245833218) 182281093.76666668 ns (1529247.1999999732) 1.02
tpch_q12/parquet 334488061.2 ns (784486.7899999917) 373213779.05 ns (2562045.125) 0.90
tpch_q12/vortex-file-compressed 398130072.4 ns (1810861.849999994) 623881934.8 ns (6690392.074999988) 0.64
tpch_q12/vortex-file-uncompressed 413627965.85 ns (1759311.5806249976) 404514434.15 ns (4214349.900000006) 1.02
tpch_q13/vortex-in-memory-no-pushdown 169339103.88654763 ns (1429837.9698794782) 242211443.63333336 ns (2810251.2712499946) 0.70
tpch_q13/vortex-in-memory-pushdown 169660139.58980158 ns (1160805.6279905885) 242466250 ns (4691416.808750004) 0.70
tpch_q13/arrow 166735813.05107144 ns (1405568.3984642625) 236174184.33333334 ns (3143065.627916664) 0.71
tpch_q13/parquet 317681930.9 ns (1358871.337500006) 439478837.5 ns (7199786.123124987) 0.72
tpch_q13/vortex-file-compressed 199714498.63333336 ns (1513290.9945833385) 263761832.15 ns (3236403.262500003) 0.76
tpch_q13/vortex-file-uncompressed 205059287.79999998 ns (1111704.1024999917) 259183041.05 ns (3921578.1318749934) 0.79
tpch_q14/vortex-in-memory-no-pushdown 41552019.81275132 ns (411656.42037037015) 59090715.40587302 ns (1457002.635277778) 0.70
tpch_q14/vortex-in-memory-pushdown 72914783.72021826 ns (316158.26194369793) 95154309.49095239 ns (1837986.5249999985) 0.77
tpch_q14/arrow 33777570.41742063 ns (249550.93843006156) 49661404.367698416 ns (1186297.5619523823) 0.68
tpch_q14/parquet 224867538.8666667 ns (947894.1770833284) 260143722.6 ns (2617170.920000002) 0.86
tpch_q14/vortex-file-compressed 117212363.29761906 ns (504656.39719641954) 145157297.6920238 ns (2191605.497784227) 0.81
tpch_q14/vortex-file-uncompressed 133654442.27761905 ns (524689.9304761887) 166803202.16202384 ns (2743562.2645833343) 0.80
tpch_q15/vortex-in-memory-no-pushdown 69194037.57757936 ns (345876.91210764647) 97543698.12071428 ns (3811881.724330358) 0.71
tpch_q15/vortex-in-memory-pushdown 102051015.98896825 ns (545445.7108085304) 130867582.60472222 ns (2252614.2953472286) 0.78
tpch_q15/arrow 56533203.80867064 ns (528867.2160820924) 79101177.35978174 ns (2033580.4924496487) 0.71
tpch_q15/parquet 298189852.7 ns (1249349.949999988) 357462679.4 ns (2294735.419375032) 0.83
tpch_q15/vortex-file-compressed 219940136.9 ns (703391.5320833623) 278555601.15 ns (4687207.900000006) 0.79
tpch_q15/vortex-file-uncompressed 251572079.9 ns (1060828.0243750066) 322510851.05 ns (5087468.290625036) 0.78
tpch_q16/vortex-in-memory-no-pushdown 106076175.93686506 ns (270004.35918700695) 134602053.6325397 ns (1332997.8143749982) 0.79
tpch_q16/vortex-in-memory-pushdown 118225620.92952383 ns (692116.8500000015) 154460726.12305555 ns (3262245.864163175) 0.77
tpch_q16/arrow 104568760.13579366 ns (606941.4019126967) 139287119.14916664 ns (7283811.355802074) 0.75
tpch_q16/parquet 114289208.02952382 ns (583125.0278571397) 155549358.7791667 ns (4413715.093333334) 0.73
tpch_q16/vortex-file-compressed 132874416.88583331 ns (679290.8591666743) 176560583.34765872 ns (4735213.216140881) 0.75
tpch_q16/vortex-file-uncompressed 132643387.40916666 ns (793355.3152395859) 175860272.73349205 ns (4666078.278177589) 0.75
tpch_q17/vortex-in-memory-no-pushdown 498345946.5 ns (5872183.747500002) 886250079.7 ns (14649632.150000036) 0.56
tpch_q17/vortex-in-memory-pushdown 578935204.6 ns (6090160.096249998) 1004202188.2 ns (13892720.852500021) 0.58
tpch_q17/arrow 493812384.65 ns (7563382.558749974) 890048848.1 ns (14903918.600000024) 0.55
tpch_q17/parquet 628501567.2 ns (4661316.86500001) 995268242.1 ns (25283188.216250002) 0.63
tpch_q17/vortex-file-compressed 612792689.9 ns (4828017.157500029) 897375605.1 ns (15112599.800000012) 0.68
tpch_q17/vortex-file-uncompressed 594805764.8 ns (4172391.2349999547) 913245390.5 ns (17272778.069999993) 0.65
tpch_q18/vortex-in-memory-no-pushdown 1047204943.4 ns (8328714.766250014) 1652359307.5 ns (11207843.828749895) 0.63
tpch_q18/vortex-in-memory-pushdown 1048673877.6 ns (7599721.732500017) 1656838563.1 ns (25127484.75750017) 0.63
tpch_q18/arrow 1030317315.6 ns (8994895.867499948) 1612579005.2 ns (17381190.068750024) 0.64
tpch_q18/parquet 1188743490 ns (6169362.44749999) 1855190173.6 ns (33566005.44124997) 0.64
tpch_q18/vortex-file-compressed 1115937712.8 ns (7827238.078750014) 1655932163.5 ns (26524335.100000024) 0.67
tpch_q18/vortex-file-uncompressed 1076786040.3 ns (7218632.709999919) 1690322158.2 ns (11057512.153749943) 0.64
tpch_q19/vortex-in-memory-no-pushdown 178195884.3736905 ns (227495.8034836352) 171194590.41686508 ns (927672.6262896806) 1.04
tpch_q19/vortex-in-memory-pushdown 297974388.8 ns (793656.8131250143) 263291465 ns (1900072.951249987) 1.13
tpch_q19/arrow 164078965.09857142 ns (376039.8984553367) 149200699.80583334 ns (1511146.0254166573) 1.10
tpch_q19/parquet 451375303.2 ns (691238.3662500083) 512331562.4 ns (3838719.75) 0.88
tpch_q19/vortex-file-compressed 346670517.1 ns (1883176.0056249797) 678641035.1 ns (5280641.90625) 0.51
tpch_q19/vortex-file-uncompressed 399273280.95 ns (1651251.7712499797) 375027090.45 ns (4528329.668749988) 1.06
tpch_q20/vortex-in-memory-no-pushdown 236536792.7666667 ns (1380833.1833333075) 355327056.4 ns (7235762.118124992) 0.67
tpch_q20/vortex-in-memory-pushdown 253502338.85 ns (2999846.5068749934) 375103968.25 ns (7165945.238749981) 0.68
tpch_q20/arrow 231359424 ns (1471685.2525000274) 341955229.05 ns (4394651.800000012) 0.68
tpch_q20/parquet 350191606.05 ns (1917368.7081249952) 481737621.65 ns (8824038.125625014) 0.73
tpch_q20/vortex-file-compressed 344849096.05 ns (2097900.5462500155) 490913706.2 ns (7274789.5) 0.70
tpch_q20/vortex-file-uncompressed 363133703.35 ns (1740499.824999988) 506868405.2 ns (9090580.346249998) 0.72
tpch_q21/vortex-in-memory-no-pushdown 875079883.4 ns (8249457.899999976) 1274553567.8 ns (20404614.254999995) 0.69
tpch_q21/vortex-in-memory-pushdown 909013730.2 ns (5600166.356249988) 1341093376.6 ns (22271219.026250005) 0.68
tpch_q21/arrow 853873633.1 ns (4461299.816250026) 1245127568.9 ns (18511430.79125011) 0.69
tpch_q21/parquet 969898820.5 ns (5013599.599999964) 1417374119.5 ns (18871539.378749967) 0.68
tpch_q21/vortex-file-compressed 1207155028.8 ns (8075577.897500038) 1652858074.1 ns (15455272.492499948) 0.73
tpch_q21/vortex-file-uncompressed 1064370007.5 ns (4060862.664999962) 1522393309 ns (18039752.753749967) 0.70
tpch_q22/vortex-in-memory-no-pushdown 76814383.31785713 ns (119861.83134970814) 78555381.00246033 ns (1006435.1539355144) 0.98
tpch_q22/vortex-in-memory-pushdown 76197988.66253969 ns (200792.9768293649) 71390285.5113492 ns (3391152.1570634916) 1.07
tpch_q22/arrow 75128350.72349207 ns (213899.22990475595) 78166457.82813492 ns (1215018.0291989148) 0.96
tpch_q22/parquet 95006838.09448412 ns (592526.9788546711) 121589548.33273809 ns (2869093.265714295) 0.78
tpch_q22/vortex-file-compressed 120191489.36793652 ns (410456.6429920569) 137283016.99503967 ns (2939342.282343745) 0.88
tpch_q22/vortex-file-uncompressed 119720668.10642858 ns (703105.0009226128) 134274281.11099207 ns (2257058.6796408743) 0.89

This comment was automatically generated by workflow using github-action-benchmark.

@a10y
Contributor Author

a10y commented Oct 23, 2024

I'm a bit hesitant to specialize into_canonical based on whether the child is already canonicalized, since we don't really do that anywhere else.

Also, I am getting much more consistent results running locally now than I was previously. I really don't know what's up with that. One change is that I'm now consistently running with taskpolicy -c utility, which on macOS should force the process tree to use Performance Cores instead of Efficiency Cores.

Contributor

@github-actions github-actions bot left a comment

Vortex Compression

Benchmark suite Current: 4dff5e9 Previous: e8a2bf5 Ratio
compress time/taxi 1212824288.7 ns (2048013.2999999523) 1185757225.6 ns (9006289.654999971) 1.02
compress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd compress time/taxi 1797379845.4 ns (5855986.25) 1796696829.5 ns (5758917.100000024) 1.00
parquet_rs-zstd compress time/taxi throughput 470808924 bytes 470808924 bytes 1
decompress time/taxi 407151982.3 ns (1150718.150000006) 430395653.7 ns (4437686.099999994) 0.95
decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
parquet_rs-zstd decompress time/taxi 310576863.2 ns (671358.7106250226) 316308581.65 ns (737391.6674999893) 0.98
parquet_rs-zstd decompress time/taxi throughput 470808924 bytes 470808924 bytes 1
vortex:parquet-zstd size/taxi 0.9451901786267926 ratio 0.9404634421839141 ratio 1.01
vortex:raw size/taxi 0.1123451432284236 ratio 0.11178359057612086 ratio 1.01
vortex size/taxi 52893096 bytes 52628712 bytes 1.01
compress time/AirlineSentiment 685080.6924066119 ns (2093.635176259151) 345841.9193756012 ns (1255.4738823465304) 1.98
compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd compress time/AirlineSentiment 56355.373778355155 ns (247.60012440060746) 55311.27757209518 ns (146.48514490044545) 1.02
parquet_rs-zstd compress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
decompress time/AirlineSentiment 39504.57695798319 ns (76.55545502645691) 36324.921224111415 ns (119.82388920681115) 1.09
decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
parquet_rs-zstd decompress time/AirlineSentiment 31972.720953151897 ns (127.6207133437656) 33808.58412979547 ns (71.5679512262468) 0.95
parquet_rs-zstd decompress time/AirlineSentiment throughput 2020 bytes 2020 bytes 1
vortex:parquet-zstd size/AirlineSentiment 6.196483971044468 ratio 5.3360910031023785 ratio 1.16
vortex:raw size/AirlineSentiment 2.9663366336633663 ratio 2.5544554455445545 ratio 1.16
vortex size/AirlineSentiment 5992 bytes 5160 bytes 1.16
compress time/Arade 2211194476.7 ns (5526418.127499819) 1740851367.7 ns (5215863.825000048) 1.27
compress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd compress time/Arade 3028143823.2 ns (15291987.848750114) 3037810424.1 ns (12284171.221249819) 1.00
parquet_rs-zstd compress time/Arade throughput 787023760 bytes 787023760 bytes 1
decompress time/Arade 839297248.4 ns (3988917.4487499595) 881327522.7 ns (4294002.328750014) 0.95
decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
parquet_rs-zstd decompress time/Arade 653682463.7 ns (2012521.5975000262) 708498860.8 ns (3013795.5912500024) 0.92
parquet_rs-zstd decompress time/Arade throughput 787023760 bytes 787023760 bytes 1
vortex:parquet-zstd size/Arade 0.47890649124129325 ratio 0.4783576380591069 ratio 1.00
vortex:raw size/Arade 0.1858328749820717 ratio 0.18561973783358204 ratio 1.00
vortex size/Arade 146254888 bytes 146087144 bytes 1.00
compress time/Bimbo 11006644524.7 ns (42272733.29749966) 10141981586.2 ns (34211321.55000019) 1.09
compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd compress time/Bimbo 21948391875.6 ns (67587018.70000076) 22028921103 ns (71036671.20000076) 1.00
parquet_rs-zstd compress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
decompress time/Bimbo 4984171203.9 ns (75178959.19624996) 4752944291.1 ns (17437573.10000038) 1.05
decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
parquet_rs-zstd decompress time/Bimbo 2622286100.5 ns (6639336.200000048) 2640856546.5 ns (7964505.649999857) 0.99
parquet_rs-zstd decompress time/Bimbo throughput 7121333608 bytes 7121333608 bytes 1
vortex:parquet-zstd size/Bimbo 1.2342947693780078 ratio 1.237048831798617 ratio 1.00
vortex:raw size/Bimbo 0.06727570120599242 ratio 0.06742581241533095 ratio 1.00
vortex size/Bimbo 479092712 bytes 480161704 bytes 1.00
compress time/CMSprovider 11921960739.1 ns (34508233.44124985) 10785399512.3 ns (26747964.14750099) 1.11
compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd compress time/CMSprovider 18856264259.5 ns (107270901.88125038) 19133771162.2 ns (78319086.94374847) 0.99
parquet_rs-zstd compress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
decompress time/CMSprovider 7248438199.1 ns (82012425.26375008) 7677925038.3 ns (62663527.93999958) 0.94
decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
parquet_rs-zstd decompress time/CMSprovider 4843772287 ns (15514357.272500038) 7163385081.3 ns (53850493.32625008) 0.68
parquet_rs-zstd decompress time/CMSprovider throughput 5149123964 bytes 5149123964 bytes 1
vortex:parquet-zstd size/CMSprovider 1.2016422792918104 ratio 1.1156992506632968 ratio 1.08
vortex:raw size/CMSprovider 0.17958061652135435 ratio 0.16673530915209483 ratio 1.08
vortex size/CMSprovider 924682856 bytes 858540776 bytes 1.08
compress time/Euro2016 2663634496.8 ns (3128941.263749838) 2549121530.6 ns (5637986.262500048) 1.04
compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd compress time/Euro2016 1542526096.6 ns (2874438.2087500095) 1604360430 ns (7553769.254999876) 0.96
parquet_rs-zstd compress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
decompress time/Euro2016 292149756.65 ns (1368860.224999994) 375466993.85 ns (4247326.450000018) 0.78
decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
parquet_rs-zstd decompress time/Euro2016 484114007.6 ns (2035620.9750000238) 549549952.3 ns (6334449.348749995) 0.88
parquet_rs-zstd decompress time/Euro2016 throughput 393253221 bytes 393253221 bytes 1
vortex:parquet-zstd size/Euro2016 1.4383489987803337 ratio 1.438261253727756 ratio 1.00
vortex:raw size/Euro2016 0.4348474490943839 ratio 0.43482092165749864 ratio 1.00
vortex size/Euro2016 171005160 bytes 170994728 bytes 1.00
compress time/Food 1091922775.8 ns (2506663.7950000763) 920833431.3 ns (4865458.1875) 1.19
compress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd compress time/Food 1083319511.6 ns (3542324.4487501383) 1108020702.6 ns (4557758.612499952) 0.98
parquet_rs-zstd compress time/Food throughput 332718229 bytes 332718229 bytes 1
decompress time/Food 199410609.8333333 ns (843364.0945833623) 296221990 ns (4334934.709375024) 0.67
decompress time/Food throughput 332718229 bytes 332718229 bytes 1
parquet_rs-zstd decompress time/Food 213283147.86666664 ns (298656.57541666925) 264842432.05 ns (1285652.7287499905) 0.81
parquet_rs-zstd decompress time/Food throughput 332718229 bytes 332718229 bytes 1
vortex:parquet-zstd size/Food 1.2389367897825485 ratio 1.2376719820663602 ratio 1.00
vortex:raw size/Food 0.1349090374005327 ratio 0.13477131125268163 ratio 1.00
vortex size/Food 44886696 bytes 44840872 bytes 1.00
compress time/HashTags 2573898061.3 ns (3817377.8525002003) 2600815786.9 ns (24140547.423749924) 0.99
compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd compress time/HashTags 2507278500.9 ns (3274795.53125) 2509948798.2 ns (22641953.28250003) 1.00
parquet_rs-zstd compress time/HashTags throughput 804495592 bytes 804495592 bytes 1
decompress time/HashTags 607326544 ns (3488551.649999976) 653761586.5 ns (11210621.683750033) 0.93
decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
parquet_rs-zstd decompress time/HashTags 790326759.7 ns (4089888.9000000358) 927206782 ns (15067763) 0.85
parquet_rs-zstd decompress time/HashTags throughput 804495592 bytes 804495592 bytes 1
vortex:parquet-zstd size/HashTags 1.6563035357058875 ratio 1.5877552591936768 ratio 1.04
vortex:raw size/HashTags 0.2758095012657322 ratio 0.2643947662549778 ratio 1.04
vortex size/HashTags 221887528 bytes 212704424 bytes 1.04
compress time/TPC-H l_comment chunked without fsst 4027689275.6 ns (32622993.236249924) 240046710.73333335 ns (2455574.450000003) 16.78
compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 183010921 bytes 1.36
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst 914728202.6 ns (1836864.4087499976) 903099197.6 ns (3021412.815000057) 1.01
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 183010921 bytes 1.36
decompress time/TPC-H l_comment chunked without fsst 117351179.68603174 ns (1863893.5318125188) 77316087.2052381 ns (1024381.7883913592) 1.52
decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 183010921 bytes 1.36
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst 252442577.95 ns (784528.2250000089) 290392124.5 ns (805332.5499999821) 0.87
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst throughput 249197090 bytes 183010921 bytes 1.36
vortex:parquet-zstd size/TPC-H l_comment chunked without fsst 4.607462455357942 ratio 3.2153398378514733 ratio 1.43
vortex:raw size/TPC-H l_comment chunked without fsst 1.0527068995869895 ratio 1.0003423566181604 ratio 1.05
vortex size/TPC-H l_comment chunked without fsst 262331496 bytes 183073576 bytes 1.43
compress time/TPC-H l_comment chunked 926223400 ns (2455767.5474999547) 856190447 ns (2455851.2849999666) 1.08
compress time/TPC-H l_comment chunked throughput 249197090 bytes 183010921 bytes 1.36
parquet_rs-zstd compress time/TPC-H l_comment chunked 917565649.2 ns (1076415.2262499928) 903042083.1 ns (3640622.9850000143) 1.02
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 249197090 bytes 183010921 bytes 1.36
decompress time/TPC-H l_comment chunked 132302766.70575397 ns (447520.2361840308) 106149572.51742063 ns (755850.3878769875) 1.25
decompress time/TPC-H l_comment chunked throughput 249197090 bytes 183010921 bytes 1.36
parquet_rs-zstd decompress time/TPC-H l_comment chunked 250874699.3 ns (584755.5174999982) 291488322.95 ns (876721.4006249905) 0.86
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 249197090 bytes 183010921 bytes 1.36
vortex:parquet-zstd size/TPC-H l_comment chunked 1.347931984244827 ratio 1.3549358313637339 ratio 0.99
vortex:raw size/TPC-H l_comment chunked 0.30797370868175067 ratio 0.42154166307922136 ratio 0.73
vortex size/TPC-H l_comment chunked 76746152 bytes 77146728 bytes 0.99
compress time/TPC-H l_comment canonical 919916534.55 ns (1006700.4250000119) 865583731.05 ns (1401594.125) 1.06
compress time/TPC-H l_comment canonical throughput 249197106 bytes 183010937 bytes 1.36
parquet_rs-zstd compress time/TPC-H l_comment canonical 910357936.05 ns (2426499.088750005) 910724776.25 ns (3818331.675000012) 1.00
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 249197106 bytes 183010937 bytes 1.36
decompress time/TPC-H l_comment canonical 131890653.7084656 ns (674572.5784120411) 103944936.02511907 ns (905152.047263898) 1.27
decompress time/TPC-H l_comment canonical throughput 249197106 bytes 183010937 bytes 1.36
parquet_rs-zstd decompress time/TPC-H l_comment canonical 252424689.66450396 ns (662829.5042877197) 288763842.7907143 ns (643178.1422649026) 0.87
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 249197106 bytes 183010937 bytes 1.36
vortex:parquet-zstd size/TPC-H l_comment canonical 1.34793600890824 ratio 1.3549369260209065 ratio 0.99
vortex:raw size/TPC-H l_comment canonical 0.30797368890792814 ratio 0.42154162622532226 ratio 0.73
vortex size/TPC-H l_comment canonical 76746152 bytes 77146728 bytes 0.99

This comment was automatically generated by workflow using github-action-benchmark.

@a10y
Contributor Author

a10y commented Oct 23, 2024

FWIW I see similar levels of large +/- in runs on my laptop for develop as well

image

@a10y a10y merged commit 3277e18 into develop Oct 23, 2024
10 checks passed
@a10y a10y deleted the aduffy/non-prim-views branch October 23, 2024 19:25
@robert3005
Member

Great, ok. Benchmarking is hard
