[Feature] dist_params_keys and dist_sample_keys #1179

Open
wants to merge 1 commit into
base: gh/vmoens/45/base

vmoens
Contributor

@vmoens commented Jan 10, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 10, 2025
ghstack-source-id: d1e53e780132d04ddf37d613358b24467520230f
Pull Request resolved: #1179
@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jan 10, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 77.0240μs 19.5306μs 51.2018 KOps/s 51.0998 KOps/s $\color{#35bf28}+0.20\%$
test_plain_set_stack_nested 65.3610μs 20.2114μs 49.4769 KOps/s 50.3589 KOps/s $\color{#d91a1a}-1.75\%$
test_plain_set_nested_inplace 74.2380μs 21.4686μs 46.5797 KOps/s 46.1404 KOps/s $\color{#35bf28}+0.95\%$
test_plain_set_stack_nested_inplace 60.3930μs 21.7115μs 46.0585 KOps/s 46.6502 KOps/s $\color{#d91a1a}-1.27\%$
test_items 54.5210μs 4.1163μs 242.9366 KOps/s 234.2485 KOps/s $\color{#35bf28}+3.71\%$
test_items_nested 0.5375ms 0.3956ms 2.5276 KOps/s 2.4140 KOps/s $\color{#35bf28}+4.71\%$
test_items_nested_locked 0.9442ms 0.3945ms 2.5350 KOps/s 2.4254 KOps/s $\color{#35bf28}+4.52\%$
test_items_nested_leaf 0.1438ms 76.4101μs 13.0873 KOps/s 13.0330 KOps/s $\color{#35bf28}+0.42\%$
test_items_stack_nested 0.5783ms 0.3971ms 2.5183 KOps/s 2.4033 KOps/s $\color{#35bf28}+4.78\%$
test_items_stack_nested_leaf 0.1505ms 77.7467μs 12.8623 KOps/s 12.6070 KOps/s $\color{#35bf28}+2.02\%$
test_items_stack_nested_locked 0.5214ms 0.3928ms 2.5456 KOps/s 2.3518 KOps/s $\textbf{\color{#35bf28}+8.24\%}$
test_keys 30.7870μs 3.4731μs 287.9246 KOps/s 286.3286 KOps/s $\color{#35bf28}+0.56\%$
test_keys_nested 0.2635ms 0.1635ms 6.1145 KOps/s 5.9744 KOps/s $\color{#35bf28}+2.35\%$
test_keys_nested_locked 2.1645ms 0.1714ms 5.8344 KOps/s 5.8088 KOps/s $\color{#35bf28}+0.44\%$
test_keys_nested_leaf 0.2609ms 0.1460ms 6.8490 KOps/s 6.8259 KOps/s $\color{#35bf28}+0.34\%$
test_keys_stack_nested 0.2705ms 0.1646ms 6.0770 KOps/s 6.0050 KOps/s $\color{#35bf28}+1.20\%$
test_keys_stack_nested_leaf 0.2276ms 0.1450ms 6.8950 KOps/s 7.0201 KOps/s $\color{#d91a1a}-1.78\%$
test_keys_stack_nested_locked 0.3177ms 0.1717ms 5.8257 KOps/s 5.8286 KOps/s $\color{#d91a1a}-0.05\%$
test_values 9.1370μs 1.1260μs 888.0937 KOps/s 945.0227 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_values_nested 0.3671ms 65.5121μs 15.2644 KOps/s 16.1774 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_values_nested_locked 0.1247ms 62.1462μs 16.0911 KOps/s 15.9337 KOps/s $\color{#35bf28}+0.99\%$
test_values_nested_leaf 0.1476ms 71.3493μs 14.0156 KOps/s 13.8993 KOps/s $\color{#35bf28}+0.84\%$
test_values_stack_nested 0.1283ms 62.7663μs 15.9321 KOps/s 15.6357 KOps/s $\color{#35bf28}+1.90\%$
test_values_stack_nested_leaf 0.1649ms 71.3454μs 14.0163 KOps/s 14.1682 KOps/s $\color{#d91a1a}-1.07\%$
test_values_stack_nested_locked 0.1281ms 62.9489μs 15.8859 KOps/s 15.3892 KOps/s $\color{#35bf28}+3.23\%$
test_membership 31.3180μs 0.8638μs 1.1576 MOps/s 1.1760 MOps/s $\color{#d91a1a}-1.56\%$
test_membership_nested 27.6310μs 2.8896μs 346.0742 KOps/s 335.1465 KOps/s $\color{#35bf28}+3.26\%$
test_membership_nested_leaf 0.1043ms 2.8911μs 345.8884 KOps/s 331.8641 KOps/s $\color{#35bf28}+4.23\%$
test_membership_stacked_nested 29.4650μs 2.9037μs 344.3835 KOps/s 331.6313 KOps/s $\color{#35bf28}+3.85\%$
test_membership_stacked_nested_leaf 63.5880μs 2.8773μs 347.5526 KOps/s 337.3847 KOps/s $\color{#35bf28}+3.01\%$
test_membership_nested_last 36.3680μs 4.4379μs 225.3323 KOps/s 226.7926 KOps/s $\color{#d91a1a}-0.64\%$
test_membership_nested_leaf_last 30.5870μs 4.4701μs 223.7081 KOps/s 223.9099 KOps/s $\color{#d91a1a}-0.09\%$
test_membership_stacked_nested_last 50.6540μs 4.4134μs 226.5841 KOps/s 159.0741 KOps/s $\textbf{\color{#35bf28}+42.44\%}$
test_membership_stacked_nested_leaf_last 49.4920μs 4.4817μs 223.1277 KOps/s 162.0865 KOps/s $\textbf{\color{#35bf28}+37.66\%}$
test_nested_getleaf 53.9900μs 10.8734μs 91.9678 KOps/s 90.1668 KOps/s $\color{#35bf28}+2.00\%$
test_nested_get 62.2360μs 10.4157μs 96.0087 KOps/s 94.9724 KOps/s $\color{#35bf28}+1.09\%$
test_stacked_getleaf 36.2480μs 10.8775μs 91.9327 KOps/s 91.7479 KOps/s $\color{#35bf28}+0.20\%$
test_stacked_get 64.1870μs 10.0369μs 99.6326 KOps/s 96.5908 KOps/s $\color{#35bf28}+3.15\%$
test_nested_getitemleaf 33.3730μs 11.3373μs 88.2041 KOps/s 89.1791 KOps/s $\color{#d91a1a}-1.09\%$
test_nested_getitem 0.1323ms 11.0002μs 90.9074 KOps/s 92.9425 KOps/s $\color{#d91a1a}-2.19\%$
test_stacked_getitemleaf 84.4070μs 11.7769μs 84.9118 KOps/s 89.4261 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_stacked_getitem 38.5010μs 10.7728μs 92.8263 KOps/s 93.3921 KOps/s $\color{#d91a1a}-0.61\%$
test_lock_nested 4.8555ms 0.4712ms 2.1221 KOps/s 1.6904 KOps/s $\textbf{\color{#35bf28}+25.54\%}$
test_lock_stack_nested 0.7330ms 0.4334ms 2.3075 KOps/s 2.3473 KOps/s $\color{#d91a1a}-1.69\%$
test_unlock_nested 0.9661ms 0.3866ms 2.5866 KOps/s 2.5966 KOps/s $\color{#d91a1a}-0.39\%$
test_unlock_stack_nested 0.6330ms 0.3527ms 2.8356 KOps/s 2.9214 KOps/s $\color{#d91a1a}-2.94\%$
test_flatten_speed 0.2014ms 99.6332μs 10.0368 KOps/s 10.0252 KOps/s $\color{#35bf28}+0.12\%$
test_unflatten_speed 0.6659ms 0.5256ms 1.9026 KOps/s 1.9091 KOps/s $\color{#d91a1a}-0.34\%$
test_common_ops 5.8964ms 0.7744ms 1.2912 KOps/s 1.2647 KOps/s $\color{#35bf28}+2.10\%$
test_creation 62.8170μs 2.4968μs 400.5123 KOps/s 401.2795 KOps/s $\color{#d91a1a}-0.19\%$
test_creation_empty 38.8420μs 10.1092μs 98.9198 KOps/s 98.9460 KOps/s $\color{#d91a1a}-0.03\%$
test_creation_nested_1 1.4387ms 13.0956μs 76.3613 KOps/s 77.5916 KOps/s $\color{#d91a1a}-1.59\%$
test_creation_nested_2 69.0180μs 17.4206μs 57.4033 KOps/s 57.1303 KOps/s $\color{#35bf28}+0.48\%$
test_clone 0.1769ms 13.5191μs 73.9694 KOps/s 73.0654 KOps/s $\color{#35bf28}+1.24\%$
test_getitem[int] 1.0041ms 12.9001μs 77.5185 KOps/s 76.5195 KOps/s $\color{#35bf28}+1.31\%$
test_getitem[slice_int] 0.1725ms 25.0804μs 39.8718 KOps/s 39.9405 KOps/s $\color{#d91a1a}-0.17\%$
test_getitem[range] 0.3587ms 48.3783μs 20.6704 KOps/s 19.3248 KOps/s $\textbf{\color{#35bf28}+6.96\%}$
test_getitem[tuple] 0.1426ms 20.2340μs 49.4218 KOps/s 48.3110 KOps/s $\color{#35bf28}+2.30\%$
test_getitem[list] 0.4922ms 44.1019μs 22.6748 KOps/s 21.9573 KOps/s $\color{#35bf28}+3.27\%$
test_setitem_dim[int] 52.9190μs 27.3531μs 36.5589 KOps/s 38.6028 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_setitem_dim[slice_int] 0.1076ms 54.6305μs 18.3048 KOps/s 18.9883 KOps/s $\color{#d91a1a}-3.60\%$
test_setitem_dim[range] 0.1803ms 75.3264μs 13.2756 KOps/s 13.0540 KOps/s $\color{#35bf28}+1.70\%$
test_setitem_dim[tuple] 91.5600μs 42.6077μs 23.4699 KOps/s 24.0395 KOps/s $\color{#d91a1a}-2.37\%$
test_setitem 0.2032ms 20.2481μs 49.3873 KOps/s 49.5501 KOps/s $\color{#d91a1a}-0.33\%$
test_set 0.1961ms 19.5557μs 51.1361 KOps/s 51.7889 KOps/s $\color{#d91a1a}-1.26\%$
test_set_shared 4.3035ms 0.1794ms 5.5734 KOps/s 5.7468 KOps/s $\color{#d91a1a}-3.02\%$
test_update 0.2573ms 21.4982μs 46.5156 KOps/s 46.5660 KOps/s $\color{#d91a1a}-0.11\%$
test_update_nested 0.2435ms 31.6675μs 31.5781 KOps/s 30.7203 KOps/s $\color{#35bf28}+2.79\%$
test_update__nested 0.6251ms 33.5973μs 29.7643 KOps/s 28.9561 KOps/s $\color{#35bf28}+2.79\%$
test_set_nested 0.1806ms 21.9689μs 45.5189 KOps/s 44.8212 KOps/s $\color{#35bf28}+1.56\%$
test_set_nested_new 0.2489ms 26.3998μs 37.8790 KOps/s 36.9477 KOps/s $\color{#35bf28}+2.52\%$
test_select 0.2338ms 43.0620μs 23.2223 KOps/s 22.9644 KOps/s $\color{#35bf28}+1.12\%$
test_select_nested 0.1398ms 63.8804μs 15.6542 KOps/s 15.8080 KOps/s $\color{#d91a1a}-0.97\%$
test_exclude_nested 0.1704ms 82.9090μs 12.0614 KOps/s 12.1537 KOps/s $\color{#d91a1a}-0.76\%$
test_empty[True] 0.6254ms 0.4149ms 2.4100 KOps/s 2.3839 KOps/s $\color{#35bf28}+1.09\%$
test_empty[False] 14.7422μs 1.3797μs 724.7967 KOps/s 732.6409 KOps/s $\color{#d91a1a}-1.07\%$
test_unbind_speed 0.4038ms 0.2712ms 3.6873 KOps/s 3.7443 KOps/s $\color{#d91a1a}-1.52\%$
test_unbind_speed_stack0 0.4710ms 0.2687ms 3.7222 KOps/s 3.8476 KOps/s $\color{#d91a1a}-3.26\%$
test_unbind_speed_stack1 0.1296s 0.8432ms 1.1860 KOps/s 1.3621 KOps/s $\textbf{\color{#d91a1a}-12.93\%}$
test_split 1.9613ms 1.6319ms 612.7949 Ops/s 550.4267 Ops/s $\textbf{\color{#35bf28}+11.33\%}$
test_chunk 0.1257s 1.8780ms 532.4877 Ops/s 542.4279 Ops/s $\color{#d91a1a}-1.83\%$
test_consolidate_njt[False-None] 9.2872ms 8.5270ms 117.2748 Ops/s 118.8159 Ops/s $\color{#d91a1a}-1.30\%$
test_creation[device0] 4.9837ms 96.0197μs 10.4145 KOps/s 10.4479 KOps/s $\color{#d91a1a}-0.32\%$
test_creation_from_tensor 0.3604ms 97.9043μs 10.2141 KOps/s 10.4791 KOps/s $\color{#d91a1a}-2.53\%$
test_add_one[memmap_tensor0] 0.3092ms 5.1009μs 196.0445 KOps/s 205.1919 KOps/s $\color{#d91a1a}-4.46\%$
test_contiguous[memmap_tensor0] 0.1150ms 0.6189μs 1.6158 MOps/s 1.8426 MOps/s $\textbf{\color{#d91a1a}-12.31\%}$
test_stack[memmap_tensor0] 0.1032ms 3.4819μs 287.2024 KOps/s 287.9906 KOps/s $\color{#d91a1a}-0.27\%$
test_memmaptd_index 0.1291s 0.3140ms 3.1844 KOps/s 4.2836 KOps/s $\textbf{\color{#d91a1a}-25.66\%}$
test_memmaptd_index_astensor 0.6628ms 0.3198ms 3.1270 KOps/s 3.1426 KOps/s $\color{#d91a1a}-0.49\%$
test_memmaptd_index_op 1.2783ms 0.5691ms 1.7573 KOps/s 1.7707 KOps/s $\color{#d91a1a}-0.76\%$
test_serialize_model 0.1346s 0.1240s 8.0661 Ops/s 8.4290 Ops/s $\color{#d91a1a}-4.31\%$
test_serialize_model_pickle 0.4469s 0.3962s 2.5242 Ops/s 2.5385 Ops/s $\color{#d91a1a}-0.56\%$
test_serialize_weights 0.1382s 0.1193s 8.3801 Ops/s 7.1432 Ops/s $\textbf{\color{#35bf28}+17.32\%}$
test_serialize_weights_returnearly 0.2868s 0.1832s 5.4588 Ops/s 6.2388 Ops/s $\textbf{\color{#d91a1a}-12.50\%}$
test_serialize_weights_pickle 1.2928s 0.7486s 1.3358 Ops/s 2.4539 Ops/s $\textbf{\color{#d91a1a}-45.57\%}$
test_serialize_weights_filesystem 0.1593s 0.1503s 6.6520 Ops/s 6.5927 Ops/s $\color{#35bf28}+0.90\%$
test_serialize_model_filesystem 0.1770s 0.1543s 6.4828 Ops/s 6.4052 Ops/s $\color{#35bf28}+1.21\%$
test_reshape_pytree 58.8700μs 26.1471μs 38.2452 KOps/s 37.6735 KOps/s $\color{#35bf28}+1.52\%$
test_reshape_td 87.2820μs 32.5149μs 30.7551 KOps/s 29.2007 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_view_pytree 75.8510μs 26.0114μs 38.4447 KOps/s 38.1571 KOps/s $\color{#35bf28}+0.75\%$
test_view_td 0.1462ms 37.0951μs 26.9577 KOps/s 25.9047 KOps/s $\color{#35bf28}+4.06\%$
test_unbind_pytree 69.7400μs 29.5432μs 33.8488 KOps/s 34.0868 KOps/s $\color{#d91a1a}-0.70\%$
test_unbind_td 0.3984ms 39.9484μs 25.0323 KOps/s 23.3058 KOps/s $\textbf{\color{#35bf28}+7.41\%}$
test_split_pytree 95.3670μs 29.5532μs 33.8373 KOps/s 33.5123 KOps/s $\color{#35bf28}+0.97\%$
test_split_td 0.5750ms 46.1988μs 21.6456 KOps/s 20.8847 KOps/s $\color{#35bf28}+3.64\%$
test_add_pytree 92.8130μs 35.1319μs 28.4642 KOps/s 28.2182 KOps/s $\color{#35bf28}+0.87\%$
test_add_td 0.1712ms 57.1227μs 17.5062 KOps/s 18.8279 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_compile_add_one_nested[tensordict-compile] 0.1465ms 62.2350μs 16.0681 KOps/s 15.1416 KOps/s $\textbf{\color{#35bf28}+6.12\%}$
test_compile_add_one_nested[tensordict-eager] 0.4417ms 0.1722ms 5.8070 KOps/s 5.6795 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_nested[pytree-compile] 0.1273ms 46.7733μs 21.3797 KOps/s 21.2581 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_nested[pytree-eager] 0.4624ms 0.1181ms 8.4684 KOps/s 8.5602 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_copy_nested[tensordict-compile] 71.6230μs 25.9205μs 38.5794 KOps/s 37.2353 KOps/s $\color{#35bf28}+3.61\%$
test_compile_copy_nested[tensordict-eager] 0.1405ms 59.2637μs 16.8737 KOps/s 16.8215 KOps/s $\color{#35bf28}+0.31\%$
test_compile_copy_nested[pytree-compile] 0.1532ms 76.1548μs 13.1312 KOps/s 12.9451 KOps/s $\color{#35bf28}+1.44\%$
test_compile_copy_nested[pytree-eager] 0.1253ms 65.6723μs 15.2271 KOps/s 14.9714 KOps/s $\color{#35bf28}+1.71\%$
test_compile_add_one_flat[tensordict-compile] 0.1885ms 0.1033ms 9.6804 KOps/s 9.0719 KOps/s $\textbf{\color{#35bf28}+6.71\%}$
test_compile_add_one_flat[tensordict-eager] 1.5166ms 0.2158ms 4.6347 KOps/s 4.5262 KOps/s $\color{#35bf28}+2.40\%$
test_compile_add_one_flat[tensorclass-compile] 0.1465ms 45.2231μs 22.1126 KOps/s 21.4195 KOps/s $\color{#35bf28}+3.24\%$
test_compile_add_one_flat[tensorclass-eager] 0.5627ms 67.9997μs 14.7059 KOps/s 14.5617 KOps/s $\color{#35bf28}+0.99\%$
test_compile_add_one_flat[pytree-compile] 0.1934ms 0.1030ms 9.7096 KOps/s 9.5241 KOps/s $\color{#35bf28}+1.95\%$
test_compile_add_one_flat[pytree-eager] 0.6404ms 0.2047ms 4.8841 KOps/s 5.0169 KOps/s $\color{#d91a1a}-2.65\%$
test_compile_add_self_flat[tensordict-eager] 0.3533ms 0.2301ms 4.3455 KOps/s 4.2060 KOps/s $\color{#35bf28}+3.31\%$
test_compile_add_self_flat[tensordict-compile] 0.2345ms 0.1078ms 9.2762 KOps/s 9.0807 KOps/s $\color{#35bf28}+2.15\%$
test_compile_add_self_flat[tensorclass-eager] 0.1979ms 64.3823μs 15.5322 KOps/s 15.4680 KOps/s $\color{#35bf28}+0.41\%$
test_compile_add_self_flat[tensorclass-compile] 0.1363ms 47.2324μs 21.1719 KOps/s 20.5152 KOps/s $\color{#35bf28}+3.20\%$
test_compile_add_self_flat[pytree-eager] 0.2958ms 0.1594ms 6.2721 KOps/s 6.3154 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_self_flat[pytree-compile] 0.2560ms 0.1053ms 9.4966 KOps/s 9.5110 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_copy_flat[tensordict-compile] 72.0640μs 21.6352μs 46.2210 KOps/s 45.5809 KOps/s $\color{#35bf28}+1.40\%$
test_compile_copy_flat[tensordict-eager] 0.1581ms 68.4247μs 14.6146 KOps/s 14.8756 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_copy_flat[pytree-compile] 0.1526ms 77.9562μs 12.8277 KOps/s 12.7591 KOps/s $\color{#35bf28}+0.54\%$
test_compile_copy_flat[pytree-eager] 0.1561ms 67.7048μs 14.7700 KOps/s 14.8977 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_assign_and_add[tensordict-compile] 0.3334ms 0.2127ms 4.7004 KOps/s 4.5570 KOps/s $\color{#35bf28}+3.15\%$
test_compile_assign_and_add[tensordict-eager] 1.7520ms 1.3537ms 738.7425 Ops/s 744.4931 Ops/s $\color{#d91a1a}-0.77\%$
test_compile_assign_and_add[pytree-compile] 0.4083ms 0.2097ms 4.7691 KOps/s 4.7086 KOps/s $\color{#35bf28}+1.28\%$
test_compile_assign_and_add[pytree-eager] 2.3328ms 0.8032ms 1.2450 KOps/s 1.2943 KOps/s $\color{#d91a1a}-3.80\%$
test_compile_assign_and_add_stack[compile] 0.5996ms 0.4608ms 2.1702 KOps/s 2.1537 KOps/s $\color{#35bf28}+0.77\%$
test_compile_assign_and_add_stack[eager] 3.4484ms 2.5970ms 385.0622 Ops/s 375.1691 Ops/s $\color{#35bf28}+2.64\%$
test_compile_indexing[tensor-tensordict-compile] 0.1162ms 36.5463μs 27.3626 KOps/s 26.7914 KOps/s $\color{#35bf28}+2.13\%$
test_compile_indexing[tensor-tensordict-eager] 0.6852ms 32.3442μs 30.9174 KOps/s 28.9693 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.2570ms 30.4217μs 32.8712 KOps/s 33.1374 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1102ms 22.7118μs 44.0299 KOps/s 42.2884 KOps/s $\color{#35bf28}+4.12\%$
test_compile_indexing[tensor-pytree-compile] 84.5170μs 30.4660μs 32.8235 KOps/s 31.9026 KOps/s $\color{#35bf28}+2.89\%$
test_compile_indexing[tensor-pytree-eager] 66.0430μs 22.5887μs 44.2699 KOps/s 43.3258 KOps/s $\color{#35bf28}+2.18\%$
test_compile_indexing[slice-tensordict-compile] 0.1430ms 52.4413μs 19.0689 KOps/s 18.7381 KOps/s $\color{#35bf28}+1.77\%$
test_compile_indexing[slice-tensordict-eager] 0.6374ms 20.5529μs 48.6549 KOps/s 47.2549 KOps/s $\color{#35bf28}+2.96\%$
test_compile_indexing[slice-tensorclass-compile] 0.1041ms 45.0178μs 22.2135 KOps/s 21.0654 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_compile_indexing[slice-tensorclass-eager] 93.7450μs 18.4732μs 54.1324 KOps/s 54.0884 KOps/s $\color{#35bf28}+0.08\%$
test_compile_indexing[slice-pytree-compile] 0.1381ms 46.4916μs 21.5093 KOps/s 21.1938 KOps/s $\color{#35bf28}+1.49\%$
test_compile_indexing[slice-pytree-eager] 56.6450μs 18.3881μs 54.3829 KOps/s 54.6263 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_indexing[int-tensordict-compile] 0.1512ms 54.3348μs 18.4044 KOps/s 18.6327 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[int-tensordict-eager] 1.0410ms 20.6122μs 48.5151 KOps/s 47.1515 KOps/s $\color{#35bf28}+2.89\%$
test_compile_indexing[int-tensorclass-compile] 0.1023ms 45.7696μs 21.8486 KOps/s 21.7980 KOps/s $\color{#35bf28}+0.23\%$
test_compile_indexing[int-tensorclass-eager] 89.1060μs 18.2619μs 54.7588 KOps/s 54.2763 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[int-pytree-compile] 0.1325ms 45.5128μs 21.9718 KOps/s 21.4105 KOps/s $\color{#35bf28}+2.62\%$
test_compile_indexing[int-pytree-eager] 78.5660μs 18.2810μs 54.7016 KOps/s 54.7194 KOps/s $\color{#d91a1a}-0.03\%$
test_mod_add[eager] 0.1202ms 34.8663μs 28.6810 KOps/s 28.9083 KOps/s $\color{#d91a1a}-0.79\%$
test_mod_add[compile] 0.1085ms 49.1038μs 20.3650 KOps/s 20.1813 KOps/s $\color{#35bf28}+0.91\%$
test_mod_add[compile-overhead] 0.1088ms 48.0203μs 20.8245 KOps/s 19.9876 KOps/s $\color{#35bf28}+4.19\%$
test_mod_wrap[eager] 0.4448ms 0.2266ms 4.4130 KOps/s 4.3508 KOps/s $\color{#35bf28}+1.43\%$
test_mod_wrap[compile] 0.3974ms 0.2107ms 4.7465 KOps/s 4.6969 KOps/s $\color{#35bf28}+1.06\%$
test_mod_wrap[compile-overhead] 0.3657ms 0.2069ms 4.8342 KOps/s 4.7953 KOps/s $\color{#35bf28}+0.81\%$
test_mod_wrap_and_backward[eager] 15.7376ms 11.8328ms 84.5105 Ops/s 80.1187 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_mod_wrap_and_backward[compile] 16.5967ms 12.3210ms 81.1619 Ops/s 74.9490 Ops/s $\textbf{\color{#35bf28}+8.29\%}$
test_mod_wrap_and_backward[compile-overhead] 18.1805ms 13.5701ms 73.6914 Ops/s 71.4221 Ops/s $\color{#35bf28}+3.18\%$
test_seq_add[eager] 0.2226ms 0.1136ms 8.8032 KOps/s 8.1771 KOps/s $\textbf{\color{#35bf28}+7.66\%}$
test_seq_add[compile] 0.1440ms 64.8232μs 15.4266 KOps/s 15.5829 KOps/s $\color{#d91a1a}-1.00\%$
test_seq_add[compile-overhead] 0.4492ms 64.2007μs 15.5761 KOps/s 15.8595 KOps/s $\color{#d91a1a}-1.79\%$
test_seq_wrap[eager] 0.6872ms 0.4469ms 2.2374 KOps/s 2.1792 KOps/s $\color{#35bf28}+2.67\%$
test_seq_wrap[compile] 0.3346ms 0.2309ms 4.3314 KOps/s 4.2717 KOps/s $\color{#35bf28}+1.40\%$
test_seq_wrap[compile-overhead] 0.3199ms 0.2266ms 4.4126 KOps/s 4.3157 KOps/s $\color{#35bf28}+2.24\%$
test_func_call_runtime[False-eager] 1.2083ms 0.5510ms 1.8148 KOps/s 1.7657 KOps/s $\color{#35bf28}+2.78\%$
test_func_call_runtime[False-compile] 0.5883ms 0.4234ms 2.3618 KOps/s 2.3261 KOps/s $\color{#35bf28}+1.54\%$
test_func_call_runtime[False-compile-overhead] 1.0161ms 0.4261ms 2.3471 KOps/s 2.3181 KOps/s $\color{#35bf28}+1.25\%$
test_func_call_runtime[True-eager] 1.2699ms 0.7567ms 1.3216 KOps/s 1.2870 KOps/s $\color{#35bf28}+2.69\%$
test_func_call_runtime[True-compile] 0.6089ms 0.4634ms 2.1581 KOps/s 2.1275 KOps/s $\color{#35bf28}+1.44\%$
test_func_call_runtime[True-compile-overhead] 0.6192ms 0.4670ms 2.1412 KOps/s 2.1124 KOps/s $\color{#35bf28}+1.36\%$
test_func_call_cm_runtime[False-eager] 0.7765ms 0.5395ms 1.8534 KOps/s 1.8098 KOps/s $\color{#35bf28}+2.41\%$
test_func_call_cm_runtime[False-compile] 0.5760ms 0.4232ms 2.3631 KOps/s 2.2618 KOps/s $\color{#35bf28}+4.48\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8754ms 0.4318ms 2.3156 KOps/s 2.3105 KOps/s $\color{#35bf28}+0.22\%$
test_func_call_cm_runtime[True-eager] 1.4646ms 0.9053ms 1.1046 KOps/s 1.0678 KOps/s $\color{#35bf28}+3.45\%$
test_func_call_cm_runtime[True-compile] 0.8647ms 0.4944ms 2.0225 KOps/s 1.9986 KOps/s $\color{#35bf28}+1.19\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6765ms 0.4914ms 2.0348 KOps/s 2.0073 KOps/s $\color{#35bf28}+1.37\%$
test_vmap_func_call_cm_runtime[eager] 2.6199ms 1.9101ms 523.5245 Ops/s 493.2721 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_vmap_func_call_cm_runtime[compile] 0.8988ms 0.5224ms 1.9144 KOps/s 1.9030 KOps/s $\color{#35bf28}+0.60\%$
test_vmap_func_call_cm_runtime[compile-overhead] 1.7922ms 0.5426ms 1.8430 KOps/s 1.8902 KOps/s $\color{#d91a1a}-2.50\%$
test_distributed 0.2773ms 0.1244ms 8.0357 KOps/s 7.6206 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_tdmodule 59.5210μs 25.3100μs 39.5100 KOps/s 37.6781 KOps/s $\color{#35bf28}+4.86\%$
test_tdmodule_dispatch 71.8840μs 45.8435μs 21.8134 KOps/s 21.3671 KOps/s $\color{#35bf28}+2.09\%$
test_tdseq 53.6900μs 27.0292μs 36.9971 KOps/s 34.0959 KOps/s $\textbf{\color{#35bf28}+8.51\%}$
test_tdseq_dispatch 0.2536ms 56.1094μs 17.8223 KOps/s 18.8315 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_instantiation_functorch 1.6646ms 1.5185ms 658.5397 Ops/s 643.2113 Ops/s $\color{#35bf28}+2.38\%$
test_exec_functorch 0.2725ms 0.1769ms 5.6516 KOps/s 5.5839 KOps/s $\color{#35bf28}+1.21\%$
test_exec_functional_call 0.4292ms 0.1737ms 5.7561 KOps/s 5.7074 KOps/s $\color{#35bf28}+0.85\%$
test_exec_td_decorator 0.5098ms 0.2305ms 4.3381 KOps/s 4.1929 KOps/s $\color{#35bf28}+3.46\%$
test_vmap_mlp_speed_decorator[True-True] 1.0263ms 0.6430ms 1.5553 KOps/s 1.4897 KOps/s $\color{#35bf28}+4.41\%$
test_vmap_mlp_speed_decorator[True-False] 1.2823ms 0.6458ms 1.5485 KOps/s 1.5092 KOps/s $\color{#35bf28}+2.60\%$
test_vmap_mlp_speed_decorator[False-True] 0.7721ms 0.5220ms 1.9157 KOps/s 1.8646 KOps/s $\color{#35bf28}+2.74\%$
test_vmap_mlp_speed_decorator[False-False] 0.9350ms 0.5258ms 1.9018 KOps/s 1.8705 KOps/s $\color{#35bf28}+1.67\%$
test_to_module_speed[True] 3.4624ms 1.3789ms 725.2149 Ops/s 739.3728 Ops/s $\color{#d91a1a}-1.91\%$
test_to_module_speed[False] 2.4883ms 1.3227ms 756.0391 Ops/s 772.1770 Ops/s $\color{#d91a1a}-2.09\%$
test_tc_init 80.9600μs 44.0899μs 22.6809 KOps/s 22.8421 KOps/s $\color{#d91a1a}-0.71\%$
test_tc_init_nested 0.1681ms 87.5027μs 11.4282 KOps/s 11.1830 KOps/s $\color{#35bf28}+2.19\%$
test_tc_first_layer_tensor 26.3290μs 1.5597μs 641.1519 KOps/s 647.3472 KOps/s $\color{#d91a1a}-0.96\%$
test_tc_first_layer_nontensor 23.9240μs 4.9252μs 203.0394 KOps/s 210.9157 KOps/s $\color{#d91a1a}-3.73\%$
test_tc_second_layer_tensor 56.7780μs 2.9520μs 338.7567 KOps/s 346.7443 KOps/s $\color{#d91a1a}-2.30\%$
test_tc_second_layer_nontensor 40.1540μs 6.2022μs 161.2319 KOps/s 164.7776 KOps/s $\color{#d91a1a}-2.15\%$
test_unbind 0.2596s 14.8540ms 67.3218 Ops/s 76.9575 Ops/s $\textbf{\color{#d91a1a}-12.52\%}$
test_full_like 22.3542ms 15.5618ms 64.2600 Ops/s 104.2200 Ops/s $\textbf{\color{#d91a1a}-38.34\%}$
test_zeros_like 13.6061ms 8.7499ms 114.2876 Ops/s 261.5769 Ops/s $\textbf{\color{#d91a1a}-56.31\%}$
test_ones_like 18.3728ms 8.4506ms 118.3350 Ops/s 237.3582 Ops/s $\textbf{\color{#d91a1a}-50.14\%}$
test_clone 15.3638ms 11.0814ms 90.2416 Ops/s 154.3623 Ops/s $\textbf{\color{#d91a1a}-41.54\%}$
test_squeeze 75.5410μs 12.4341μs 80.4243 KOps/s 80.1478 KOps/s $\color{#35bf28}+0.34\%$
test_unsqueeze 0.3576ms 92.3933μs 10.8233 KOps/s 10.5880 KOps/s $\color{#35bf28}+2.22\%$
test_split 0.3534ms 0.1972ms 5.0717 KOps/s 5.0724 KOps/s $\color{#d91a1a}-0.01\%$
test_permute 0.4059ms 0.2028ms 4.9310 KOps/s 4.8591 KOps/s $\color{#35bf28}+1.48\%$
test_stack 34.7807ms 29.9564ms 33.3819 Ops/s 34.6802 Ops/s $\color{#d91a1a}-3.74\%$
test_cat 37.0327ms 29.6439ms 33.7338 Ops/s 33.2671 Ops/s $\color{#35bf28}+1.40\%$
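
A note on reading the table above: the columns look consistent with Ops being the reciprocal of Mean and Change being the relative difference between Ops and Ops on Repo HEAD. The bold entries, and the Improved/Worsened totals, seem to correspond to changes of at least 5% in either direction. The sketch below reproduces that arithmetic for a few rows. The function names and the 5% threshold are assumptions inferred from the numbers, not taken from the benchmark bot's code.

```python
# Minimal sketch of how the benchmark columns appear to relate.
# All names here are hypothetical; the 5% threshold is inferred from
# which rows are bold in the table, not from the bot's source.

def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR's throughput versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row the way the summary counts seem to (assumed threshold)."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# test_plain_set_nested (CPU): Mean 19.5306 us, 51.2018 vs 51.0998 KOps/s
kops = ops_per_second(19.5306e-6) / 1e3
change_small = percent_change(51.2018, 51.0998)

# test_membership_stacked_nested_last (CPU): 226.5841 vs 159.0741 KOps/s
change_large = percent_change(226.5841, 159.0741)

print(f"{kops:.2f} KOps/s, {change_small:+.2f}%, {classify(change_small)}")
print(f"{change_large:+.2f}%, {classify(change_large)}")
```

Under this reading, a row such as test_values (CPU, -6.02%) counts as worsened, while rows within a few percent of HEAD fall inside the noise band and are not counted in either total.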

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}57$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 31.8510μs 11.3157μs 88.3725 KOps/s 75.3863 KOps/s $\textbf{\color{#35bf28}+17.23\%}$
test_plain_set_stack_nested 30.4920μs 11.4672μs 87.2056 KOps/s 74.5022 KOps/s $\textbf{\color{#35bf28}+17.05\%}$
test_plain_set_nested_inplace 39.2120μs 12.3749μs 80.8085 KOps/s 69.7996 KOps/s $\textbf{\color{#35bf28}+15.77\%}$
test_plain_set_stack_nested_inplace 41.9220μs 12.4228μs 80.4972 KOps/s 69.8702 KOps/s $\textbf{\color{#35bf28}+15.21\%}$
test_items 29.6110μs 2.8938μs 345.5625 KOps/s 340.8484 KOps/s $\color{#35bf28}+1.38\%$
test_items_nested 0.4754ms 0.3698ms 2.7040 KOps/s 2.7919 KOps/s $\color{#d91a1a}-3.15\%$
test_items_nested_locked 0.5514ms 0.3664ms 2.7295 KOps/s 2.7758 KOps/s $\color{#d91a1a}-1.67\%$
test_items_nested_leaf 80.6740μs 58.6082μs 17.0625 KOps/s 17.2462 KOps/s $\color{#d91a1a}-1.07\%$
test_items_stack_nested 0.5432ms 0.3694ms 2.7068 KOps/s 2.7519 KOps/s $\color{#d91a1a}-1.64\%$
test_items_stack_nested_leaf 0.1421ms 59.5786μs 16.7845 KOps/s 16.6061 KOps/s $\color{#35bf28}+1.07\%$
test_items_stack_nested_locked 0.4272ms 0.3732ms 2.6798 KOps/s 2.7569 KOps/s $\color{#d91a1a}-2.80\%$
test_keys 31.7410μs 3.4551μs 289.4290 KOps/s 289.6850 KOps/s $\color{#d91a1a}-0.09\%$
test_keys_nested 0.1880ms 89.7302μs 11.1445 KOps/s 11.2424 KOps/s $\color{#d91a1a}-0.87\%$
test_keys_nested_locked 0.7197ms 95.4723μs 10.4742 KOps/s 10.6137 KOps/s $\color{#d91a1a}-1.31\%$
test_keys_nested_leaf 0.1039ms 80.1503μs 12.4766 KOps/s 12.6067 KOps/s $\color{#d91a1a}-1.03\%$
test_keys_stack_nested 0.1210ms 89.7763μs 11.1388 KOps/s 11.0615 KOps/s $\color{#35bf28}+0.70\%$
test_keys_stack_nested_leaf 0.1120ms 81.5276μs 12.2658 KOps/s 12.1811 KOps/s $\color{#35bf28}+0.70\%$
test_keys_stack_nested_locked 0.1401ms 96.1949μs 10.3956 KOps/s 10.4405 KOps/s $\color{#d91a1a}-0.43\%$
test_values 4.8068μs 0.8510μs 1.1750 MOps/s 1.1715 MOps/s $\color{#35bf28}+0.30\%$
test_values_nested 62.0330μs 38.4759μs 25.9903 KOps/s 26.3826 KOps/s $\color{#d91a1a}-1.49\%$
test_values_nested_locked 68.9630μs 39.7072μs 25.1843 KOps/s 25.3937 KOps/s $\color{#d91a1a}-0.82\%$
test_values_nested_leaf 65.5930μs 42.7004μs 23.4190 KOps/s 23.4367 KOps/s $\color{#d91a1a}-0.08\%$
test_values_stack_nested 0.1141ms 38.7264μs 25.8222 KOps/s 25.8768 KOps/s $\color{#d91a1a}-0.21\%$
test_values_stack_nested_leaf 71.7030μs 43.4220μs 23.0298 KOps/s 23.1424 KOps/s $\color{#d91a1a}-0.49\%$
test_values_stack_nested_locked 74.1340μs 40.2220μs 24.8620 KOps/s 25.0413 KOps/s $\color{#d91a1a}-0.72\%$
test_membership 1.8411μs 0.5050μs 1.9802 MOps/s 1.9508 MOps/s $\color{#35bf28}+1.50\%$
test_membership_nested 82.2335μs 2.0124μs 496.9107 KOps/s 485.7523 KOps/s $\color{#35bf28}+2.30\%$
test_membership_nested_leaf 91.7040μs 1.9863μs 503.4440 KOps/s 493.6802 KOps/s $\color{#35bf28}+1.98\%$
test_membership_stacked_nested 0.1246ms 2.0631μs 484.7190 KOps/s 483.9719 KOps/s $\color{#35bf28}+0.15\%$
test_membership_stacked_nested_leaf 32.2420μs 2.0709μs 482.8926 KOps/s 480.9361 KOps/s $\color{#35bf28}+0.41\%$
test_membership_nested_last 87.9040μs 3.0943μs 323.1793 KOps/s 320.5690 KOps/s $\color{#35bf28}+0.81\%$
test_membership_nested_leaf_last 32.9810μs 3.0468μs 328.2120 KOps/s 319.6751 KOps/s $\color{#35bf28}+2.67\%$
test_membership_stacked_nested_last 32.4220μs 3.0706μs 325.6700 KOps/s 284.3074 KOps/s $\textbf{\color{#35bf28}+14.55\%}$
test_membership_stacked_nested_leaf_last 27.2310μs 3.0975μs 322.8412 KOps/s 279.7878 KOps/s $\textbf{\color{#35bf28}+15.39\%}$
test_nested_getleaf 31.7310μs 6.1847μs 161.6900 KOps/s 161.9635 KOps/s $\color{#d91a1a}-0.17\%$
test_nested_get 34.5320μs 5.8895μs 169.7933 KOps/s 167.6757 KOps/s $\color{#35bf28}+1.26\%$
test_stacked_getleaf 35.9620μs 6.2019μs 161.2399 KOps/s 161.9406 KOps/s $\color{#d91a1a}-0.43\%$
test_stacked_get 25.7810μs 5.8477μs 171.0087 KOps/s 169.8844 KOps/s $\color{#35bf28}+0.66\%$
test_nested_getitemleaf 30.3810μs 6.5107μs 153.5935 KOps/s 154.7012 KOps/s $\color{#d91a1a}-0.72\%$
test_nested_getitem 38.2110μs 6.1740μs 161.9704 KOps/s 159.9489 KOps/s $\color{#35bf28}+1.26\%$
test_stacked_getitemleaf 35.9310μs 6.4927μs 154.0197 KOps/s 154.4529 KOps/s $\color{#d91a1a}-0.28\%$
test_stacked_getitem 31.7910μs 6.2033μs 161.2033 KOps/s 161.7786 KOps/s $\color{#d91a1a}-0.36\%$
test_lock_nested 9.7246ms 0.3812ms 2.6232 KOps/s 2.5678 KOps/s $\color{#35bf28}+2.16\%$
test_lock_stack_nested 0.4272ms 0.3443ms 2.9045 KOps/s 2.8380 KOps/s $\color{#35bf28}+2.34\%$
test_unlock_nested 0.6141ms 0.3139ms 3.1856 KOps/s 3.0811 KOps/s $\color{#35bf28}+3.39\%$
test_unlock_stack_nested 0.3669ms 0.2821ms 3.5451 KOps/s 3.4292 KOps/s $\color{#35bf28}+3.38\%$
test_flatten_speed 0.1096ms 74.5580μs 13.4124 KOps/s 13.2903 KOps/s $\color{#35bf28}+0.92\%$
test_unflatten_speed 0.5019ms 0.3182ms 3.1428 KOps/s 3.1086 KOps/s $\color{#35bf28}+1.10\%$
test_common_ops 1.6677ms 0.5759ms 1.7365 KOps/s 1.5031 KOps/s $\textbf{\color{#35bf28}+15.52\%}$
test_creation 0.1031ms 1.7934μs 557.6025 KOps/s 563.5323 KOps/s $\color{#d91a1a}-1.05\%$
test_creation_empty 42.5220μs 6.4736μs 154.4737 KOps/s 98.8154 KOps/s $\textbf{\color{#35bf28}+56.33\%}$
test_creation_nested_1 33.9220μs 8.2309μs 121.4927 KOps/s 85.1542 KOps/s $\textbf{\color{#35bf28}+42.67\%}$
test_creation_nested_2 37.8520μs 10.9119μs 91.6429 KOps/s 69.2482 KOps/s $\textbf{\color{#35bf28}+32.34\%}$
test_clone 0.1307ms 10.5735μs 94.5761 KOps/s 84.9186 KOps/s $\textbf{\color{#35bf28}+11.37\%}$
test_getitem[int] 1.6507ms 10.7455μs 93.0624 KOps/s 88.5935 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_getitem[slice_int] 0.2272ms 20.7417μs 48.2120 KOps/s 45.3889 KOps/s $\textbf{\color{#35bf28}+6.22\%}$
test_getitem[range] 0.1636ms 38.0423μs 26.2865 KOps/s 25.1052 KOps/s $\color{#35bf28}+4.71\%$
test_getitem[tuple] 0.1145ms 18.1341μs 55.1449 KOps/s 52.5374 KOps/s $\color{#35bf28}+4.96\%$
test_getitem[list] 0.2434ms 33.0416μs 30.2649 KOps/s 28.7907 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_setitem_dim[int] 40.1420μs 19.2126μs 52.0493 KOps/s 48.6195 KOps/s $\textbf{\color{#35bf28}+7.05\%}$
test_setitem_dim[slice_int] 0.1688ms 38.4552μs 26.0043 KOps/s 25.2913 KOps/s $\color{#35bf28}+2.82\%$
test_setitem_dim[range] 89.9140μs 53.0266μs 18.8585 KOps/s 18.2555 KOps/s $\color{#35bf28}+3.30\%$
test_setitem_dim[tuple] 60.0120μs 32.4620μs 30.8053 KOps/s 28.8669 KOps/s $\textbf{\color{#35bf28}+6.71\%}$
test_setitem 86.6740μs 14.0773μs 71.0365 KOps/s 57.3608 KOps/s $\textbf{\color{#35bf28}+23.84\%}$
test_set 88.2340μs 13.5182μs 73.9744 KOps/s 59.4278 KOps/s $\textbf{\color{#35bf28}+24.48\%}$
test_set_shared 1.6510ms 0.1531ms 6.5323 KOps/s 6.4955 KOps/s $\color{#35bf28}+0.57\%$
test_update 0.9476ms 15.5148μs 64.4546 KOps/s 48.3077 KOps/s $\textbf{\color{#35bf28}+33.43\%}$
test_update_nested 97.8340μs 20.6720μs 48.3747 KOps/s 38.3572 KOps/s $\textbf{\color{#35bf28}+26.12\%}$
test_update__nested 1.5303ms 26.3033μs 38.0180 KOps/s 36.7945 KOps/s $\color{#35bf28}+3.33\%$
test_set_nested 0.1252ms 15.0658μs 66.3753 KOps/s 55.7674 KOps/s $\textbf{\color{#35bf28}+19.02\%}$
test_set_nested_new 0.1012ms 17.2572μs 57.9467 KOps/s 49.0732 KOps/s $\textbf{\color{#35bf28}+18.08\%}$
test_select 0.1050ms 29.0643μs 34.4065 KOps/s 30.4239 KOps/s $\textbf{\color{#35bf28}+13.09\%}$
test_select_nested 73.4530μs 43.8608μs 22.7994 KOps/s 22.6793 KOps/s $\color{#35bf28}+0.53\%$
test_exclude_nested 0.1020ms 63.5859μs 15.7267 KOps/s 15.6587 KOps/s $\color{#35bf28}+0.43\%$
test_empty[True] 0.4155ms 0.2988ms 3.3468 KOps/s 3.3449 KOps/s $\color{#35bf28}+0.06\%$
test_empty[False] 3.3881μs 0.8384μs 1.1928 MOps/s 1.2207 MOps/s $\color{#d91a1a}-2.29\%$
test_to 87.5840μs 56.6539μs 17.6510 KOps/s 18.1212 KOps/s $\color{#d91a1a}-2.59\%$
test_to_nonblocking 0.1946ms 48.0572μs 20.8086 KOps/s 20.9410 KOps/s $\color{#d91a1a}-0.63\%$
test_unbind_speed 0.8098ms 0.2306ms 4.3357 KOps/s 4.0919 KOps/s $\textbf{\color{#35bf28}+5.96\%}$
test_unbind_speed_stack0 0.3571ms 0.2365ms 4.2285 KOps/s 4.0284 KOps/s $\color{#35bf28}+4.97\%$
test_unbind_speed_stack1 94.8264ms 0.6691ms 1.4945 KOps/s 1.4628 KOps/s $\color{#35bf28}+2.16\%$
test_split 96.4935ms 1.5858ms 630.6122 Ops/s 607.3719 Ops/s $\color{#35bf28}+3.83\%$
test_chunk 97.6251ms 1.5901ms 628.8998 Ops/s 607.6914 Ops/s $\color{#35bf28}+3.49\%$
test_consolidate[False-None] 0.1003s 2.9628ms 337.5179 Ops/s 334.4361 Ops/s $\color{#35bf28}+0.92\%$
test_consolidate[default-None] 1.8300ms 1.6968ms 589.3493 Ops/s 563.9114 Ops/s $\color{#35bf28}+4.51\%$
test_consolidate[reduce-overhead-None] 1.9089ms 1.7383ms 575.2599 Ops/s 554.6691 Ops/s $\color{#35bf28}+3.71\%$
test_consolidate_njt[False-None] 6.9571ms 6.5694ms 152.2217 Ops/s 151.1552 Ops/s $\color{#35bf28}+0.71\%$
test_to[False-False-None] 1.8944ms 1.7374ms 575.5773 Ops/s 581.6273 Ops/s $\color{#d91a1a}-1.04\%$
test_to[True-False-None] 1.5306ms 1.3167ms 759.4900 Ops/s 728.3904 Ops/s $\color{#35bf28}+4.27\%$
test_to[within-False-None] 4.3139ms 4.1445ms 241.2854 Ops/s 172.8556 Ops/s $\textbf{\color{#35bf28}+39.59\%}$
test_to[True-default-None] 5.6165ms 5.3311ms 187.5773 Ops/s 180.4422 Ops/s $\color{#35bf28}+3.95\%$
test_to_njt[False-False-None] 7.1930ms 6.9669ms 143.5365 Ops/s 135.3720 Ops/s $\textbf{\color{#35bf28}+6.03\%}$
test_to_njt[True-False-None] 5.7499ms 5.5006ms 181.7982 Ops/s 175.6928 Ops/s $\color{#35bf28}+3.48\%$
test_to_njt[within-False-None] 12.5031ms 12.2968ms 81.3220 Ops/s 79.2058 Ops/s $\color{#35bf28}+2.67\%$
test_creation[device0] 0.4565ms 81.0088μs 12.3443 KOps/s 12.1070 KOps/s $\color{#35bf28}+1.96\%$
test_creation_from_tensor 0.4543ms 84.4569μs 11.8404 KOps/s 11.6555 KOps/s $\color{#35bf28}+1.59\%$
test_add_one[memmap_tensor0] 0.2449ms 6.9343μs 144.2113 KOps/s 135.1120 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_contiguous[memmap_tensor0] 1.9611μs 0.4283μs 2.3348 MOps/s 2.3861 MOps/s $\color{#d91a1a}-2.15\%$
test_stack[memmap_tensor0] 0.1366ms 4.4544μs 224.4991 KOps/s 214.0227 KOps/s $\color{#35bf28}+4.90\%$
test_memmaptd_index 1.6213ms 0.2481ms 4.0311 KOps/s 3.7373 KOps/s $\textbf{\color{#35bf28}+7.86\%}$
test_memmaptd_index_astensor 0.5719ms 0.3086ms 3.2402 KOps/s 3.0030 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_memmaptd_index_op 0.9681ms 0.5582ms 1.7915 KOps/s 1.5202 KOps/s $\textbf{\color{#35bf28}+17.85\%}$
test_serialize_model 0.4228s 0.1731s 5.7766 Ops/s 7.6183 Ops/s $\textbf{\color{#d91a1a}-24.17\%}$
test_serialize_model_pickle 1.3495s 1.2113s 0.8255 Ops/s 0.8222 Ops/s $\color{#35bf28}+0.41\%$
test_serialize_weights 0.1313s 0.1300s 7.6904 Ops/s 7.6259 Ops/s $\color{#35bf28}+0.85\%$
test_serialize_weights_returnearly 0.3256s 54.4506ms 18.3653 Ops/s 14.6641 Ops/s $\textbf{\color{#35bf28}+25.24\%}$
test_serialize_weights_pickle 1.3759s 1.2202s 0.8196 Ops/s 0.8239 Ops/s $\color{#d91a1a}-0.53\%$
test_reshape_pytree 0.1964ms 22.1436μs 45.1597 KOps/s 43.6208 KOps/s $\color{#35bf28}+3.53\%$
test_reshape_td 0.1542ms 26.4020μs 37.8760 KOps/s 35.2343 KOps/s $\textbf{\color{#35bf28}+7.50\%}$
test_view_pytree 0.1650ms 21.9874μs 45.4805 KOps/s 44.2272 KOps/s $\color{#35bf28}+2.83\%$
test_view_td 83.7140μs 29.2082μs 34.2370 KOps/s 32.7808 KOps/s $\color{#35bf28}+4.44\%$
test_unbind_pytree 0.1493ms 27.7435μs 36.0445 KOps/s 34.9062 KOps/s $\color{#35bf28}+3.26\%$
test_unbind_td 0.7781ms 36.0558μs 27.7348 KOps/s 26.3362 KOps/s $\textbf{\color{#35bf28}+5.31\%}$
test_split_pytree 0.1427ms 29.2243μs 34.2181 KOps/s 32.6320 KOps/s $\color{#35bf28}+4.86\%$
test_split_td 0.9746ms 37.6252μs 26.5779 KOps/s 25.1527 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_add_pytree 0.1712ms 34.6304μs 28.8764 KOps/s 26.6952 KOps/s $\textbf{\color{#35bf28}+8.17\%}$
test_add_td 0.2278ms 46.1149μs 21.6850 KOps/s 18.8283 KOps/s $\textbf{\color{#35bf28}+15.17\%}$
test_compile_add_one_nested[tensordict-compile] 0.2712ms 0.1214ms 8.2367 KOps/s 7.8274 KOps/s $\textbf{\color{#35bf28}+5.23\%}$
test_compile_add_one_nested[tensordict-eager] 0.3320ms 0.1327ms 7.5384 KOps/s 7.3935 KOps/s $\color{#35bf28}+1.96\%$
test_compile_add_one_nested[pytree-compile] 0.2512ms 97.4635μs 10.2602 KOps/s 10.0309 KOps/s $\color{#35bf28}+2.29\%$
test_compile_add_one_nested[pytree-eager] 0.3188ms 0.1497ms 6.6816 KOps/s 6.5412 KOps/s $\color{#35bf28}+2.15\%$
test_compile_copy_nested[tensordict-compile] 0.2041ms 23.7634μs 42.0815 KOps/s 44.7345 KOps/s $\textbf{\color{#d91a1a}-5.93\%}$
test_compile_copy_nested[tensordict-eager] 0.1506ms 29.3760μs 34.0414 KOps/s 33.1417 KOps/s $\color{#35bf28}+2.71\%$
test_compile_copy_nested[pytree-compile] 0.4750ms 64.1107μs 15.5980 KOps/s 15.1087 KOps/s $\color{#35bf28}+3.24\%$
test_compile_copy_nested[pytree-eager] 0.1789ms 49.4303μs 20.2305 KOps/s 20.0907 KOps/s $\color{#35bf28}+0.70\%$
test_compile_add_one_flat[tensordict-compile] 0.2956ms 0.1432ms 6.9809 KOps/s 7.0552 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_add_one_flat[tensordict-eager] 0.3515ms 0.2171ms 4.6056 KOps/s 4.6263 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_add_one_flat[tensorclass-compile] 0.2528ms 98.7116μs 10.1305 KOps/s 10.0313 KOps/s $\color{#35bf28}+0.99\%$
test_compile_add_one_flat[tensorclass-eager] 0.2362ms 56.0620μs 17.8374 KOps/s 17.4935 KOps/s $\color{#35bf28}+1.97\%$
test_compile_add_one_flat[pytree-compile] 0.2982ms 0.1364ms 7.3335 KOps/s 7.3983 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_one_flat[pytree-eager] 0.6615ms 0.4793ms 2.0865 KOps/s 2.0351 KOps/s $\color{#35bf28}+2.53\%$
test_compile_add_self_flat[tensordict-eager] 0.4176ms 0.2615ms 3.8239 KOps/s 3.8674 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_add_self_flat[tensordict-compile] 0.2964ms 0.1447ms 6.9104 KOps/s 7.0046 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_add_self_flat[tensorclass-eager] 0.2323ms 68.0001μs 14.7059 KOps/s 14.7352 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_self_flat[tensorclass-compile] 0.2828ms 0.1005ms 9.9461 KOps/s 10.0343 KOps/s $\color{#d91a1a}-0.88\%$
test_compile_add_self_flat[pytree-eager] 0.5886ms 0.4011ms 2.4932 KOps/s 2.4705 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_self_flat[pytree-compile] 0.2750ms 0.1368ms 7.3085 KOps/s 7.4292 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_copy_flat[tensordict-compile] 0.1685ms 18.9437μs 52.7880 KOps/s 55.0293 KOps/s $\color{#d91a1a}-4.07\%$
test_compile_copy_flat[tensordict-eager] 57.7320μs 31.3901μs 31.8571 KOps/s 31.9784 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_copy_flat[pytree-compile] 0.2129ms 70.9265μs 14.0991 KOps/s 13.9613 KOps/s $\color{#35bf28}+0.99\%$
test_compile_copy_flat[pytree-eager] 74.8830μs 51.7770μs 19.3136 KOps/s 19.1485 KOps/s $\color{#35bf28}+0.86\%$
test_compile_assign_and_add[tensordict-compile] 1.6230ms 0.3911ms 2.5568 KOps/s 2.1738 KOps/s $\textbf{\color{#35bf28}+17.62\%}$
test_compile_assign_and_add[tensordict-eager] 2.9571ms 2.6681ms 374.8019 Ops/s 377.3664 Ops/s $\color{#d91a1a}-0.68\%$
test_compile_assign_and_add[pytree-compile] 1.5864ms 0.4372ms 2.2873 KOps/s 2.2324 KOps/s $\color{#35bf28}+2.46\%$
test_compile_assign_and_add[pytree-eager] 2.9609ms 2.6811ms 372.9765 Ops/s 370.7875 Ops/s $\color{#35bf28}+0.59\%$
test_compile_indexing[tensor-tensordict-compile] 0.3174ms 0.1222ms 8.1845 KOps/s 8.8013 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5966ms 85.1283μs 11.7470 KOps/s 11.9556 KOps/s $\color{#d91a1a}-1.75\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2996ms 0.1156ms 8.6529 KOps/s 9.3179 KOps/s $\textbf{\color{#d91a1a}-7.14\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.2742ms 73.0332μs 13.6924 KOps/s 14.6684 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_compile_indexing[tensor-pytree-compile] 0.2789ms 0.1155ms 8.6544 KOps/s 9.3835 KOps/s $\textbf{\color{#d91a1a}-7.77\%}$
test_compile_indexing[tensor-pytree-eager] 0.2333ms 73.0661μs 13.6862 KOps/s 14.6786 KOps/s $\textbf{\color{#d91a1a}-6.76\%}$
test_compile_indexing[slice-tensordict-compile] 0.2571ms 0.1075ms 9.3037 KOps/s 9.8047 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_compile_indexing[slice-tensordict-eager] 0.1383ms 17.0527μs 58.6419 KOps/s 55.3203 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2897ms 98.5185μs 10.1504 KOps/s 10.1550 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[slice-tensorclass-eager] 0.1264ms 15.9922μs 62.5306 KOps/s 60.4110 KOps/s $\color{#35bf28}+3.51\%$
test_compile_indexing[slice-pytree-compile] 0.2734ms 0.1012ms 9.8783 KOps/s 10.1307 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_indexing[slice-pytree-eager] 0.1763ms 15.8867μs 62.9459 KOps/s 60.4709 KOps/s $\color{#35bf28}+4.09\%$
test_compile_indexing[int-tensordict-compile] 0.2586ms 0.1045ms 9.5683 KOps/s 9.7016 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_indexing[int-tensordict-eager] 0.5747ms 17.2699μs 57.9043 KOps/s 56.2081 KOps/s $\color{#35bf28}+3.02\%$
test_compile_indexing[int-tensorclass-compile] 0.2949ms 0.1025ms 9.7565 KOps/s 10.1175 KOps/s $\color{#d91a1a}-3.57\%$
test_compile_indexing[int-tensorclass-eager] 0.1754ms 15.7938μs 63.3159 KOps/s 61.1667 KOps/s $\color{#35bf28}+3.51\%$
test_compile_indexing[int-pytree-compile] 0.2814ms 0.1029ms 9.7175 KOps/s 9.8721 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_indexing[int-pytree-eager] 0.2192ms 16.5157μs 60.5484 KOps/s 60.8818 KOps/s $\color{#d91a1a}-0.55\%$
test_mod_add[eager] 0.2261ms 36.3526μs 27.5084 KOps/s 25.0423 KOps/s $\textbf{\color{#35bf28}+9.85\%}$
test_mod_add[compile] 0.2876ms 80.7007μs 12.3915 KOps/s 12.2190 KOps/s $\color{#35bf28}+1.41\%$
test_mod_add[compile-overhead] 0.3389ms 0.1692ms 5.9087 KOps/s 5.5688 KOps/s $\textbf{\color{#35bf28}+6.10\%}$
test_mod_wrap[eager] 0.4030ms 0.2504ms 3.9941 KOps/s 3.9122 KOps/s $\color{#35bf28}+2.09\%$
test_mod_wrap[compile] 0.4194ms 0.2808ms 3.5614 KOps/s 3.4270 KOps/s $\color{#35bf28}+3.92\%$
test_mod_wrap[compile-overhead] 7.0319ms 3.7238ms 268.5403 Ops/s 274.3183 Ops/s $\color{#d91a1a}-2.11\%$
test_mod_wrap_and_backward[eager] 1.6870ms 1.3809ms 724.1893 Ops/s 677.7905 Ops/s $\textbf{\color{#35bf28}+6.85\%}$
test_mod_wrap_and_backward[compile] 1.6017ms 1.3882ms 720.3731 Ops/s 719.5755 Ops/s $\color{#35bf28}+0.11\%$
test_mod_wrap_and_backward[compile-overhead] 1.4774ms 0.9993ms 1.0007 KOps/s 945.3912 Ops/s $\textbf{\color{#35bf28}+5.85\%}$
test_seq_add[eager] 0.3133ms 0.1122ms 8.9157 KOps/s 8.2473 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_seq_add[compile] 0.2864ms 88.9942μs 11.2367 KOps/s 10.7240 KOps/s $\color{#35bf28}+4.78\%$
test_seq_add[compile-overhead] 0.2815ms 0.1304ms 7.6664 KOps/s 7.3741 KOps/s $\color{#35bf28}+3.96\%$
test_seq_wrap[eager] 0.5946ms 0.4149ms 2.4104 KOps/s 2.2412 KOps/s $\textbf{\color{#35bf28}+7.55\%}$
test_seq_wrap[compile] 0.4483ms 0.2979ms 3.3567 KOps/s 3.1987 KOps/s $\color{#35bf28}+4.94\%$
test_seq_wrap[compile-overhead] 0.3992ms 0.2270ms 4.4062 KOps/s 4.2933 KOps/s $\color{#35bf28}+2.63\%$
test_func_call_runtime[False-eager] 0.9568ms 0.7546ms 1.3251 KOps/s 1.3200 KOps/s $\color{#35bf28}+0.39\%$
test_func_call_runtime[False-compile] 0.9312ms 0.7618ms 1.3127 KOps/s 1.3315 KOps/s $\color{#d91a1a}-1.41\%$
test_func_call_runtime[False-compile-overhead] 0.5302ms 0.3663ms 2.7301 KOps/s 2.6936 KOps/s $\color{#35bf28}+1.35\%$
test_func_call_runtime[True-eager] 1.0976ms 0.9089ms 1.1002 KOps/s 1.0909 KOps/s $\color{#35bf28}+0.85\%$
test_func_call_runtime[True-compile] 0.9402ms 0.7571ms 1.3208 KOps/s 1.2981 KOps/s $\color{#35bf28}+1.75\%$
test_func_call_runtime[True-compile-overhead] 0.5303ms 0.3881ms 2.5763 KOps/s 2.5547 KOps/s $\color{#35bf28}+0.85\%$
test_func_call_cm_runtime[False-eager] 0.9030ms 0.7398ms 1.3517 KOps/s 1.2797 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_func_call_cm_runtime[False-compile] 1.0472ms 0.7798ms 1.2824 KOps/s 1.3228 KOps/s $\color{#d91a1a}-3.05\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5050ms 0.3685ms 2.7135 KOps/s 2.6805 KOps/s $\color{#35bf28}+1.23\%$
test_func_call_cm_runtime[True-eager] 1.2492ms 1.0310ms 969.9701 Ops/s 979.6320 Ops/s $\color{#d91a1a}-0.99\%$
test_func_call_cm_runtime[True-compile] 0.9528ms 0.7856ms 1.2729 KOps/s 1.2408 KOps/s $\color{#35bf28}+2.59\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6030ms 0.4166ms 2.4001 KOps/s 2.3945 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_func_call_cm_runtime[eager] 2.5862ms 2.1329ms 468.8490 Ops/s 468.7779 Ops/s $\color{#35bf28}+0.02\%$
test_vmap_func_call_cm_runtime[compile] 0.9720ms 0.8032ms 1.2450 KOps/s 1.2099 KOps/s $\color{#35bf28}+2.90\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5661ms 0.4179ms 2.3928 KOps/s 2.3739 KOps/s $\color{#35bf28}+0.79\%$
test_distributed 3.0002ms 0.1828ms 5.4709 KOps/s 8.4904 KOps/s $\textbf{\color{#d91a1a}-35.56\%}$
test_tdmodule 32.4010μs 18.5749μs 53.8362 KOps/s 46.4361 KOps/s $\textbf{\color{#35bf28}+15.94\%}$
test_tdmodule_dispatch 55.9530μs 33.5259μs 29.8277 KOps/s 26.1586 KOps/s $\textbf{\color{#35bf28}+14.03\%}$
test_tdseq 39.0010μs 19.3437μs 51.6965 KOps/s 44.8389 KOps/s $\textbf{\color{#35bf28}+15.29\%}$
test_tdseq_dispatch 0.1165ms 36.3313μs 27.5245 KOps/s 23.9436 KOps/s $\textbf{\color{#35bf28}+14.96\%}$
test_instantiation_functorch 1.9757ms 1.5627ms 639.9212 Ops/s 629.4835 Ops/s $\color{#35bf28}+1.66\%$
test_exec_functorch 0.2602ms 0.1454ms 6.8770 KOps/s 6.5825 KOps/s $\color{#35bf28}+4.47\%$
test_exec_functional_call 0.5508ms 0.1389ms 7.1998 KOps/s 6.9847 KOps/s $\color{#35bf28}+3.08\%$
test_exec_td_decorator 0.6014ms 0.1896ms 5.2733 KOps/s 5.2206 KOps/s $\color{#35bf28}+1.01\%$
test_vmap_mlp_speed_decorator[True-True] 0.8753ms 0.6884ms 1.4527 KOps/s 1.3789 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_vmap_mlp_speed_decorator[True-False] 1.1057ms 0.6864ms 1.4568 KOps/s 1.4273 KOps/s $\color{#35bf28}+2.07\%$
test_vmap_mlp_speed_decorator[False-True] 1.0279ms 0.6026ms 1.6594 KOps/s 1.6113 KOps/s $\color{#35bf28}+2.99\%$
test_vmap_mlp_speed_decorator[False-False] 1.0165ms 0.6045ms 1.6543 KOps/s 1.5833 KOps/s $\color{#35bf28}+4.48\%$
test_vmap_transformer_speed_decorator[True-True] 19.9332ms 19.4430ms 51.4323 Ops/s 51.4767 Ops/s $\color{#d91a1a}-0.09\%$
test_vmap_transformer_speed_decorator[True-False] 20.4896ms 19.5419ms 51.1722 Ops/s 51.6756 Ops/s $\color{#d91a1a}-0.97\%$
test_vmap_transformer_speed_decorator[False-True] 19.6752ms 19.4321ms 51.4613 Ops/s 51.8705 Ops/s $\color{#d91a1a}-0.79\%$
test_vmap_transformer_speed_decorator[False-False] 19.7756ms 19.4081ms 51.5248 Ops/s 51.8045 Ops/s $\color{#d91a1a}-0.54\%$
test_to_module_speed[True] 1.3817ms 0.9607ms 1.0409 KOps/s 1.0348 KOps/s $\color{#35bf28}+0.58\%$
test_to_module_speed[False] 1.3769ms 0.9538ms 1.0484 KOps/s 1.0485 KOps/s $\color{#d91a1a}-0.01\%$
test_tc_init 0.1566ms 33.7907μs 29.5940 KOps/s 26.0457 KOps/s $\textbf{\color{#35bf28}+13.62\%}$
test_tc_init_nested 0.1134ms 68.0980μs 14.6847 KOps/s 13.2756 KOps/s $\textbf{\color{#35bf28}+10.61\%}$
test_tc_first_layer_tensor 0.4049ms 0.8156μs 1.2261 MOps/s 1.4614 MOps/s $\textbf{\color{#d91a1a}-16.10\%}$
test_tc_first_layer_nontensor 0.1017ms 2.2412μs 446.1892 KOps/s 432.8250 KOps/s $\color{#35bf28}+3.09\%$
test_tc_second_layer_tensor 23.3510μs 1.4921μs 670.2109 KOps/s 712.8577 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_tc_second_layer_nontensor 0.4156ms 3.0091μs 332.3216 KOps/s 329.1568 KOps/s $\color{#35bf28}+0.96\%$
test_unbind 0.2240s 11.2624ms 88.7907 Ops/s 141.1256 Ops/s $\textbf{\color{#d91a1a}-37.08\%}$
test_full_like 11.4232ms 10.0634ms 99.3698 Ops/s 98.9624 Ops/s $\color{#35bf28}+0.41\%$
test_zeros_like 5.8725ms 4.3690ms 228.8854 Ops/s 229.0482 Ops/s $\color{#d91a1a}-0.07\%$
test_ones_like 5.8041ms 4.4558ms 224.4244 Ops/s 223.4283 Ops/s $\color{#35bf28}+0.45\%$
test_clone 8.1707ms 6.9824ms 143.2171 Ops/s 103.0967 Ops/s $\textbf{\color{#35bf28}+38.92\%}$
test_squeeze 0.1303ms 9.6911μs 103.1871 KOps/s 105.6238 KOps/s $\color{#d91a1a}-2.31\%$
test_unsqueeze 0.2329ms 71.6196μs 13.9627 KOps/s 13.2709 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_split 0.3069ms 0.1570ms 6.3704 KOps/s 5.9938 KOps/s $\textbf{\color{#35bf28}+6.28\%}$
test_permute 0.3262ms 0.1777ms 5.6264 KOps/s 5.2931 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_stack 53.2448ms 51.5248ms 19.4081 Ops/s 19.1990 Ops/s $\color{#35bf28}+1.09\%$
test_cat 52.4337ms 51.5639ms 19.3934 Ops/s 19.2758 Ops/s $\color{#35bf28}+0.61\%$
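
For readers checking the numbers, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below recomputes it for three rows copied from the table. It is not the benchmark bot's code: the helper names are hypothetical, and the 5% cutoff for bold entries is an assumption inferred from which rows are bolded.

```python
# Hypothetical helpers for sanity-checking the table; not the bot's implementation.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput such as 1.0007 KOps/s to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def percent_change(pr_ops: float, head_ops: float) -> float:
    """Relative change of the PR throughput with respect to repo HEAD, in percent."""
    return (pr_ops / head_ops - 1.0) * 100.0


# (name, Ops, unit, Ops on Repo HEAD, unit) copied from the rows above.
rows = [
    ("test_set", 73.9744, "KOps/s", 59.4278, "KOps/s"),
    ("test_distributed", 5.4709, "KOps/s", 8.4904, "KOps/s"),
    ("test_mod_wrap_and_backward[compile-overhead]", 1.0007, "KOps/s", 945.3912, "Ops/s"),
]

for name, ops, unit, head_ops, head_unit in rows:
    change = percent_change(
        to_ops_per_second(ops, unit),
        to_ops_per_second(head_ops, head_unit),
    )
    # Assumption: the report bolds changes larger than 5% in magnitude.
    emphasis = "bold" if abs(change) > 5.0 else "plain"
    print(f"{name}: {change:+.2f}% ({emphasis})")
```

The third row mixes KOps/s and Ops/s, so both columns need converting to a common unit before dividing.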
