[Feature] sorted keys, values and items #965

Merged (4 commits) on Sep 16, 2024

@vmoens commented Aug 13, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 13, 2024
ghstack-source-id: 624542b4f787547695d42d8808d1dafd89110a43
Pull Request resolved: #965
@facebook-github-bot added the "CLA Signed" label Aug 13, 2024

github-actions bot commented Aug 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 52.1070μs 20.2651μs 49.3459 KOps/s 48.4511 KOps/s $\color{#35bf28}+1.85\%$
test_plain_set_stack_nested 48.5010μs 20.1813μs 49.5509 KOps/s 48.3204 KOps/s $\color{#35bf28}+2.55\%$
test_plain_set_nested_inplace 54.0100μs 21.7357μs 46.0073 KOps/s 44.6213 KOps/s $\color{#35bf28}+3.11\%$
test_plain_set_stack_nested_inplace 57.8080μs 22.0084μs 45.4372 KOps/s 45.2445 KOps/s $\color{#35bf28}+0.43\%$
test_items 19.2660μs 4.2463μs 235.4991 KOps/s 243.5332 KOps/s $\color{#d91a1a}-3.30\%$
test_items_nested 0.7413ms 0.3569ms 2.8017 KOps/s 3.0529 KOps/s $\textbf{\color{#d91a1a}-8.23\%}$
test_items_nested_locked 0.5952ms 0.3589ms 2.7862 KOps/s 3.0570 KOps/s $\textbf{\color{#d91a1a}-8.86\%}$
test_items_nested_leaf 0.1529ms 69.4880μs 14.3910 KOps/s 11.8163 KOps/s $\textbf{\color{#35bf28}+21.79\%}$
test_items_stack_nested 0.5305ms 0.3632ms 2.7536 KOps/s 3.0462 KOps/s $\textbf{\color{#d91a1a}-9.60\%}$
test_items_stack_nested_leaf 0.1394ms 71.5375μs 13.9787 KOps/s 11.5974 KOps/s $\textbf{\color{#35bf28}+20.53\%}$
test_items_stack_nested_locked 0.5276ms 0.3597ms 2.7803 KOps/s 3.0205 KOps/s $\textbf{\color{#d91a1a}-7.95\%}$
test_keys 41.6980μs 3.5664μs 280.3951 KOps/s 271.2732 KOps/s $\color{#35bf28}+3.36\%$
test_keys_nested 0.1725ms 98.3773μs 10.1649 KOps/s 10.0805 KOps/s $\color{#35bf28}+0.84\%$
test_keys_nested_locked 0.7405ms 0.1065ms 9.3933 KOps/s 9.8951 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_keys_nested_leaf 0.1950ms 82.1305μs 12.1757 KOps/s 12.0634 KOps/s $\color{#35bf28}+0.93\%$
test_keys_stack_nested 0.1728ms 97.5362μs 10.2526 KOps/s 10.2319 KOps/s $\color{#35bf28}+0.20\%$
test_keys_stack_nested_leaf 0.1378ms 80.2273μs 12.4646 KOps/s 12.2701 KOps/s $\color{#35bf28}+1.59\%$
test_keys_stack_nested_locked 0.1721ms 0.1029ms 9.7171 KOps/s 9.7691 KOps/s $\color{#d91a1a}-0.53\%$
test_values 47.6792μs 1.0895μs 917.8783 KOps/s 927.2756 KOps/s $\color{#d91a1a}-1.01\%$
test_values_nested 0.1247ms 71.9568μs 13.8972 KOps/s 20.9467 KOps/s $\textbf{\color{#d91a1a}-33.65\%}$
test_values_nested_locked 0.1214ms 71.7189μs 13.9433 KOps/s 21.0252 KOps/s $\textbf{\color{#d91a1a}-33.68\%}$
test_values_nested_leaf 0.1109ms 60.0198μs 16.6612 KOps/s 23.4613 KOps/s $\textbf{\color{#d91a1a}-28.98\%}$
test_values_stack_nested 0.1245ms 72.1928μs 13.8518 KOps/s 20.5420 KOps/s $\textbf{\color{#d91a1a}-32.57\%}$
test_values_stack_nested_leaf 0.1096ms 59.0801μs 16.9262 KOps/s 23.5476 KOps/s $\textbf{\color{#d91a1a}-28.12\%}$
test_values_stack_nested_locked 0.1274ms 72.5535μs 13.7829 KOps/s 19.8645 KOps/s $\textbf{\color{#d91a1a}-30.62\%}$
test_membership 14.3070μs 0.8707μs 1.1486 MOps/s 1.4468 MOps/s $\textbf{\color{#d91a1a}-20.61\%}$
test_membership_nested 23.0130μs 2.7689μs 361.1501 KOps/s 389.6639 KOps/s $\textbf{\color{#d91a1a}-7.32\%}$
test_membership_nested_leaf 24.9460μs 2.7480μs 363.9015 KOps/s 387.6513 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_membership_stacked_nested 28.4150μs 2.6870μs 372.1658 KOps/s 388.4083 KOps/s $\color{#d91a1a}-4.18\%$
test_membership_stacked_nested_leaf 42.9500μs 2.7256μs 366.8935 KOps/s 380.4529 KOps/s $\color{#d91a1a}-3.56\%$
test_membership_nested_last 60.8140μs 3.9469μs 253.3624 KOps/s 261.9272 KOps/s $\color{#d91a1a}-3.27\%$
test_membership_nested_leaf_last 37.3800μs 3.9629μs 252.3427 KOps/s 262.6095 KOps/s $\color{#d91a1a}-3.91\%$
test_membership_stacked_nested_last 71.9640μs 12.8393μs 77.8860 KOps/s 262.0625 KOps/s $\textbf{\color{#d91a1a}-70.28\%}$
test_membership_stacked_nested_leaf_last 44.8040μs 12.7880μs 78.1981 KOps/s 257.0873 KOps/s $\textbf{\color{#d91a1a}-69.58\%}$
test_nested_getleaf 52.5580μs 10.5965μs 94.3710 KOps/s 93.7338 KOps/s $\color{#35bf28}+0.68\%$
test_nested_get 52.2180μs 10.1245μs 98.7707 KOps/s 97.7713 KOps/s $\color{#35bf28}+1.02\%$
test_stacked_getleaf 41.4670μs 10.5904μs 94.4250 KOps/s 92.8832 KOps/s $\color{#35bf28}+1.66\%$
test_stacked_get 52.2980μs 10.0795μs 99.2114 KOps/s 97.5492 KOps/s $\color{#35bf28}+1.70\%$
test_nested_getitemleaf 39.7040μs 11.0446μs 90.5419 KOps/s 90.7219 KOps/s $\color{#d91a1a}-0.20\%$
test_nested_getitem 39.7950μs 10.1285μs 98.7315 KOps/s 99.3377 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getitemleaf 51.7970μs 10.8400μs 92.2513 KOps/s 91.8072 KOps/s $\color{#35bf28}+0.48\%$
test_stacked_getitem 53.1800μs 10.1108μs 98.9041 KOps/s 98.3647 KOps/s $\color{#35bf28}+0.55\%$
test_lock_nested 1.3205ms 0.4771ms 2.0961 KOps/s 2.1000 KOps/s $\color{#d91a1a}-0.18\%$
test_lock_stack_nested 0.7470ms 0.4341ms 2.3034 KOps/s 2.2147 KOps/s $\color{#35bf28}+4.00\%$
test_unlock_nested 88.6445ms 0.4876ms 2.0509 KOps/s 2.4840 KOps/s $\textbf{\color{#d91a1a}-17.44\%}$
test_unlock_stack_nested 0.5808ms 0.3552ms 2.8153 KOps/s 2.6698 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_flatten_speed 0.1642ms 86.7802μs 11.5234 KOps/s 9.5517 KOps/s $\textbf{\color{#35bf28}+20.64\%}$
test_unflatten_speed 0.6186ms 0.4643ms 2.1537 KOps/s 2.1973 KOps/s $\color{#d91a1a}-1.99\%$
test_common_ops 3.7617ms 1.1154ms 896.5762 Ops/s 909.0850 Ops/s $\color{#d91a1a}-1.38\%$
test_creation 71.7540μs 2.0238μs 494.1199 KOps/s 479.2231 KOps/s $\color{#35bf28}+3.11\%$
test_creation_empty 67.8570μs 17.6015μs 56.8135 KOps/s 54.2937 KOps/s $\color{#35bf28}+4.64\%$
test_creation_nested_1 48.4210μs 20.8501μs 47.9615 KOps/s 46.3247 KOps/s $\color{#35bf28}+3.53\%$
test_creation_nested_2 65.1320μs 24.9004μs 40.1599 KOps/s 38.1693 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_clone 77.0450μs 17.1879μs 58.1804 KOps/s 58.5099 KOps/s $\color{#d91a1a}-0.56\%$
test_getitem[int] 1.2164ms 16.6074μs 60.2141 KOps/s 59.9975 KOps/s $\color{#35bf28}+0.36\%$
test_getitem[slice_int] 0.1424ms 30.0006μs 33.3327 KOps/s 33.1397 KOps/s $\color{#35bf28}+0.58\%$
test_getitem[range] 0.5152ms 57.3652μs 17.4322 KOps/s 17.4032 KOps/s $\color{#35bf28}+0.17\%$
test_getitem[tuple] 0.1428ms 24.2141μs 41.2982 KOps/s 39.6001 KOps/s $\color{#35bf28}+4.29\%$
test_getitem[list] 0.1659ms 52.1363μs 19.1805 KOps/s 18.8183 KOps/s $\color{#35bf28}+1.92\%$
test_setitem_dim[int] 79.6790μs 32.5472μs 30.7246 KOps/s 30.0865 KOps/s $\color{#35bf28}+2.12\%$
test_setitem_dim[slice_int] 0.1264ms 60.2758μs 16.5904 KOps/s 16.5241 KOps/s $\color{#35bf28}+0.40\%$
test_setitem_dim[range] 0.1787ms 86.0610μs 11.6197 KOps/s 11.9405 KOps/s $\color{#d91a1a}-2.69\%$
test_setitem_dim[tuple] 0.1060ms 49.7380μs 20.1054 KOps/s 20.5775 KOps/s $\color{#d91a1a}-2.29\%$
test_setitem 0.1038ms 29.7175μs 33.6502 KOps/s 34.4779 KOps/s $\color{#d91a1a}-2.40\%$
test_set 75.5210μs 28.4406μs 35.1610 KOps/s 35.4663 KOps/s $\color{#d91a1a}-0.86\%$
test_set_shared 2.1197ms 0.2183ms 4.5806 KOps/s 4.7504 KOps/s $\color{#d91a1a}-3.58\%$
test_update 0.1560ms 35.6206μs 28.0736 KOps/s 27.6868 KOps/s $\color{#35bf28}+1.40\%$
test_update_nested 0.2259ms 45.9637μs 21.7563 KOps/s 21.5593 KOps/s $\color{#35bf28}+0.91\%$
test_update__nested 87.0830μs 33.8616μs 29.5320 KOps/s 29.3962 KOps/s $\color{#35bf28}+0.46\%$
test_set_nested 0.1976ms 31.2537μs 31.9962 KOps/s 32.6166 KOps/s $\color{#d91a1a}-1.90\%$
test_set_nested_new 0.1316ms 36.5090μs 27.3905 KOps/s 27.1305 KOps/s $\color{#35bf28}+0.96\%$
test_select 1.0376ms 53.7856μs 18.5923 KOps/s 18.5073 KOps/s $\color{#35bf28}+0.46\%$
test_select_nested 0.1251ms 60.5704μs 16.5097 KOps/s 16.7616 KOps/s $\color{#d91a1a}-1.50\%$
test_exclude_nested 0.1741ms 75.9350μs 13.1692 KOps/s 13.4567 KOps/s $\color{#d91a1a}-2.14\%$
test_empty[True] 0.5339ms 0.3188ms 3.1369 KOps/s 3.2046 KOps/s $\color{#d91a1a}-2.11\%$
test_empty[False] 11.2335μs 1.1895μs 840.7121 KOps/s 751.7772 KOps/s $\textbf{\color{#35bf28}+11.83\%}$
test_unbind_speed 0.3916ms 0.3001ms 3.3326 KOps/s 3.3270 KOps/s $\color{#35bf28}+0.17\%$
test_unbind_speed_stack0 0.5260ms 0.2864ms 3.4919 KOps/s 3.3618 KOps/s $\color{#35bf28}+3.87\%$
test_unbind_speed_stack1 90.5376ms 0.7764ms 1.2880 KOps/s 1.3242 KOps/s $\color{#d91a1a}-2.73\%$
test_split 89.1654ms 2.1311ms 469.2427 Ops/s 451.7181 Ops/s $\color{#35bf28}+3.88\%$
test_chunk 2.1928ms 1.9644ms 509.0590 Ops/s 492.2014 Ops/s $\color{#35bf28}+3.42\%$
test_creation[device0] 0.2766ms 0.1176ms 8.5036 KOps/s 8.5082 KOps/s $\color{#d91a1a}-0.05\%$
test_creation_from_tensor 3.5200ms 0.1190ms 8.4052 KOps/s 8.4482 KOps/s $\color{#d91a1a}-0.51\%$
test_add_one[memmap_tensor0] 0.4269ms 7.3110μs 136.7810 KOps/s 128.5164 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_contiguous[memmap_tensor0] 33.6630μs 1.9108μs 523.3469 KOps/s 511.2483 KOps/s $\color{#35bf28}+2.37\%$
test_stack[memmap_tensor0] 96.9620μs 5.5550μs 180.0173 KOps/s 174.6505 KOps/s $\color{#35bf28}+3.07\%$
test_memmaptd_index 1.1974ms 0.3920ms 2.5513 KOps/s 1.8752 KOps/s $\textbf{\color{#35bf28}+36.05\%}$
test_memmaptd_index_astensor 0.7585ms 0.4654ms 2.1485 KOps/s 2.0166 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_memmaptd_index_op 2.6626ms 1.0000ms 999.9814 Ops/s 949.6013 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_serialize_model 0.1330s 0.1194s 8.3736 Ops/s 8.0370 Ops/s $\color{#35bf28}+4.19\%$
test_serialize_model_pickle 0.4714s 0.3990s 2.5065 Ops/s 2.5412 Ops/s $\color{#d91a1a}-1.37\%$
test_serialize_weights 0.1221s 0.1161s 8.6136 Ops/s 8.5162 Ops/s $\color{#35bf28}+1.14\%$
test_serialize_weights_returnearly 0.2515s 0.1745s 5.7298 Ops/s 5.6301 Ops/s $\color{#35bf28}+1.77\%$
test_serialize_weights_pickle 0.4543s 0.3996s 2.5024 Ops/s 2.4883 Ops/s $\color{#35bf28}+0.57\%$
test_serialize_weights_filesystem 0.1535s 0.1434s 6.9730 Ops/s 6.8997 Ops/s $\color{#35bf28}+1.06\%$
test_serialize_model_filesystem 0.1581s 0.1490s 6.7099 Ops/s 6.0093 Ops/s $\textbf{\color{#35bf28}+11.66\%}$
test_reshape_pytree 0.1053ms 37.7359μs 26.5000 KOps/s 25.1319 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_reshape_td 0.1361ms 44.1607μs 22.6446 KOps/s 21.4211 KOps/s $\textbf{\color{#35bf28}+5.71\%}$
test_view_pytree 84.5980μs 37.6748μs 26.5429 KOps/s 25.8141 KOps/s $\color{#35bf28}+2.82\%$
test_view_td 0.1468ms 51.1679μs 19.5435 KOps/s 18.6494 KOps/s $\color{#35bf28}+4.79\%$
test_unbind_pytree 79.4690μs 35.4625μs 28.1988 KOps/s 27.5962 KOps/s $\color{#35bf28}+2.18\%$
test_unbind_td 0.3301ms 44.4669μs 22.4886 KOps/s 21.9486 KOps/s $\color{#35bf28}+2.46\%$
test_split_pytree 76.9740μs 36.6402μs 27.2924 KOps/s 26.5461 KOps/s $\color{#35bf28}+2.81\%$
test_split_td 0.4800ms 55.5391μs 18.0053 KOps/s 17.4243 KOps/s $\color{#35bf28}+3.33\%$
test_add_pytree 0.1083ms 44.1334μs 22.6586 KOps/s 22.8129 KOps/s $\color{#d91a1a}-0.68\%$
test_add_td 0.2834ms 81.9953μs 12.1958 KOps/s 12.3944 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_one_nested[tensordict-compile] 0.1178ms 56.0168μs 17.8518 KOps/s 17.2868 KOps/s $\color{#35bf28}+3.27\%$
test_compile_add_one_nested[tensordict-eager] 0.2698ms 0.1759ms 5.6837 KOps/s 5.2927 KOps/s $\textbf{\color{#35bf28}+7.39\%}$
test_compile_add_one_nested[pytree-compile] 0.1231ms 56.0164μs 17.8519 KOps/s 17.5862 KOps/s $\color{#35bf28}+1.51\%$
test_compile_add_one_nested[pytree-eager] 0.2187ms 0.1407ms 7.1067 KOps/s 7.0232 KOps/s $\color{#35bf28}+1.19\%$
test_compile_copy_nested[tensordict-compile] 45.1040μs 20.8359μs 47.9940 KOps/s 45.8144 KOps/s $\color{#35bf28}+4.76\%$
test_compile_copy_nested[tensordict-eager] 0.1177ms 67.8687μs 14.7343 KOps/s 14.9785 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_copy_nested[pytree-compile] 0.1316ms 74.5424μs 13.4152 KOps/s 13.3567 KOps/s $\color{#35bf28}+0.44\%$
test_compile_copy_nested[pytree-eager] 0.1551ms 67.9248μs 14.7222 KOps/s 14.6517 KOps/s $\color{#35bf28}+0.48\%$
test_compile_add_one_flat[tensordict-compile] 0.3804ms 0.1702ms 5.8758 KOps/s 5.8152 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_one_flat[tensordict-eager] 0.3636ms 0.1871ms 5.3438 KOps/s 5.3258 KOps/s $\color{#35bf28}+0.34\%$
test_compile_add_one_flat[tensorclass-compile] 0.1238ms 46.8338μs 21.3521 KOps/s 21.0813 KOps/s $\color{#35bf28}+1.28\%$
test_compile_add_one_flat[tensorclass-eager] 0.1800ms 68.9206μs 14.5095 KOps/s 14.6901 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_add_one_flat[pytree-compile] 0.2830ms 0.1723ms 5.8052 KOps/s 5.7540 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_one_flat[pytree-eager] 0.3676ms 0.2826ms 3.5382 KOps/s 3.4102 KOps/s $\color{#35bf28}+3.75\%$
test_compile_add_self_flat[tensordict-eager] 0.4323ms 0.1997ms 5.0076 KOps/s 4.9987 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_self_flat[tensordict-compile] 0.4054ms 0.1736ms 5.7600 KOps/s 5.7707 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_add_self_flat[tensorclass-eager] 0.1762ms 62.3071μs 16.0495 KOps/s 15.9812 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_self_flat[tensorclass-compile] 0.1151ms 46.1664μs 21.6608 KOps/s 20.5170 KOps/s $\textbf{\color{#35bf28}+5.57\%}$
test_compile_add_self_flat[pytree-eager] 0.4236ms 0.2281ms 4.3840 KOps/s 4.1845 KOps/s $\color{#35bf28}+4.77\%$
test_compile_add_self_flat[pytree-compile] 0.2995ms 0.1727ms 5.7902 KOps/s 5.6524 KOps/s $\color{#35bf28}+2.44\%$
test_compile_copy_flat[tensordict-compile] 0.1973ms 0.1011ms 9.8891 KOps/s 9.8954 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_copy_flat[tensordict-eager] 0.1146ms 58.0725μs 17.2199 KOps/s 17.2281 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_flat[pytree-compile] 0.1744ms 78.3982μs 12.7554 KOps/s 12.6374 KOps/s $\color{#35bf28}+0.93\%$
test_compile_copy_flat[pytree-eager] 0.1591ms 69.1612μs 14.4590 KOps/s 14.3407 KOps/s $\color{#35bf28}+0.82\%$
test_compile_assign_and_add[tensordict-compile] 0.2954ms 0.1968ms 5.0823 KOps/s 5.1335 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_assign_and_add[tensordict-eager] 2.2246ms 1.6379ms 610.5278 Ops/s 609.2051 Ops/s $\color{#35bf28}+0.22\%$
test_compile_assign_and_add[pytree-compile] 0.3002ms 0.1952ms 5.1231 KOps/s 5.1333 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_assign_and_add[pytree-eager] 1.2504ms 1.0774ms 928.1424 Ops/s 897.2205 Ops/s $\color{#35bf28}+3.45\%$
test_compile_assign_and_add_stack[compile] 0.7225ms 0.4195ms 2.3835 KOps/s 2.3235 KOps/s $\color{#35bf28}+2.58\%$
test_compile_assign_and_add_stack[eager] 4.1130ms 3.6495ms 274.0134 Ops/s 262.9062 Ops/s $\color{#35bf28}+4.22\%$
test_compile_indexing[tensor-tensordict-compile] 81.1420μs 33.2383μs 30.0858 KOps/s 28.0215 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_compile_indexing[tensor-tensordict-eager] 0.6185ms 47.0070μs 21.2734 KOps/s 20.4917 KOps/s $\color{#35bf28}+3.82\%$
test_compile_indexing[tensor-tensorclass-compile] 72.5060μs 28.4714μs 35.1229 KOps/s 32.8877 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_compile_indexing[tensor-tensorclass-eager] 97.6330μs 28.4043μs 35.2060 KOps/s 35.1333 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[tensor-pytree-compile] 87.0730μs 28.4078μs 35.2016 KOps/s 32.9255 KOps/s $\textbf{\color{#35bf28}+6.91\%}$
test_compile_indexing[tensor-pytree-eager] 77.5350μs 27.5043μs 36.3580 KOps/s 35.2086 KOps/s $\color{#35bf28}+3.26\%$
test_compile_indexing[slice-tensordict-compile] 0.1753ms 72.1465μs 13.8607 KOps/s 13.4549 KOps/s $\color{#35bf28}+3.02\%$
test_compile_indexing[slice-tensordict-eager] 0.7960ms 27.6579μs 36.1560 KOps/s 36.6882 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_indexing[slice-tensorclass-compile] 0.1317ms 65.3768μs 15.2959 KOps/s 14.3673 KOps/s $\textbf{\color{#35bf28}+6.46\%}$
test_compile_indexing[slice-tensorclass-eager] 83.8770μs 22.3164μs 44.8102 KOps/s 43.3810 KOps/s $\color{#35bf28}+3.29\%$
test_compile_indexing[slice-pytree-compile] 0.1235ms 65.0719μs 15.3676 KOps/s 14.6334 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_compile_indexing[slice-pytree-eager] 54.1510μs 22.2942μs 44.8547 KOps/s 44.1390 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[int-tensordict-compile] 0.1738ms 71.1566μs 14.0535 KOps/s 13.4841 KOps/s $\color{#35bf28}+4.22\%$
test_compile_indexing[int-tensordict-eager] 0.8546ms 26.8807μs 37.2015 KOps/s 37.5383 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_indexing[int-tensorclass-compile] 0.1721ms 65.5037μs 15.2663 KOps/s 14.5307 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_compile_indexing[int-tensorclass-eager] 67.0850μs 22.2670μs 44.9094 KOps/s 43.9992 KOps/s $\color{#35bf28}+2.07\%$
test_compile_indexing[int-pytree-compile] 0.1372ms 65.3438μs 15.3037 KOps/s 14.5704 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_compile_indexing[int-pytree-eager] 60.4030μs 22.2924μs 44.8584 KOps/s 44.0997 KOps/s $\color{#35bf28}+1.72\%$
test_mod_add[eager] 83.7070μs 24.5135μs 40.7938 KOps/s 41.2293 KOps/s $\color{#d91a1a}-1.06\%$
test_mod_add[compile] 0.1089ms 37.7943μs 26.4590 KOps/s 25.4837 KOps/s $\color{#35bf28}+3.83\%$
test_mod_add[compile-overhead] 0.1144ms 37.8827μs 26.3973 KOps/s 25.4463 KOps/s $\color{#35bf28}+3.74\%$
test_mod_wrap[eager] 0.4107ms 0.2079ms 4.8096 KOps/s 4.7458 KOps/s $\color{#35bf28}+1.34\%$
test_mod_wrap[compile] 0.3456ms 0.2310ms 4.3292 KOps/s 4.2647 KOps/s $\color{#35bf28}+1.51\%$
test_mod_wrap[compile-overhead] 0.4175ms 0.2311ms 4.3265 KOps/s 4.2291 KOps/s $\color{#35bf28}+2.30\%$
test_mod_wrap_and_backward[eager] 16.3924ms 11.8442ms 84.4292 Ops/s 78.6401 Ops/s $\textbf{\color{#35bf28}+7.36\%}$
test_mod_wrap_and_backward[compile] 15.3735ms 12.2355ms 81.7292 Ops/s 81.7660 Ops/s $\color{#d91a1a}-0.04\%$
test_mod_wrap_and_backward[compile-overhead] 14.8030ms 11.5858ms 86.3123 Ops/s 82.6700 Ops/s $\color{#35bf28}+4.41\%$
test_seq_add[eager] 0.1899ms 87.5089μs 11.4274 KOps/s 11.3016 KOps/s $\color{#35bf28}+1.11\%$
test_seq_add[compile] 0.1955ms 64.3918μs 15.5299 KOps/s 15.2215 KOps/s $\color{#35bf28}+2.03\%$
test_seq_add[compile-overhead] 0.1407ms 62.6589μs 15.9594 KOps/s 15.7905 KOps/s $\color{#35bf28}+1.07\%$
test_seq_wrap[eager] 0.7362ms 0.3779ms 2.6463 KOps/s 2.5892 KOps/s $\color{#35bf28}+2.20\%$
test_seq_wrap[compile] 1.5015ms 0.2668ms 3.7484 KOps/s 3.5400 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_seq_wrap[compile-overhead] 1.3187ms 0.2648ms 3.7769 KOps/s 3.6332 KOps/s $\color{#35bf28}+3.96\%$
test_func_call_runtime[False-eager] 0.7509ms 0.5208ms 1.9201 KOps/s 1.8751 KOps/s $\color{#35bf28}+2.40\%$
test_func_call_runtime[False-compile] 0.7187ms 0.4990ms 2.0041 KOps/s 1.9357 KOps/s $\color{#35bf28}+3.53\%$
test_func_call_runtime[False-compile-overhead] 0.8338ms 0.4974ms 2.0107 KOps/s 1.9609 KOps/s $\color{#35bf28}+2.54\%$
test_func_call_runtime[True-eager] 1.2777ms 0.7428ms 1.3462 KOps/s 1.3252 KOps/s $\color{#35bf28}+1.59\%$
test_func_call_runtime[True-compile] 0.7096ms 0.5065ms 1.9745 KOps/s 1.9373 KOps/s $\color{#35bf28}+1.92\%$
test_func_call_runtime[True-compile-overhead] 0.6793ms 0.5076ms 1.9700 KOps/s 1.9237 KOps/s $\color{#35bf28}+2.41\%$
test_func_call_cm_runtime[False-eager] 0.9228ms 0.5213ms 1.9183 KOps/s 1.9056 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_cm_runtime[False-compile] 1.0363ms 0.5042ms 1.9832 KOps/s 1.9688 KOps/s $\color{#35bf28}+0.73\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8377ms 0.5020ms 1.9920 KOps/s 1.9745 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_cm_runtime[True-eager] 1.4415ms 0.8722ms 1.1466 KOps/s 1.1359 KOps/s $\color{#35bf28}+0.94\%$
test_func_call_cm_runtime[True-compile] 1.2872ms 0.7461ms 1.3402 KOps/s 1.3301 KOps/s $\color{#35bf28}+0.76\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0905ms 0.7386ms 1.3539 KOps/s 1.3226 KOps/s $\color{#35bf28}+2.37\%$
test_vmap_func_call_cm_runtime[eager] 2.5370ms 1.8411ms 543.1439 Ops/s 532.2561 Ops/s $\color{#35bf28}+2.05\%$
test_vmap_func_call_cm_runtime[compile] 3.0460ms 1.9009ms 526.0601 Ops/s 518.1085 Ops/s $\color{#35bf28}+1.53\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.7059ms 1.8910ms 528.8144 Ops/s 516.4826 Ops/s $\color{#35bf28}+2.39\%$
test_distributed 0.2246ms 0.1233ms 8.1120 KOps/s 7.8520 KOps/s $\color{#35bf28}+3.31\%$
test_tdmodule 44.6740μs 17.9250μs 55.7881 KOps/s 52.9178 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_tdmodule_dispatch 69.4490μs 35.8692μs 27.8791 KOps/s 27.6317 KOps/s $\color{#35bf28}+0.90\%$
test_tdseq 40.5870μs 20.0546μs 49.8638 KOps/s 48.7751 KOps/s $\color{#35bf28}+2.23\%$
test_tdseq_dispatch 59.2510μs 40.2278μs 24.8585 KOps/s 24.1891 KOps/s $\color{#35bf28}+2.77\%$
test_instantiation_functorch 2.6022ms 1.6088ms 621.5899 Ops/s 607.0915 Ops/s $\color{#35bf28}+2.39\%$
test_instantiation_td 2.3902ms 1.1824ms 845.7432 Ops/s 845.0174 Ops/s $\color{#35bf28}+0.09\%$
test_exec_functorch 0.3131ms 0.1825ms 5.4785 KOps/s 5.4267 KOps/s $\color{#35bf28}+0.95\%$
test_exec_functional_call 0.4205ms 0.1752ms 5.7062 KOps/s 5.7910 KOps/s $\color{#d91a1a}-1.46\%$
test_exec_td 0.3237ms 0.1703ms 5.8733 KOps/s 5.9338 KOps/s $\color{#d91a1a}-1.02\%$
test_exec_td_decorator 1.1356ms 0.2218ms 4.5082 KOps/s 4.4540 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed[True-True] 0.9472ms 0.6314ms 1.5839 KOps/s 1.5225 KOps/s $\color{#35bf28}+4.03\%$
test_vmap_mlp_speed[True-False] 0.9937ms 0.6256ms 1.5984 KOps/s 1.5507 KOps/s $\color{#35bf28}+3.08\%$
test_vmap_mlp_speed[False-True] 0.7499ms 0.4905ms 2.0385 KOps/s 1.9952 KOps/s $\color{#35bf28}+2.17\%$
test_vmap_mlp_speed[False-False] 0.7182ms 0.4932ms 2.0278 KOps/s 1.9938 KOps/s $\color{#35bf28}+1.70\%$
test_vmap_mlp_speed_decorator[True-True] 1.4837ms 0.6139ms 1.6289 KOps/s 1.5968 KOps/s $\color{#35bf28}+2.01\%$
test_vmap_mlp_speed_decorator[True-False] 0.9680ms 0.6215ms 1.6091 KOps/s 1.5786 KOps/s $\color{#35bf28}+1.93\%$
test_vmap_mlp_speed_decorator[False-True] 0.8887ms 0.5026ms 1.9896 KOps/s 1.9348 KOps/s $\color{#35bf28}+2.83\%$
test_vmap_mlp_speed_decorator[False-False] 0.7656ms 0.4979ms 2.0084 KOps/s 1.9406 KOps/s $\color{#35bf28}+3.49\%$
test_to_module_speed[True] 2.4631ms 1.3230ms 755.8759 Ops/s 772.4783 Ops/s $\color{#d91a1a}-2.15\%$
test_to_module_speed[False] 1.5041ms 1.2612ms 792.8863 Ops/s 789.8740 Ops/s $\color{#35bf28}+0.38\%$
test_tc_init 0.1070ms 44.6172μs 22.4129 KOps/s 21.9374 KOps/s $\color{#35bf28}+2.17\%$
test_tc_init_nested 0.1493ms 86.6694μs 11.5381 KOps/s 10.8065 KOps/s $\textbf{\color{#35bf28}+6.77\%}$
test_tc_first_layer_tensor 16.8310μs 1.5170μs 659.2010 KOps/s 659.4432 KOps/s $\color{#d91a1a}-0.04\%$
test_tc_first_layer_nontensor 36.8100μs 4.6456μs 215.2571 KOps/s 212.0520 KOps/s $\color{#35bf28}+1.51\%$
test_tc_second_layer_tensor 36.2380μs 2.7799μs 359.7295 KOps/s 357.3301 KOps/s $\color{#35bf28}+0.67\%$
test_tc_second_layer_nontensor 58.5900μs 5.9995μs 166.6805 KOps/s 166.0062 KOps/s $\color{#35bf28}+0.41\%$
test_unbind 0.4717s 12.9988ms 76.9304 Ops/s 69.1670 Ops/s $\textbf{\color{#35bf28}+11.22\%}$
test_full_like 8.0151ms 7.0196ms 142.4584 Ops/s 135.3939 Ops/s $\textbf{\color{#35bf28}+5.22\%}$
test_zeros_like 4.4917ms 2.7484ms 363.8536 Ops/s 143.4206 Ops/s $\textbf{\color{#35bf28}+153.70\%}$
test_ones_like 3.8826ms 3.1139ms 321.1389 Ops/s 127.0874 Ops/s $\textbf{\color{#35bf28}+152.69\%}$
test_clone 6.1420ms 4.8679ms 205.4275 Ops/s 101.4672 Ops/s $\textbf{\color{#35bf28}+102.46\%}$
test_squeeze 59.6220μs 12.4520μs 80.3084 KOps/s 81.9033 KOps/s $\color{#d91a1a}-1.95\%$
test_unsqueeze 0.1632ms 93.3606μs 10.7112 KOps/s 10.8060 KOps/s $\color{#d91a1a}-0.88\%$
test_split 0.5307ms 0.1959ms 5.1040 KOps/s 5.0579 KOps/s $\color{#35bf28}+0.91\%$
test_permute 0.2892ms 0.2196ms 4.5536 KOps/s 4.4753 KOps/s $\color{#35bf28}+1.75\%$
test_stack 31.4858ms 24.7130ms 40.4646 Ops/s 38.1593 Ops/s $\textbf{\color{#35bf28}+6.04\%}$
test_cat 28.7882ms 24.5501ms 40.7330 Ops/s 38.1309 Ops/s $\textbf{\color{#35bf28}+6.82\%}$
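
For anyone cross-checking the numbers: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD, and Ops appears to be the reciprocal of Mean. A minimal sketch under those assumptions; the helper names `pct_change` and `kops_from_mean_us` are hypothetical and not part of the benchmark tooling:

```python
def pct_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of PR throughput versus repo HEAD, in percent (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


def kops_from_mean_us(mean_us: float) -> float:
    """Throughput in KOps/s implied by a mean latency in microseconds (assumed: Ops = 1 / Mean)."""
    # 1 / (mean_us * 1e-6 s) ops/s == 1e6 / mean_us ops/s == 1e3 / mean_us KOps/s
    return 1e3 / mean_us


# Rows copied from the CPU table above: (name, mean in µs, Ops, Ops on Repo HEAD), Ops in KOps/s.
rows = [
    ("test_plain_set_nested", 20.2651, 49.3459, 48.4511),
    ("test_values_nested", 71.9568, 13.8972, 20.9467),
]

for name, mean_us, ops_pr, ops_head in rows:
    change = pct_change(ops_pr, ops_head)
    implied = kops_from_mean_us(mean_us)
    print(f"{name}: change {change:+.2f}%, implied {implied:.4f} KOps/s")
```

For these two rows the sketch matches the reported +1.85% and -33.65%, which suggests the `test_values_nested*` regressions are a real drop in throughput relative to HEAD rather than a display artifact.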

github-actions bot commented Aug 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}21$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1530ms 14.1174μs 70.8347 KOps/s 65.6860 KOps/s $\textbf{\color{#35bf28}+7.84\%}$
test_plain_set_stack_nested 46.8810μs 14.2379μs 70.2351 KOps/s 65.9045 KOps/s $\textbf{\color{#35bf28}+6.57\%}$
test_plain_set_nested_inplace 53.4310μs 15.1019μs 66.2169 KOps/s 62.1809 KOps/s $\textbf{\color{#35bf28}+6.49\%}$
test_plain_set_stack_nested_inplace 40.6310μs 15.1907μs 65.8299 KOps/s 63.0828 KOps/s $\color{#35bf28}+4.35\%$
test_items 80.9710μs 2.8534μs 350.4531 KOps/s 350.0178 KOps/s $\color{#35bf28}+0.12\%$
test_items_nested 0.3760ms 0.3300ms 3.0307 KOps/s 3.1412 KOps/s $\color{#d91a1a}-3.52\%$
test_items_nested_locked 0.3740ms 0.3325ms 3.0077 KOps/s 3.1355 KOps/s $\color{#d91a1a}-4.08\%$
test_items_nested_leaf 79.7310μs 55.3675μs 18.0611 KOps/s 15.9746 KOps/s $\textbf{\color{#35bf28}+13.06\%}$
test_items_stack_nested 0.4019ms 0.3297ms 3.0329 KOps/s 3.1683 KOps/s $\color{#d91a1a}-4.27\%$
test_items_stack_nested_leaf 0.1163ms 57.4384μs 17.4100 KOps/s 15.5022 KOps/s $\textbf{\color{#35bf28}+12.31\%}$
test_items_stack_nested_locked 0.4009ms 0.3344ms 2.9904 KOps/s 3.1544 KOps/s $\textbf{\color{#d91a1a}-5.20\%}$
test_keys 37.0000μs 3.6434μs 274.4659 KOps/s 296.7852 KOps/s $\textbf{\color{#d91a1a}-7.52\%}$
test_keys_nested 88.6620μs 56.0921μs 17.8278 KOps/s 18.4239 KOps/s $\color{#d91a1a}-3.24\%$
test_keys_nested_locked 2.5425ms 61.9965μs 16.1299 KOps/s 16.5564 KOps/s $\color{#d91a1a}-2.58\%$
test_keys_nested_leaf 82.0210μs 47.4998μs 21.0527 KOps/s 21.5749 KOps/s $\color{#d91a1a}-2.42\%$
test_keys_stack_nested 82.2920μs 56.3919μs 17.7330 KOps/s 18.0807 KOps/s $\color{#d91a1a}-1.92\%$
test_keys_stack_nested_leaf 82.2110μs 47.8557μs 20.8962 KOps/s 21.1712 KOps/s $\color{#d91a1a}-1.30\%$
test_keys_stack_nested_locked 88.9620μs 61.5938μs 16.2354 KOps/s 16.7035 KOps/s $\color{#d91a1a}-2.80\%$
test_values 4.8783μs 0.8549μs 1.1697 MOps/s 1.2206 MOps/s $\color{#d91a1a}-4.17\%$
test_values_nested 82.2020μs 40.8190μs 24.4984 KOps/s 36.3942 KOps/s $\textbf{\color{#d91a1a}-32.69\%}$
test_values_nested_locked 74.4420μs 42.5587μs 23.4970 KOps/s 34.2136 KOps/s $\textbf{\color{#d91a1a}-31.32\%}$
test_values_nested_leaf 60.9010μs 35.5420μs 28.1358 KOps/s 41.4925 KOps/s $\textbf{\color{#d91a1a}-32.19\%}$
test_values_stack_nested 76.6720μs 42.2078μs 23.6923 KOps/s 35.0787 KOps/s $\textbf{\color{#d91a1a}-32.46\%}$
test_values_stack_nested_leaf 72.2920μs 36.4040μs 27.4695 KOps/s 40.1749 KOps/s $\textbf{\color{#d91a1a}-31.63\%}$
test_values_stack_nested_locked 78.3520μs 43.7995μs 22.8313 KOps/s 33.2393 KOps/s $\textbf{\color{#d91a1a}-31.31\%}$
test_membership 1.8835μs 0.5111μs 1.9566 MOps/s 2.0740 MOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_membership_nested 17.3755μs 1.8811μs 531.5937 KOps/s 564.5910 KOps/s $\textbf{\color{#d91a1a}-5.84\%}$
test_membership_nested_leaf 13.3203μs 1.8409μs 543.2226 KOps/s 587.4319 KOps/s $\textbf{\color{#d91a1a}-7.53\%}$
test_membership_stacked_nested 34.5610μs 1.8982μs 526.8119 KOps/s 554.1868 KOps/s $\color{#d91a1a}-4.94\%$
test_membership_stacked_nested_leaf 30.2810μs 1.9333μs 517.2402 KOps/s 547.5074 KOps/s $\textbf{\color{#d91a1a}-5.53\%}$
test_membership_nested_last 29.7100μs 2.6956μs 370.9784 KOps/s 381.2566 KOps/s $\color{#d91a1a}-2.70\%$
test_membership_nested_leaf_last 39.5910μs 2.7161μs 368.1764 KOps/s 384.8899 KOps/s $\color{#d91a1a}-4.34\%$
test_membership_stacked_nested_last 27.3800μs 2.6943μs 371.1495 KOps/s 336.8965 KOps/s $\textbf{\color{#35bf28}+10.17\%}$
test_membership_stacked_nested_leaf_last 25.0800μs 2.7394μs 365.0398 KOps/s 336.1980 KOps/s $\textbf{\color{#35bf28}+8.58\%}$
test_nested_getleaf 40.1300μs 6.0786μs 164.5107 KOps/s 162.9850 KOps/s $\color{#35bf28}+0.94\%$
test_nested_get 31.2800μs 5.7761μs 173.1263 KOps/s 172.7139 KOps/s $\color{#35bf28}+0.24\%$
test_stacked_getleaf 30.5010μs 5.9963μs 166.7699 KOps/s 165.0655 KOps/s $\color{#35bf28}+1.03\%$
test_stacked_get 46.6910μs 5.7061μs 175.2500 KOps/s 172.5914 KOps/s $\color{#35bf28}+1.54\%$
test_nested_getitemleaf 31.8210μs 6.0891μs 164.2277 KOps/s 163.8578 KOps/s $\color{#35bf28}+0.23\%$
test_nested_getitem 27.2110μs 5.7724μs 173.2396 KOps/s 172.7786 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_getitemleaf 36.6110μs 6.0795μs 164.4880 KOps/s 165.2514 KOps/s $\color{#d91a1a}-0.46\%$
test_stacked_getitem 44.0110μs 5.8025μs 172.3385 KOps/s 174.1208 KOps/s $\color{#d91a1a}-1.02\%$
test_lock_nested 5.2305ms 0.4227ms 2.3656 KOps/s 2.3722 KOps/s $\color{#d91a1a}-0.28\%$
test_lock_stack_nested 0.4295ms 0.3833ms 2.6089 KOps/s 2.5945 KOps/s $\color{#35bf28}+0.55\%$
test_unlock_nested 0.7579ms 0.3581ms 2.7924 KOps/s 2.7742 KOps/s $\color{#35bf28}+0.65\%$
test_unlock_stack_nested 0.4128ms 0.3242ms 3.0840 KOps/s 3.0747 KOps/s $\color{#35bf28}+0.31\%$
test_flatten_speed 0.1476ms 68.7699μs 14.5413 KOps/s 12.4736 KOps/s $\textbf{\color{#35bf28}+16.58\%}$
test_unflatten_speed 0.3256ms 0.2826ms 3.5381 KOps/s 3.4959 KOps/s $\color{#35bf28}+1.21\%$
test_common_ops 1.6649ms 1.3847ms 722.1852 Ops/s 758.9500 Ops/s $\color{#d91a1a}-4.84\%$
test_creation 20.3710μs 1.4427μs 693.1354 KOps/s 683.1425 KOps/s $\color{#35bf28}+1.46\%$
test_creation_empty 43.1210μs 15.7091μs 63.6575 KOps/s 56.8434 KOps/s $\textbf{\color{#35bf28}+11.99\%}$
test_creation_nested_1 48.0510μs 17.3989μs 57.4749 KOps/s 51.6559 KOps/s $\textbf{\color{#35bf28}+11.26\%}$
test_creation_nested_2 48.4500μs 20.0085μs 49.9786 KOps/s 45.5805 KOps/s $\textbf{\color{#35bf28}+9.65\%}$
test_clone 72.0910μs 30.2425μs 33.0661 KOps/s 33.2070 KOps/s $\color{#d91a1a}-0.42\%$
test_getitem[int] 1.3162ms 16.1818μs 61.7977 KOps/s 59.2013 KOps/s $\color{#35bf28}+4.39\%$
test_getitem[slice_int] 0.1236ms 28.4910μs 35.0988 KOps/s 35.2161 KOps/s $\color{#d91a1a}-0.33\%$
test_getitem[range] 0.2448ms 0.1116ms 8.9587 KOps/s 9.0053 KOps/s $\color{#d91a1a}-0.52\%$
test_getitem[tuple] 0.1230ms 23.4351μs 42.6711 KOps/s 40.7723 KOps/s $\color{#35bf28}+4.66\%$
test_getitem[list] 0.1979ms 0.1072ms 9.3287 KOps/s 9.9404 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_setitem_dim[int] 82.6510μs 49.4681μs 20.2150 KOps/s 21.6048 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_setitem_dim[slice_int] 0.1192ms 72.1012μs 13.8694 KOps/s 14.4195 KOps/s $\color{#d91a1a}-3.81\%$
test_setitem_dim[range] 0.1654ms 0.1373ms 7.2855 KOps/s 7.6674 KOps/s $\color{#d91a1a}-4.98\%$
test_setitem_dim[tuple] 88.6520μs 65.7101μs 15.2184 KOps/s 15.9302 KOps/s $\color{#d91a1a}-4.47\%$
test_setitem 98.6920μs 47.5761μs 21.0190 KOps/s 22.4722 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_set 95.0010μs 46.2677μs 21.6134 KOps/s 22.6408 KOps/s $\color{#d91a1a}-4.54\%$
test_set_shared 0.3683ms 53.7767μs 18.5954 KOps/s 18.6802 KOps/s $\color{#d91a1a}-0.45\%$
test_update 92.4420μs 50.8639μs 19.6603 KOps/s 18.6739 KOps/s $\textbf{\color{#35bf28}+5.28\%}$
test_update_nested 0.1178ms 58.9108μs 16.9748 KOps/s 16.5390 KOps/s $\color{#35bf28}+2.64\%$
test_update__nested 0.1101ms 67.3624μs 14.8451 KOps/s 16.2382 KOps/s $\textbf{\color{#d91a1a}-8.58\%}$
test_set_nested 93.7010μs 48.7797μs 20.5003 KOps/s 21.1995 KOps/s $\color{#d91a1a}-3.30\%$
test_set_nested_new 99.5220μs 52.4510μs 19.0654 KOps/s 19.2755 KOps/s $\color{#d91a1a}-1.09\%$
test_select 0.1166ms 65.3729μs 15.2969 KOps/s 15.0923 KOps/s $\color{#35bf28}+1.36\%$
test_select_nested 70.3210μs 41.8204μs 23.9118 KOps/s 23.5627 KOps/s $\color{#35bf28}+1.48\%$
test_exclude_nested 99.5320μs 58.0539μs 17.2254 KOps/s 16.8324 KOps/s $\color{#35bf28}+2.33\%$
test_empty[True] 0.2998ms 0.2435ms 4.1064 KOps/s 4.0581 KOps/s $\color{#35bf28}+1.19\%$
test_empty[False] 3.5241μs 0.7433μs 1.3453 MOps/s 1.3452 MOps/s $+0.01\%$
test_to 56.9810μs 25.1881μs 39.7013 KOps/s 39.3460 KOps/s $\color{#35bf28}+0.90\%$
test_to_nonblocking 69.2620μs 24.0306μs 41.6135 KOps/s 39.5710 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_unbind_speed 1.4188ms 0.2854ms 3.5039 KOps/s 3.5808 KOps/s $\color{#d91a1a}-2.15\%$
test_unbind_speed_stack0 0.4366ms 0.2797ms 3.5751 KOps/s 3.4987 KOps/s $\color{#35bf28}+2.18\%$
test_unbind_speed_stack1 93.5476ms 0.7189ms 1.3911 KOps/s 1.3894 KOps/s $\color{#35bf28}+0.12\%$
test_split 94.5612ms 2.1815ms 458.3995 Ops/s 444.3347 Ops/s $\color{#35bf28}+3.17\%$
test_chunk 96.7025ms 2.1824ms 458.2215 Ops/s 440.6806 Ops/s $\color{#35bf28}+3.98\%$
test_creation[device0] 0.3339ms 0.1268ms 7.8853 KOps/s 7.6084 KOps/s $\color{#35bf28}+3.64\%$
test_creation_from_tensor 0.3977ms 0.1298ms 7.7070 KOps/s 7.5201 KOps/s $\color{#35bf28}+2.49\%$
test_add_one[memmap_tensor0] 0.2415ms 9.0500μs 110.4977 KOps/s 113.0787 KOps/s $\color{#d91a1a}-2.28\%$
test_contiguous[memmap_tensor0] 30.5510μs 2.2177μs 450.9241 KOps/s 448.7359 KOps/s $\color{#35bf28}+0.49\%$
test_stack[memmap_tensor0] 31.5710μs 7.0017μs 142.8224 KOps/s 147.9763 KOps/s $\color{#d91a1a}-3.48\%$
test_memmaptd_index 1.2102ms 0.4270ms 2.3417 KOps/s 2.3262 KOps/s $\color{#35bf28}+0.67\%$
test_memmaptd_index_astensor 0.9770ms 0.4839ms 2.0667 KOps/s 2.0484 KOps/s $\color{#35bf28}+0.89\%$
test_memmaptd_index_op 1.4175ms 1.0480ms 954.1843 Ops/s 941.4203 Ops/s $\color{#35bf28}+1.36\%$
test_serialize_model 0.1303s 0.1294s 7.7273 Ops/s 7.6995 Ops/s $\color{#35bf28}+0.36\%$
test_serialize_model_pickle 1.3502s 1.2125s 0.8248 Ops/s 0.8214 Ops/s $\color{#35bf28}+0.41\%$
test_serialize_weights 0.2224s 0.1421s 7.0385 Ops/s 7.7314 Ops/s $\textbf{\color{#d91a1a}-8.96\%}$
test_serialize_weights_returnearly 0.2229s 55.5951ms 17.9872 Ops/s 16.3361 Ops/s $\textbf{\color{#35bf28}+10.11\%}$
test_serialize_weights_pickle 1.3723s 1.2164s 0.8221 Ops/s 0.8220 Ops/s $\color{#35bf28}+0.01\%$
test_reshape_pytree 79.5020μs 36.7551μs 27.2071 KOps/s 27.1076 KOps/s $\color{#35bf28}+0.37\%$
test_reshape_td 97.2120μs 43.6764μs 22.8957 KOps/s 23.4041 KOps/s $\color{#d91a1a}-2.17\%$
test_view_pytree 65.2020μs 36.0028μs 27.7756 KOps/s 26.7105 KOps/s $\color{#35bf28}+3.99\%$
test_view_td 95.4920μs 47.7966μs 20.9220 KOps/s 21.0489 KOps/s $\color{#d91a1a}-0.60\%$
test_unbind_pytree 76.5710μs 35.2698μs 28.3528 KOps/s 27.9371 KOps/s $\color{#35bf28}+1.49\%$
test_unbind_td 0.5490ms 43.7074μs 22.8794 KOps/s 22.7683 KOps/s $\color{#35bf28}+0.49\%$
test_split_pytree 0.6156ms 47.2510μs 21.1636 KOps/s 20.3941 KOps/s $\color{#35bf28}+3.77\%$
test_split_td 0.1487ms 55.4134μs 18.0462 KOps/s 17.1814 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_add_pytree 0.1066ms 58.1550μs 17.1954 KOps/s 17.0722 KOps/s $\color{#35bf28}+0.72\%$
test_add_td 0.1605ms 92.6548μs 10.7927 KOps/s 10.4257 KOps/s $\color{#35bf28}+3.52\%$
test_compile_add_one_nested[tensordict-compile] 0.4268ms 0.2140ms 4.6738 KOps/s 4.5507 KOps/s $\color{#35bf28}+2.70\%$
test_compile_add_one_nested[tensordict-eager] 0.2271ms 0.1526ms 6.5520 KOps/s 6.2440 KOps/s $\color{#35bf28}+4.93\%$
test_compile_add_one_nested[pytree-compile] 0.2191ms 0.1480ms 6.7573 KOps/s 6.8466 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_add_one_nested[pytree-eager] 0.2420ms 0.1864ms 5.3641 KOps/s 5.1837 KOps/s $\color{#35bf28}+3.48\%$
test_compile_copy_nested[tensordict-compile] 64.9920μs 21.9681μs 45.5206 KOps/s 46.4973 KOps/s $\color{#d91a1a}-2.10\%$
test_compile_copy_nested[tensordict-eager] 84.3520μs 44.5834μs 22.4299 KOps/s 22.6021 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_copy_nested[pytree-compile] 0.1976ms 63.5723μs 15.7301 KOps/s 15.8129 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_copy_nested[pytree-eager] 88.0520μs 49.6266μs 20.1505 KOps/s 20.4114 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_add_one_flat[tensordict-compile] 0.3605ms 0.3265ms 3.0624 KOps/s 3.1030 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_add_one_flat[tensordict-eager] 0.3350ms 0.2108ms 4.7435 KOps/s 4.7771 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_add_one_flat[tensorclass-compile] 0.1657ms 0.1301ms 7.6890 KOps/s 7.7219 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_add_one_flat[tensorclass-eager] 0.1387ms 60.7001μs 16.4744 KOps/s 16.3064 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_one_flat[pytree-compile] 0.3862ms 0.3239ms 3.0876 KOps/s 3.1017 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_one_flat[pytree-eager] 0.6951ms 0.6378ms 1.5678 KOps/s 1.5712 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_self_flat[tensordict-eager] 0.3073ms 0.2499ms 4.0023 KOps/s 3.9970 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_self_flat[tensordict-compile] 0.4165ms 0.3254ms 3.0734 KOps/s 3.0930 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_add_self_flat[tensorclass-eager] 0.1586ms 70.8631μs 14.1117 KOps/s 13.6852 KOps/s $\color{#35bf28}+3.12\%$
test_compile_add_self_flat[tensorclass-compile] 0.2043ms 0.1312ms 7.6206 KOps/s 7.6002 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_self_flat[pytree-eager] 0.7273ms 0.5446ms 1.8360 KOps/s 1.8137 KOps/s $\color{#35bf28}+1.23\%$
test_compile_add_self_flat[pytree-compile] 0.3910ms 0.3237ms 3.0888 KOps/s 3.0973 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_copy_flat[tensordict-compile] 57.0110μs 18.4401μs 54.2298 KOps/s 55.9798 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_copy_flat[tensordict-eager] 79.7910μs 27.0998μs 36.9007 KOps/s 37.4042 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_copy_flat[pytree-compile] 0.1063ms 69.8313μs 14.3202 KOps/s 14.3275 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_flat[pytree-eager] 80.4820μs 51.3471μs 19.4753 KOps/s 19.2337 KOps/s $\color{#35bf28}+1.26\%$
test_compile_assign_and_add[tensordict-compile] 2.3458ms 0.8340ms 1.1990 KOps/s 1.1240 KOps/s $\textbf{\color{#35bf28}+6.68\%}$
test_compile_assign_and_add[tensordict-eager] 3.3057ms 3.2078ms 311.7359 Ops/s 310.5573 Ops/s $\color{#35bf28}+0.38\%$
test_compile_assign_and_add[pytree-compile] 2.3446ms 0.8300ms 1.2048 KOps/s 1.1390 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_compile_assign_and_add[pytree-eager] 3.4533ms 3.2461ms 308.0589 Ops/s 304.9034 Ops/s $\color{#35bf28}+1.03\%$
test_compile_indexing[tensor-tensordict-compile] 0.1693ms 0.1157ms 8.6452 KOps/s 9.1524 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_compile_indexing[tensor-tensordict-eager] 0.1952ms 64.0780μs 15.6060 KOps/s 15.3625 KOps/s $\color{#35bf28}+1.59\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1858ms 0.1091ms 9.1673 KOps/s 9.5089 KOps/s $\color{#d91a1a}-3.59\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1146ms 46.1190μs 21.6831 KOps/s 22.2570 KOps/s $\color{#d91a1a}-2.58\%$
test_compile_indexing[tensor-pytree-compile] 0.1652ms 0.1103ms 9.0702 KOps/s 9.4296 KOps/s $\color{#d91a1a}-3.81\%$
test_compile_indexing[tensor-pytree-eager] 0.1344ms 46.6313μs 21.4448 KOps/s 22.3199 KOps/s $\color{#d91a1a}-3.92\%$
test_compile_indexing[slice-tensordict-compile] 0.1837ms 0.1401ms 7.1367 KOps/s 7.1768 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[slice-tensordict-eager] 0.1526ms 25.1668μs 39.7348 KOps/s 39.6013 KOps/s $\color{#35bf28}+0.34\%$
test_compile_indexing[slice-tensorclass-compile] 0.1947ms 0.1385ms 7.2228 KOps/s 7.5031 KOps/s $\color{#d91a1a}-3.73\%$
test_compile_indexing[slice-tensorclass-eager] 80.8010μs 21.0313μs 47.5482 KOps/s 46.3597 KOps/s $\color{#35bf28}+2.56\%$
test_compile_indexing[slice-pytree-compile] 0.1923ms 0.1359ms 7.3585 KOps/s 7.4061 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_indexing[slice-pytree-eager] 63.7010μs 20.9698μs 47.6876 KOps/s 46.2657 KOps/s $\color{#35bf28}+3.07\%$
test_compile_indexing[int-tensordict-compile] 0.1829ms 0.1416ms 7.0635 KOps/s 7.1321 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_indexing[int-tensordict-eager] 0.5016ms 25.3727μs 39.4125 KOps/s 39.3310 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[int-tensorclass-compile] 0.2313ms 0.1369ms 7.3042 KOps/s 7.4671 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_indexing[int-tensorclass-eager] 0.1267ms 23.6764μs 42.2362 KOps/s 46.3646 KOps/s $\textbf{\color{#d91a1a}-8.90\%}$
test_compile_indexing[int-pytree-compile] 0.2024ms 0.1353ms 7.3916 KOps/s 7.2368 KOps/s $\color{#35bf28}+2.14\%$
test_compile_indexing[int-pytree-eager] 76.4010μs 21.3699μs 46.7948 KOps/s 45.3328 KOps/s $\color{#35bf28}+3.23\%$
test_mod_add[eager] 89.7220μs 35.2849μs 28.3408 KOps/s 27.4433 KOps/s $\color{#35bf28}+3.27\%$
test_mod_add[compile] 0.2375ms 69.1697μs 14.4572 KOps/s 13.6593 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_mod_add[compile-overhead] 0.2613ms 0.1359ms 7.3607 KOps/s 6.6388 KOps/s $\textbf{\color{#35bf28}+10.87\%}$
test_mod_wrap[eager] 0.3568ms 0.2552ms 3.9187 KOps/s 3.8714 KOps/s $\color{#35bf28}+1.22\%$
test_mod_wrap[compile] 0.3616ms 0.2946ms 3.3948 KOps/s 3.3165 KOps/s $\color{#35bf28}+2.36\%$
test_mod_wrap[compile-overhead] 7.6058ms 4.0684ms 245.7952 Ops/s 248.2664 Ops/s $\color{#d91a1a}-1.00\%$
test_mod_wrap_and_backward[eager] 1.5439ms 1.4667ms 681.7817 Ops/s 679.5044 Ops/s $\color{#35bf28}+0.34\%$
test_mod_wrap_and_backward[compile] 1.5614ms 1.3223ms 756.2562 Ops/s 695.9460 Ops/s $\textbf{\color{#35bf28}+8.67\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3340ms 0.9026ms 1.1079 KOps/s 983.1183 Ops/s $\textbf{\color{#35bf28}+12.69\%}$
test_seq_add[eager] 0.1552ms 0.1017ms 9.8352 KOps/s 9.6708 KOps/s $\color{#35bf28}+1.70\%$
test_seq_add[compile] 0.1463ms 82.8238μs 12.0738 KOps/s 12.3115 KOps/s $\color{#d91a1a}-1.93\%$
test_seq_add[compile-overhead] 0.1555ms 0.1157ms 8.6403 KOps/s 8.5804 KOps/s $\color{#35bf28}+0.70\%$
test_seq_wrap[eager] 0.4628ms 0.3866ms 2.5866 KOps/s 2.4064 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_seq_wrap[compile] 0.3872ms 0.3207ms 3.1182 KOps/s 3.0036 KOps/s $\color{#35bf28}+3.81\%$
test_seq_wrap[compile-overhead] 0.3296ms 0.2232ms 4.4795 KOps/s 4.4891 KOps/s $\color{#d91a1a}-0.22\%$
test_func_call_runtime[False-eager] 0.9562ms 0.7533ms 1.3275 KOps/s 1.3062 KOps/s $\color{#35bf28}+1.63\%$
test_func_call_runtime[False-compile] 0.8465ms 0.7931ms 1.2609 KOps/s 1.2407 KOps/s $\color{#35bf28}+1.63\%$
test_func_call_runtime[False-compile-overhead] 0.4097ms 0.3625ms 2.7586 KOps/s 2.7402 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_runtime[True-eager] 0.9757ms 0.9052ms 1.1047 KOps/s 1.0781 KOps/s $\color{#35bf28}+2.47\%$
test_func_call_runtime[True-compile] 0.8780ms 0.8296ms 1.2054 KOps/s 1.1949 KOps/s $\color{#35bf28}+0.87\%$
test_func_call_runtime[True-compile-overhead] 0.5140ms 0.4011ms 2.4931 KOps/s 2.5255 KOps/s $\color{#d91a1a}-1.28\%$
test_func_call_cm_runtime[False-eager] 0.8485ms 0.7436ms 1.3448 KOps/s 1.2996 KOps/s $\color{#35bf28}+3.48\%$
test_func_call_cm_runtime[False-compile] 0.8646ms 0.7965ms 1.2554 KOps/s 1.2379 KOps/s $\color{#35bf28}+1.42\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4314ms 0.3678ms 2.7188 KOps/s 2.7316 KOps/s $\color{#d91a1a}-0.47\%$
test_func_call_cm_runtime[True-eager] 1.1163ms 1.0102ms 989.9001 Ops/s 971.6330 Ops/s $\color{#35bf28}+1.88\%$
test_func_call_cm_runtime[True-compile] 0.9660ms 0.8628ms 1.1591 KOps/s 1.1603 KOps/s $\color{#d91a1a}-0.10\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4753ms 0.4218ms 2.3706 KOps/s 2.3716 KOps/s $\color{#d91a1a}-0.04\%$
test_vmap_func_call_cm_runtime[eager] 2.5595ms 2.0707ms 482.9378 Ops/s 475.0028 Ops/s $\color{#35bf28}+1.67\%$
test_vmap_func_call_cm_runtime[compile] 0.9690ms 0.8714ms 1.1476 KOps/s 1.1329 KOps/s $\color{#35bf28}+1.30\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4746ms 0.4294ms 2.3291 KOps/s 2.3497 KOps/s $\color{#d91a1a}-0.88\%$
test_distributed 2.1365ms 0.2300ms 4.3477 KOps/s 8.8904 KOps/s $\textbf{\color{#d91a1a}-51.10\%}$
test_tdmodule 48.8910μs 15.5873μs 64.1547 KOps/s 63.1715 KOps/s $\color{#35bf28}+1.56\%$
test_tdmodule_dispatch 59.0020μs 30.7902μs 32.4778 KOps/s 31.6663 KOps/s $\color{#35bf28}+2.56\%$
test_tdseq 37.8310μs 15.9854μs 62.5570 KOps/s 60.3447 KOps/s $\color{#35bf28}+3.67\%$
test_tdseq_dispatch 54.0120μs 31.8468μs 31.4003 KOps/s 29.4431 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_instantiation_functorch 2.1140ms 1.8764ms 532.9425 Ops/s 529.8515 Ops/s $\color{#35bf28}+0.58\%$
test_instantiation_td 1.8326ms 1.2046ms 830.1245 Ops/s 823.1953 Ops/s $\color{#35bf28}+0.84\%$
test_exec_functorch 0.2682ms 0.2121ms 4.7138 KOps/s 4.6673 KOps/s $\color{#35bf28}+1.00\%$
test_exec_functional_call 0.2550ms 0.2109ms 4.7405 KOps/s 4.6726 KOps/s $\color{#35bf28}+1.45\%$
test_exec_td 0.3271ms 0.2177ms 4.5932 KOps/s 4.5419 KOps/s $\color{#35bf28}+1.13\%$
test_exec_td_decorator 0.9893ms 0.2574ms 3.8843 KOps/s 3.7879 KOps/s $\color{#35bf28}+2.54\%$
test_vmap_mlp_speed[True-True] 0.8111ms 0.6894ms 1.4506 KOps/s 1.4377 KOps/s $\color{#35bf28}+0.89\%$
test_vmap_mlp_speed[True-False] 0.8622ms 0.7140ms 1.4005 KOps/s 1.4374 KOps/s $\color{#d91a1a}-2.56\%$
test_vmap_mlp_speed[False-True] 0.6776ms 0.5750ms 1.7390 KOps/s 1.7170 KOps/s $\color{#35bf28}+1.28\%$
test_vmap_mlp_speed[False-False] 0.6235ms 0.5749ms 1.7393 KOps/s 1.7123 KOps/s $\color{#35bf28}+1.58\%$
test_vmap_mlp_speed_decorator[True-True] 1.3755ms 0.6703ms 1.4919 KOps/s 1.4691 KOps/s $\color{#35bf28}+1.55\%$
test_vmap_mlp_speed_decorator[True-False] 0.8036ms 0.6686ms 1.4957 KOps/s 1.4677 KOps/s $\color{#35bf28}+1.91\%$
test_vmap_mlp_speed_decorator[False-True] 0.6983ms 0.5869ms 1.7040 KOps/s 1.6775 KOps/s $\color{#35bf28}+1.58\%$
test_vmap_mlp_speed_decorator[False-False] 0.7089ms 0.5891ms 1.6975 KOps/s 1.6771 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_transformer_speed[True-True] 8.5556ms 8.4235ms 118.7157 Ops/s 117.8168 Ops/s $\color{#35bf28}+0.76\%$
test_vmap_transformer_speed[True-False] 8.8198ms 8.4365ms 118.5327 Ops/s 117.2391 Ops/s $\color{#35bf28}+1.10\%$
test_vmap_transformer_speed[False-True] 8.6665ms 8.2720ms 120.8901 Ops/s 120.4641 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_transformer_speed[False-False] 8.6090ms 8.2300ms 121.5066 Ops/s 120.5777 Ops/s $\color{#35bf28}+0.77\%$
test_vmap_transformer_speed_decorator[True-True] 20.2737ms 19.7200ms 50.7099 Ops/s 50.4580 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed_decorator[True-False] 20.5164ms 19.6288ms 50.9455 Ops/s 50.4331 Ops/s $\color{#35bf28}+1.02\%$
test_vmap_transformer_speed_decorator[False-True] 19.6287ms 19.4993ms 51.2838 Ops/s 50.8736 Ops/s $\color{#35bf28}+0.81\%$
test_vmap_transformer_speed_decorator[False-False] 20.7911ms 19.5262ms 51.2133 Ops/s 50.7513 Ops/s $\color{#35bf28}+0.91\%$
test_to_module_speed[True] 1.4670ms 0.9469ms 1.0561 KOps/s 1.0741 KOps/s $\color{#d91a1a}-1.68\%$
test_to_module_speed[False] 1.3088ms 0.9155ms 1.0923 KOps/s 1.0988 KOps/s $\color{#d91a1a}-0.59\%$
test_tc_init 65.9910μs 34.5320μs 28.9587 KOps/s 27.7234 KOps/s $\color{#35bf28}+4.46\%$
test_tc_init_nested 0.1184ms 69.1099μs 14.4697 KOps/s 13.9022 KOps/s $\color{#35bf28}+4.08\%$
test_tc_first_layer_tensor 7.2116μs 0.6644μs 1.5052 MOps/s 1.4586 MOps/s $\color{#35bf28}+3.19\%$
test_tc_first_layer_nontensor 33.8700μs 2.2039μs 453.7377 KOps/s 441.2389 KOps/s $\color{#35bf28}+2.83\%$
test_tc_second_layer_tensor 30.9533μs 1.3768μs 726.3118 KOps/s 736.0129 KOps/s $\color{#d91a1a}-1.32\%$
test_tc_second_layer_nontensor 24.6710μs 2.9445μs 339.6155 KOps/s 334.9609 KOps/s $\color{#35bf28}+1.39\%$
test_unbind 0.1958s 12.1862ms 82.0603 Ops/s 93.5008 Ops/s $\textbf{\color{#d91a1a}-12.24\%}$
test_full_like 0.6557ms 0.5737ms 1.7431 KOps/s 1.7312 KOps/s $\color{#35bf28}+0.68\%$
test_zeros_like 0.2772ms 0.1979ms 5.0533 KOps/s 5.0531 KOps/s $+0.00\%$
test_ones_like 0.2330ms 0.1977ms 5.0593 KOps/s 5.0567 KOps/s $\color{#35bf28}+0.05\%$
test_clone 0.4446ms 0.4145ms 2.4123 KOps/s 2.4119 KOps/s $\color{#35bf28}+0.02\%$
test_squeeze 38.6910μs 10.0831μs 99.1761 KOps/s 99.6271 KOps/s $\color{#d91a1a}-0.45\%$
test_unsqueeze 0.2211ms 75.8730μs 13.1799 KOps/s 13.4675 KOps/s $\color{#d91a1a}-2.14\%$
test_split 0.4399ms 0.1582ms 6.3227 KOps/s 6.2007 KOps/s $\color{#35bf28}+1.97\%$
test_permute 0.2212ms 0.1736ms 5.7612 KOps/s 5.6049 KOps/s $\color{#35bf28}+2.79\%$
test_stack 1.2529ms 0.8636ms 1.1580 KOps/s 1.1811 KOps/s $\color{#d91a1a}-1.96\%$
test_cat 1.2540ms 1.2316ms 811.9620 Ops/s 811.7120 Ops/s $\color{#35bf28}+0.03\%$
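
For readers checking the table by hand: the Change column appears to be the relative difference between the "Ops" value on this PR and the "Ops on Repo HEAD" value, and "Ops" appears to be the reciprocal of the mean time. Both are inferred from the numbers in the rows above, not from the benchmark bot's source. The function names below are hypothetical and only illustrate the arithmetic.

```python
# Minimal sketch of how the Change column can be reproduced from a table row.
# Assumption: change = (ops_pr - ops_head) / ops_head, expressed in percent.
# The function names are illustrative and do not come from the benchmark tooling.


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Return the percentage change of ops_pr relative to ops_head."""
    return (ops_pr - ops_head) / ops_head * 100.0


def ops_from_mean(mean_seconds: float) -> float:
    """Return operations per second for a given mean time in seconds."""
    return 1.0 / mean_seconds


def format_change(ops_pr: float, ops_head: float) -> str:
    """Format the change with an explicit sign and two decimals, as in the table."""
    return f"{relative_change(ops_pr, ops_head):+.2f}%"


if __name__ == "__main__":
    # test_distributed row: 4.3477 KOps/s on the PR vs 8.8904 KOps/s on HEAD.
    print(format_change(4.3477, 8.8904))
    # test_unbind row: 82.0603 Ops/s on the PR vs 93.5008 Ops/s on HEAD.
    print(format_change(82.0603, 93.5008))
    # test_distributed mean of 0.2300 ms corresponds to roughly 4.35 KOps/s.
    print(ops_from_mean(0.2300e-3))
```

Both operands must be in the same unit before comparing: a row such as `test_mod_wrap_and_backward[compile-overhead]` mixes KOps/s and Ops/s, so one side has to be rescaled first.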

@vmoens
Contributor Author

vmoens commented Aug 13, 2024

closes #960

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 9, 2024
ghstack-source-id: f17c5474b1f55a1935e57fc857ed05eeee890057
Pull Request resolved: #965
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: 79d007d309ed933ed0c2023f18d9598969f2e5bb
Pull Request resolved: #965
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: af5b112bc5a3dbabf5c490026f883bac89a6a052
Pull Request resolved: #965
@vmoens vmoens added the enhancement New feature or request label Sep 16, 2024
@vmoens vmoens merged commit fb016b7 into gh/vmoens/10/base Sep 16, 2024
42 of 46 checks passed
@vmoens vmoens deleted the gh/vmoens/10/head branch September 16, 2024 23:18