[BugFix] Deserializing a consolidated TD reproduces a consolidated TD #1019

Merged: 3 commits, Oct 3, 2024


vmoens (Contributor) commented Oct 2, 2024

[ghstack-poisoned]
facebook-github-bot added the CLA Signed label Oct 2, 2024

github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}18$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 59.0500μs 25.2437μs 39.6139 KOps/s 40.6596 KOps/s $\color{#d91a1a}-2.57\%$
test_plain_set_stack_nested 84.5680μs 25.6750μs 38.9484 KOps/s 39.8994 KOps/s $\color{#d91a1a}-2.38\%$
test_plain_set_nested_inplace 60.5030μs 27.7147μs 36.0819 KOps/s 37.1952 KOps/s $\color{#d91a1a}-2.99\%$
test_plain_set_stack_nested_inplace 67.3350μs 27.8459μs 35.9119 KOps/s 37.0768 KOps/s $\color{#d91a1a}-3.14\%$
test_items 24.8860μs 4.2378μs 235.9736 KOps/s 239.1530 KOps/s $\color{#d91a1a}-1.33\%$
test_items_nested 0.4829ms 0.3810ms 2.6248 KOps/s 2.4793 KOps/s $\textbf{\color{#35bf28}+5.87\%}$
test_items_nested_locked 0.7188ms 0.3848ms 2.5989 KOps/s 2.6149 KOps/s $\color{#d91a1a}-0.61\%$
test_items_nested_leaf 0.1707ms 80.8330μs 12.3712 KOps/s 12.4399 KOps/s $\color{#d91a1a}-0.55\%$
test_items_stack_nested 0.7212ms 0.3871ms 2.5835 KOps/s 2.4960 KOps/s $\color{#35bf28}+3.50\%$
test_items_stack_nested_leaf 0.1806ms 84.6478μs 11.8137 KOps/s 11.7669 KOps/s $\color{#35bf28}+0.40\%$
test_items_stack_nested_locked 0.8796ms 0.3947ms 2.5337 KOps/s 2.5621 KOps/s $\color{#d91a1a}-1.11\%$
test_keys 22.6530μs 3.4920μs 286.3668 KOps/s 288.3002 KOps/s $\color{#d91a1a}-0.67\%$
test_keys_nested 0.2544ms 0.1345ms 7.4348 KOps/s 7.3307 KOps/s $\color{#35bf28}+1.42\%$
test_keys_nested_locked 1.7004ms 0.1390ms 7.1961 KOps/s 7.0542 KOps/s $\color{#35bf28}+2.01\%$
test_keys_nested_leaf 0.2114ms 0.1176ms 8.5013 KOps/s 8.3779 KOps/s $\color{#35bf28}+1.47\%$
test_keys_stack_nested 0.2621ms 0.1349ms 7.4143 KOps/s 7.4429 KOps/s $\color{#d91a1a}-0.38\%$
test_keys_stack_nested_leaf 0.2548ms 0.1183ms 8.4534 KOps/s 8.5482 KOps/s $\color{#d91a1a}-1.11\%$
test_keys_stack_nested_locked 0.2647ms 0.1397ms 7.1578 KOps/s 6.9916 KOps/s $\color{#35bf28}+2.38\%$
test_values 8.2454μs 1.0495μs 952.8735 KOps/s 956.5693 KOps/s $\color{#d91a1a}-0.39\%$
test_values_nested 0.1619ms 94.5524μs 10.5762 KOps/s 10.6399 KOps/s $\color{#d91a1a}-0.60\%$
test_values_nested_locked 0.1773ms 94.5683μs 10.5744 KOps/s 10.6489 KOps/s $\color{#d91a1a}-0.70\%$
test_values_nested_leaf 0.1331ms 81.0676μs 12.3354 KOps/s 12.5583 KOps/s $\color{#d91a1a}-1.77\%$
test_values_stack_nested 0.2103ms 94.8277μs 10.5454 KOps/s 10.5714 KOps/s $\color{#d91a1a}-0.25\%$
test_values_stack_nested_leaf 0.1528ms 82.0013μs 12.1949 KOps/s 12.6633 KOps/s $\color{#d91a1a}-3.70\%$
test_values_stack_nested_locked 0.1834ms 94.9495μs 10.5319 KOps/s 10.6312 KOps/s $\color{#d91a1a}-0.93\%$
test_membership 11.4920μs 0.8847μs 1.1303 MOps/s 1.1335 MOps/s $\color{#d91a1a}-0.28\%$
test_membership_nested 24.2650μs 2.8039μs 356.6401 KOps/s 365.6378 KOps/s $\color{#d91a1a}-2.46\%$
test_membership_nested_leaf 18.9850μs 2.7493μs 363.7224 KOps/s 361.8543 KOps/s $\color{#35bf28}+0.52\%$
test_membership_stacked_nested 35.9180μs 2.7506μs 363.5526 KOps/s 372.0569 KOps/s $\color{#d91a1a}-2.29\%$
test_membership_stacked_nested_leaf 19.6670μs 2.7674μs 361.3436 KOps/s 363.1602 KOps/s $\color{#d91a1a}-0.50\%$
test_membership_nested_last 26.5700μs 4.2236μs 236.7667 KOps/s 237.6647 KOps/s $\color{#d91a1a}-0.38\%$
test_membership_nested_leaf_last 22.8430μs 4.2362μs 236.0623 KOps/s 236.4647 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_stacked_nested_last 27.1400μs 4.2089μs 237.5913 KOps/s 202.1277 KOps/s $\textbf{\color{#35bf28}+17.55\%}$
test_membership_stacked_nested_leaf_last 25.4870μs 4.1963μs 238.3056 KOps/s 200.7919 KOps/s $\textbf{\color{#35bf28}+18.68\%}$
test_nested_getleaf 34.8950μs 10.5320μs 94.9483 KOps/s 92.4410 KOps/s $\color{#35bf28}+2.71\%$
test_nested_get 31.1580μs 9.8692μs 101.3249 KOps/s 97.8560 KOps/s $\color{#35bf28}+3.54\%$
test_stacked_getleaf 47.2580μs 10.3961μs 96.1902 KOps/s 93.1325 KOps/s $\color{#35bf28}+3.28\%$
test_stacked_get 30.5270μs 9.9380μs 100.6235 KOps/s 98.4349 KOps/s $\color{#35bf28}+2.22\%$
test_nested_getitemleaf 34.0940μs 10.9460μs 91.3572 KOps/s 87.8147 KOps/s $\color{#35bf28}+4.03\%$
test_nested_getitem 42.1090μs 10.2798μs 97.2785 KOps/s 94.5554 KOps/s $\color{#35bf28}+2.88\%$
test_stacked_getitemleaf 38.4720μs 11.0118μs 90.8119 KOps/s 89.2465 KOps/s $\color{#35bf28}+1.75\%$
test_stacked_getitem 51.3590μs 10.0426μs 99.5760 KOps/s 96.6958 KOps/s $\color{#35bf28}+2.98\%$
test_lock_nested 92.3414ms 0.6084ms 1.6438 KOps/s 1.9744 KOps/s $\textbf{\color{#d91a1a}-16.75\%}$
test_lock_stack_nested 0.6091ms 0.4824ms 2.0731 KOps/s 2.1271 KOps/s $\color{#d91a1a}-2.54\%$
test_unlock_nested 94.5816ms 0.5305ms 1.8850 KOps/s 2.3693 KOps/s $\textbf{\color{#d91a1a}-20.44\%}$
test_unlock_stack_nested 0.6878ms 0.3961ms 2.5246 KOps/s 2.5963 KOps/s $\color{#d91a1a}-2.76\%$
test_flatten_speed 0.2155ms 0.1028ms 9.7271 KOps/s 9.9148 KOps/s $\color{#d91a1a}-1.89\%$
test_unflatten_speed 1.0408ms 0.5251ms 1.9042 KOps/s 1.9196 KOps/s $\color{#d91a1a}-0.80\%$
test_common_ops 3.7642ms 1.2029ms 831.3564 Ops/s 883.3843 Ops/s $\textbf{\color{#d91a1a}-5.89\%}$
test_creation 21.8900μs 2.0905μs 478.3546 KOps/s 468.6534 KOps/s $\color{#35bf28}+2.07\%$
test_creation_empty 59.3700μs 20.6645μs 48.3923 KOps/s 54.3064 KOps/s $\textbf{\color{#d91a1a}-10.89\%}$
test_creation_nested_1 67.6360μs 22.7572μs 43.9421 KOps/s 46.5487 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_creation_nested_2 69.3790μs 28.0882μs 35.6022 KOps/s 38.2196 KOps/s $\textbf{\color{#d91a1a}-6.85\%}$
test_clone 0.1288ms 16.7101μs 59.8442 KOps/s 59.2513 KOps/s $\color{#35bf28}+1.00\%$
test_getitem[int] 1.3485ms 16.9369μs 59.0426 KOps/s 60.3939 KOps/s $\color{#d91a1a}-2.24\%$
test_getitem[slice_int] 0.1415ms 30.3537μs 32.9449 KOps/s 32.3013 KOps/s $\color{#35bf28}+1.99\%$
test_getitem[range] 0.3005ms 58.2066μs 17.1802 KOps/s 16.7597 KOps/s $\color{#35bf28}+2.51\%$
test_getitem[tuple] 0.1331ms 25.4545μs 39.2858 KOps/s 40.1834 KOps/s $\color{#d91a1a}-2.23\%$
test_getitem[list] 0.2483ms 54.2026μs 18.4493 KOps/s 18.5125 KOps/s $\color{#d91a1a}-0.34\%$
test_setitem_dim[int] 66.2030μs 33.6136μs 29.7498 KOps/s 31.3030 KOps/s $\color{#d91a1a}-4.96\%$
test_setitem_dim[slice_int] 0.1211ms 61.6899μs 16.2101 KOps/s 16.1250 KOps/s $\color{#35bf28}+0.53\%$
test_setitem_dim[range] 0.1245ms 85.6428μs 11.6764 KOps/s 11.6761 KOps/s $+0.00\%$
test_setitem_dim[tuple] 0.1151ms 49.6002μs 20.1612 KOps/s 20.3469 KOps/s $\color{#d91a1a}-0.91\%$
test_setitem 0.3345ms 31.1846μs 32.0671 KOps/s 33.6098 KOps/s $\color{#d91a1a}-4.59\%$
test_set 0.2464ms 29.7124μs 33.6560 KOps/s 33.5810 KOps/s $\color{#35bf28}+0.22\%$
test_set_shared 3.8888ms 0.2193ms 4.5589 KOps/s 4.6096 KOps/s $\color{#d91a1a}-1.10\%$
test_update 0.1539ms 39.0935μs 25.5797 KOps/s 26.6731 KOps/s $\color{#d91a1a}-4.10\%$
test_update_nested 0.1734ms 51.1274μs 19.5590 KOps/s 20.4796 KOps/s $\color{#d91a1a}-4.50\%$
test_update__nested 0.1257ms 37.8373μs 26.4290 KOps/s 26.5791 KOps/s $\color{#d91a1a}-0.57\%$
test_set_nested 0.1905ms 33.7479μs 29.6314 KOps/s 30.8000 KOps/s $\color{#d91a1a}-3.79\%$
test_set_nested_new 94.5670μs 39.1500μs 25.5428 KOps/s 26.7531 KOps/s $\color{#d91a1a}-4.52\%$
test_select 0.1138ms 55.0728μs 18.1578 KOps/s 17.9496 KOps/s $\color{#35bf28}+1.16\%$
test_select_nested 0.1466ms 62.0866μs 16.1065 KOps/s 16.7524 KOps/s $\color{#d91a1a}-3.86\%$
test_exclude_nested 0.1427ms 76.0495μs 13.1493 KOps/s 13.1429 KOps/s $\color{#35bf28}+0.05\%$
test_empty[True] 0.5005ms 0.3502ms 2.8551 KOps/s 2.8382 KOps/s $\color{#35bf28}+0.59\%$
test_empty[False] 6.7225μs 1.2577μs 795.1270 KOps/s 820.1906 KOps/s $\color{#d91a1a}-3.06\%$
test_unbind_speed 0.4967ms 0.3007ms 3.3260 KOps/s 3.3170 KOps/s $\color{#35bf28}+0.27\%$
test_unbind_speed_stack0 1.0614ms 0.3109ms 3.2161 KOps/s 3.3692 KOps/s $\color{#d91a1a}-4.54\%$
test_unbind_speed_stack1 99.1491ms 0.8289ms 1.2065 KOps/s 1.3406 KOps/s $\textbf{\color{#d91a1a}-10.00\%}$
test_split 3.1870ms 1.9782ms 505.5108 Ops/s 446.2051 Ops/s $\textbf{\color{#35bf28}+13.29\%}$
test_chunk 0.1007s 2.1897ms 456.6854 Ops/s 445.3796 Ops/s $\color{#35bf28}+2.54\%$
test_creation[device0] 0.2286ms 0.1170ms 8.5456 KOps/s 8.3852 KOps/s $\color{#35bf28}+1.91\%$
test_creation_from_tensor 3.8654ms 0.1194ms 8.3741 KOps/s 8.5373 KOps/s $\color{#d91a1a}-1.91\%$
test_add_one[memmap_tensor0] 0.3152ms 7.2997μs 136.9920 KOps/s 133.1742 KOps/s $\color{#35bf28}+2.87\%$
test_contiguous[memmap_tensor0] 15.2580μs 1.8785μs 532.3255 KOps/s 502.0540 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_stack[memmap_tensor0] 79.0970μs 5.4358μs 183.9672 KOps/s 173.4378 KOps/s $\textbf{\color{#35bf28}+6.07\%}$
test_memmaptd_index 1.1137ms 0.4261ms 2.3468 KOps/s 2.3987 KOps/s $\color{#d91a1a}-2.16\%$
test_memmaptd_index_astensor 0.9965ms 0.5332ms 1.8754 KOps/s 1.9216 KOps/s $\color{#d91a1a}-2.41\%$
test_memmaptd_index_op 3.4557ms 1.1142ms 897.5006 Ops/s 907.9353 Ops/s $\color{#d91a1a}-1.15\%$
test_serialize_model 0.1284s 0.1252s 7.9894 Ops/s 8.3835 Ops/s $\color{#d91a1a}-4.70\%$
test_serialize_model_pickle 0.4714s 0.4010s 2.4935 Ops/s 2.5575 Ops/s $\color{#d91a1a}-2.50\%$
test_serialize_weights 0.1276s 0.1176s 8.5040 Ops/s 8.4288 Ops/s $\color{#35bf28}+0.89\%$
test_serialize_weights_returnearly 0.1807s 0.1665s 6.0058 Ops/s 6.1218 Ops/s $\color{#d91a1a}-1.90\%$
test_serialize_weights_pickle 0.4579s 0.4139s 2.4159 Ops/s 2.4854 Ops/s $\color{#d91a1a}-2.79\%$
test_serialize_weights_filesystem 0.1547s 0.1457s 6.8619 Ops/s 6.9823 Ops/s $\color{#d91a1a}-1.72\%$
test_serialize_model_filesystem 0.1665s 0.1524s 6.5633 Ops/s 5.6947 Ops/s $\textbf{\color{#35bf28}+15.25\%}$
test_reshape_pytree 0.1072ms 38.3799μs 26.0553 KOps/s 25.3927 KOps/s $\color{#35bf28}+2.61\%$
test_reshape_td 92.4630μs 46.0831μs 21.6999 KOps/s 21.0170 KOps/s $\color{#35bf28}+3.25\%$
test_view_pytree 84.1670μs 38.0042μs 26.3129 KOps/s 25.5270 KOps/s $\color{#35bf28}+3.08\%$
test_view_td 0.1075ms 52.2361μs 19.1438 KOps/s 18.8815 KOps/s $\color{#35bf28}+1.39\%$
test_unbind_pytree 72.5360μs 35.8799μs 27.8707 KOps/s 27.3675 KOps/s $\color{#35bf28}+1.84\%$
test_unbind_td 0.3351ms 45.7947μs 21.8366 KOps/s 22.4236 KOps/s $\color{#d91a1a}-2.62\%$
test_split_pytree 87.7630μs 38.4455μs 26.0109 KOps/s 26.0980 KOps/s $\color{#d91a1a}-0.33\%$
test_split_td 0.4959ms 59.1452μs 16.9076 KOps/s 17.1000 KOps/s $\color{#d91a1a}-1.13\%$
test_add_pytree 0.1541ms 44.4951μs 22.4744 KOps/s 21.8177 KOps/s $\color{#35bf28}+3.01\%$
test_add_td 0.1704ms 91.0156μs 10.9871 KOps/s 11.0734 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_add_one_nested[tensordict-compile] 0.1376ms 59.0796μs 16.9263 KOps/s 16.8796 KOps/s $\color{#35bf28}+0.28\%$
test_compile_add_one_nested[tensordict-eager] 0.4076ms 0.2018ms 4.9555 KOps/s 5.0168 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_add_one_nested[pytree-compile] 0.1180ms 58.3211μs 17.1464 KOps/s 17.4361 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_add_one_nested[pytree-eager] 0.2444ms 0.1395ms 7.1673 KOps/s 7.0115 KOps/s $\color{#35bf28}+2.22\%$
test_compile_copy_nested[tensordict-compile] 59.1510μs 23.7441μs 42.1158 KOps/s 41.4916 KOps/s $\color{#35bf28}+1.50\%$
test_compile_copy_nested[tensordict-eager] 0.1493ms 75.5966μs 13.2281 KOps/s 13.0299 KOps/s $\color{#35bf28}+1.52\%$
test_compile_copy_nested[pytree-compile] 0.1438ms 76.3003μs 13.1061 KOps/s 13.0167 KOps/s $\color{#35bf28}+0.69\%$
test_compile_copy_nested[pytree-eager] 0.1256ms 69.2352μs 14.4435 KOps/s 14.5789 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_add_one_flat[tensordict-compile] 0.2899ms 0.1830ms 5.4641 KOps/s 5.5051 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_add_one_flat[tensordict-eager] 0.4318ms 0.2401ms 4.1655 KOps/s 4.1420 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_flat[tensorclass-compile] 0.1162ms 50.0054μs 19.9979 KOps/s 21.1822 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_compile_add_one_flat[tensorclass-eager] 0.1810ms 78.2313μs 12.7826 KOps/s 12.7067 KOps/s $\color{#35bf28}+0.60\%$
test_compile_add_one_flat[pytree-compile] 0.3634ms 0.1774ms 5.6372 KOps/s 5.6842 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_add_one_flat[pytree-eager] 1.1721ms 0.2881ms 3.4708 KOps/s 3.4349 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_self_flat[tensordict-eager] 0.4340ms 0.2767ms 3.6142 KOps/s 3.5999 KOps/s $\color{#35bf28}+0.40\%$
test_compile_add_self_flat[tensordict-compile] 0.7029ms 0.1859ms 5.3779 KOps/s 5.4839 KOps/s $\color{#d91a1a}-1.93\%$
test_compile_add_self_flat[tensorclass-eager] 0.1640ms 73.6760μs 13.5729 KOps/s 13.8849 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_add_self_flat[tensorclass-compile] 0.1198ms 48.6884μs 20.5388 KOps/s 20.8385 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_add_self_flat[pytree-eager] 0.4876ms 0.2325ms 4.3019 KOps/s 4.3011 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_self_flat[pytree-compile] 0.3581ms 0.1775ms 5.6329 KOps/s 5.6927 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_copy_flat[tensordict-compile] 0.1961ms 0.1114ms 8.9736 KOps/s 9.0154 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_copy_flat[tensordict-eager] 0.1757ms 80.0509μs 12.4921 KOps/s 12.6223 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_copy_flat[pytree-compile] 0.1486ms 80.2075μs 12.4677 KOps/s 12.5683 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_copy_flat[pytree-eager] 0.1467ms 71.4195μs 14.0018 KOps/s 14.1988 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_assign_and_add[tensordict-compile] 0.2959ms 0.1962ms 5.0957 KOps/s 5.0471 KOps/s $\color{#35bf28}+0.96\%$
test_compile_assign_and_add[tensordict-eager] 2.8433ms 1.7779ms 562.4745 Ops/s 548.3042 Ops/s $\color{#35bf28}+2.58\%$
test_compile_assign_and_add[pytree-compile] 0.2849ms 0.1940ms 5.1550 KOps/s 5.0354 KOps/s $\color{#35bf28}+2.38\%$
test_compile_assign_and_add[pytree-eager] 2.2772ms 1.0890ms 918.2552 Ops/s 900.5238 Ops/s $\color{#35bf28}+1.97\%$
test_compile_assign_and_add_stack[compile] 0.7614ms 0.4238ms 2.3596 KOps/s 2.3926 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_assign_and_add_stack[eager] 5.8922ms 4.0792ms 245.1471 Ops/s 243.9153 Ops/s $\color{#35bf28}+0.51\%$
test_compile_indexing[tensor-tensordict-compile] 86.6310μs 35.1167μs 28.4765 KOps/s 28.9887 KOps/s $\color{#d91a1a}-1.77\%$
test_compile_indexing[tensor-tensordict-eager] 0.6502ms 49.0390μs 20.3919 KOps/s 20.7321 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_indexing[tensor-tensorclass-compile] 84.0460μs 30.8449μs 32.4202 KOps/s 34.3956 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_compile_indexing[tensor-tensorclass-eager] 80.1190μs 29.4993μs 33.8991 KOps/s 35.6464 KOps/s $\color{#d91a1a}-4.90\%$
test_compile_indexing[tensor-pytree-compile] 92.8730μs 30.6776μs 32.5971 KOps/s 35.0269 KOps/s $\textbf{\color{#d91a1a}-6.94\%}$
test_compile_indexing[tensor-pytree-eager] 83.3760μs 29.1559μs 34.2984 KOps/s 35.3540 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_indexing[slice-tensordict-compile] 0.1831ms 75.2918μs 13.2817 KOps/s 13.7183 KOps/s $\color{#d91a1a}-3.18\%$
test_compile_indexing[slice-tensordict-eager] 0.4707ms 27.6404μs 36.1789 KOps/s 35.7047 KOps/s $\color{#35bf28}+1.33\%$
test_compile_indexing[slice-tensorclass-compile] 0.1479ms 69.0636μs 14.4794 KOps/s 14.4761 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[slice-tensorclass-eager] 67.9960μs 23.0358μs 43.4106 KOps/s 42.0676 KOps/s $\color{#35bf28}+3.19\%$
test_compile_indexing[slice-pytree-compile] 0.1448ms 68.6744μs 14.5615 KOps/s 14.6405 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_indexing[slice-pytree-eager] 69.5500μs 22.4587μs 44.5261 KOps/s 41.8224 KOps/s $\textbf{\color{#35bf28}+6.46\%}$
test_compile_indexing[int-tensordict-compile] 0.1696ms 74.2201μs 13.4734 KOps/s 13.5974 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_indexing[int-tensordict-eager] 0.8665ms 27.8863μs 35.8599 KOps/s 36.6668 KOps/s $\color{#d91a1a}-2.20\%$
test_compile_indexing[int-tensorclass-compile] 0.1606ms 69.4711μs 14.3945 KOps/s 14.7371 KOps/s $\color{#d91a1a}-2.32\%$
test_compile_indexing[int-tensorclass-eager] 78.6870μs 22.9848μs 43.5070 KOps/s 43.0328 KOps/s $\color{#35bf28}+1.10\%$
test_compile_indexing[int-pytree-compile] 0.1731ms 69.7711μs 14.3326 KOps/s 14.7068 KOps/s $\color{#d91a1a}-2.54\%$
test_compile_indexing[int-pytree-eager] 63.7690μs 22.7694μs 43.9186 KOps/s 43.2730 KOps/s $\color{#35bf28}+1.49\%$
test_mod_add[eager] 98.2540μs 27.4453μs 36.4361 KOps/s 38.7567 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_mod_add[compile] 0.1251ms 40.8816μs 24.4609 KOps/s 26.1284 KOps/s $\textbf{\color{#d91a1a}-6.38\%}$
test_mod_add[compile-overhead] 0.1087ms 39.8464μs 25.0964 KOps/s 26.4273 KOps/s $\textbf{\color{#d91a1a}-5.04\%}$
test_mod_wrap[eager] 0.4091ms 0.2117ms 4.7236 KOps/s 4.6189 KOps/s $\color{#35bf28}+2.27\%$
test_mod_wrap[compile] 0.4601ms 0.2389ms 4.1858 KOps/s 4.3058 KOps/s $\color{#d91a1a}-2.79\%$
test_mod_wrap[compile-overhead] 0.3184ms 0.2306ms 4.3363 KOps/s 4.3457 KOps/s $\color{#d91a1a}-0.22\%$
test_mod_wrap_and_backward[eager] 18.1342ms 11.4303ms 87.4870 Ops/s 90.2938 Ops/s $\color{#d91a1a}-3.11\%$
test_mod_wrap_and_backward[compile] 12.9822ms 11.2508ms 88.8826 Ops/s 82.6183 Ops/s $\textbf{\color{#35bf28}+7.58\%}$
test_mod_wrap_and_backward[compile-overhead] 20.5894ms 11.9029ms 84.0128 Ops/s 79.6477 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_seq_add[eager] 0.1777ms 94.4210μs 10.5909 KOps/s 10.8725 KOps/s $\color{#d91a1a}-2.59\%$
test_seq_add[compile] 0.1738ms 65.5533μs 15.2548 KOps/s 15.1196 KOps/s $\color{#35bf28}+0.89\%$
test_seq_add[compile-overhead] 0.1255ms 64.1788μs 15.5815 KOps/s 15.6294 KOps/s $\color{#d91a1a}-0.31\%$
test_seq_wrap[eager] 0.5088ms 0.3905ms 2.5610 KOps/s 2.5482 KOps/s $\color{#35bf28}+0.50\%$
test_seq_wrap[compile] 1.1726ms 0.2725ms 3.6703 KOps/s 3.6823 KOps/s $\color{#d91a1a}-0.33\%$
test_seq_wrap[compile-overhead] 1.2923ms 0.2719ms 3.6774 KOps/s 3.7221 KOps/s $\color{#d91a1a}-1.20\%$
test_func_call_runtime[False-eager] 1.2136ms 0.5249ms 1.9050 KOps/s 1.8915 KOps/s $\color{#35bf28}+0.71\%$
test_func_call_runtime[False-compile] 0.6212ms 0.5016ms 1.9936 KOps/s 1.9838 KOps/s $\color{#35bf28}+0.49\%$
test_func_call_runtime[False-compile-overhead] 0.9442ms 0.5026ms 1.9897 KOps/s 1.9491 KOps/s $\color{#35bf28}+2.09\%$
test_func_call_runtime[True-eager] 0.9742ms 0.7370ms 1.3569 KOps/s 1.3168 KOps/s $\color{#35bf28}+3.04\%$
test_func_call_runtime[True-compile] 1.0665ms 0.5166ms 1.9356 KOps/s 1.9358 KOps/s $-0.01\%$
test_func_call_runtime[True-compile-overhead] 0.8900ms 0.5182ms 1.9299 KOps/s 1.9278 KOps/s $\color{#35bf28}+0.11\%$
test_func_call_cm_runtime[False-eager] 0.6635ms 0.5207ms 1.9205 KOps/s 1.8757 KOps/s $\color{#35bf28}+2.39\%$
test_func_call_cm_runtime[False-compile] 0.6250ms 0.5024ms 1.9906 KOps/s 1.9928 KOps/s $\color{#d91a1a}-0.11\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7084ms 0.5025ms 1.9902 KOps/s 2.0008 KOps/s $\color{#d91a1a}-0.53\%$
test_func_call_cm_runtime[True-eager] 1.1428ms 0.8940ms 1.1186 KOps/s 1.1142 KOps/s $\color{#35bf28}+0.40\%$
test_func_call_cm_runtime[True-compile] 0.8341ms 0.7277ms 1.3741 KOps/s 1.3176 KOps/s $\color{#35bf28}+4.28\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8267ms 0.7278ms 1.3739 KOps/s 1.3057 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_vmap_func_call_cm_runtime[eager] 2.5954ms 1.9085ms 523.9823 Ops/s 513.6400 Ops/s $\color{#35bf28}+2.01\%$
test_vmap_func_call_cm_runtime[compile] 2.8703ms 1.9824ms 504.4494 Ops/s 503.4518 Ops/s $\color{#35bf28}+0.20\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.8193ms 1.9782ms 505.5135 Ops/s 497.4608 Ops/s $\color{#35bf28}+1.62\%$
test_distributed 1.2800ms 0.1254ms 7.9776 KOps/s 7.8513 KOps/s $\color{#35bf28}+1.61\%$
test_tdmodule 37.1090μs 19.0915μs 52.3794 KOps/s 55.3453 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_tdmodule_dispatch 77.5450μs 38.2396μs 26.1509 KOps/s 27.3383 KOps/s $\color{#d91a1a}-4.34\%$
test_tdseq 55.2030μs 22.6183μs 44.2120 KOps/s 47.2381 KOps/s $\textbf{\color{#d91a1a}-6.41\%}$
test_tdseq_dispatch 81.4220μs 43.5889μs 22.9416 KOps/s 22.6581 KOps/s $\color{#35bf28}+1.25\%$
test_instantiation_functorch 2.0902ms 1.5742ms 635.2373 Ops/s 618.8433 Ops/s $\color{#35bf28}+2.65\%$
test_instantiation_td 2.0978ms 1.1829ms 845.3772 Ops/s 839.5183 Ops/s $\color{#35bf28}+0.70\%$
test_exec_functorch 0.3321ms 0.1858ms 5.3811 KOps/s 5.3246 KOps/s $\color{#35bf28}+1.06\%$
test_exec_functional_call 0.4228ms 0.1777ms 5.6263 KOps/s 5.6700 KOps/s $\color{#d91a1a}-0.77\%$
test_exec_td 0.3453ms 0.2025ms 4.9384 KOps/s 4.9148 KOps/s $\color{#35bf28}+0.48\%$
test_exec_td_decorator 0.4370ms 0.2335ms 4.2830 KOps/s 4.1311 KOps/s $\color{#35bf28}+3.68\%$
test_vmap_mlp_speed[True-True] 1.1484ms 0.7017ms 1.4251 KOps/s 1.4389 KOps/s $\color{#d91a1a}-0.96\%$
test_vmap_mlp_speed[True-False] 1.0829ms 0.6940ms 1.4409 KOps/s 1.4628 KOps/s $\color{#d91a1a}-1.50\%$
test_vmap_mlp_speed[False-True] 0.8396ms 0.5446ms 1.8363 KOps/s 1.8722 KOps/s $\color{#d91a1a}-1.91\%$
test_vmap_mlp_speed[False-False] 0.7726ms 0.5418ms 1.8456 KOps/s 1.8618 KOps/s $\color{#d91a1a}-0.87\%$
test_vmap_mlp_speed_decorator[True-True] 1.7103ms 0.6547ms 1.5274 KOps/s 1.5425 KOps/s $\color{#d91a1a}-0.98\%$
test_vmap_mlp_speed_decorator[True-False] 0.9149ms 0.6585ms 1.5185 KOps/s 1.5610 KOps/s $\color{#d91a1a}-2.73\%$
test_vmap_mlp_speed_decorator[False-True] 0.8546ms 0.5437ms 1.8391 KOps/s 1.8895 KOps/s $\color{#d91a1a}-2.66\%$
test_vmap_mlp_speed_decorator[False-False] 0.7897ms 0.5414ms 1.8471 KOps/s 1.8632 KOps/s $\color{#d91a1a}-0.87\%$
test_to_module_speed[True] 2.2808ms 1.4499ms 689.7108 Ops/s 718.5362 Ops/s $\color{#d91a1a}-4.01\%$
test_to_module_speed[False] 2.1745ms 1.4010ms 713.7868 Ops/s 720.6253 Ops/s $\color{#d91a1a}-0.95\%$
test_tc_init 83.2650μs 47.6458μs 20.9882 KOps/s 21.4992 KOps/s $\color{#d91a1a}-2.38\%$
test_tc_init_nested 0.1756ms 94.5411μs 10.5774 KOps/s 10.6038 KOps/s $\color{#d91a1a}-0.25\%$
test_tc_first_layer_tensor 39.3540μs 1.5077μs 663.2654 KOps/s 652.8907 KOps/s $\color{#35bf28}+1.59\%$
test_tc_first_layer_nontensor 30.8770μs 4.7269μs 211.5540 KOps/s 211.4870 KOps/s $\color{#35bf28}+0.03\%$
test_tc_second_layer_tensor 38.7320μs 2.8248μs 354.0030 KOps/s 352.1838 KOps/s $\color{#35bf28}+0.52\%$
test_tc_second_layer_nontensor 44.4770μs 5.9916μs 166.9013 KOps/s 166.1793 KOps/s $\color{#35bf28}+0.43\%$
test_unbind 0.5059s 13.3103ms 75.1299 Ops/s 73.5796 Ops/s $\color{#35bf28}+2.11\%$
test_full_like 9.7996ms 8.7347ms 114.4856 Ops/s 78.9457 Ops/s $\textbf{\color{#35bf28}+45.02\%}$
test_zeros_like 4.0593ms 3.3310ms 300.2085 Ops/s 148.6452 Ops/s $\textbf{\color{#35bf28}+101.96\%}$
test_ones_like 4.9487ms 3.8708ms 258.3457 Ops/s 127.1683 Ops/s $\textbf{\color{#35bf28}+103.15\%}$
test_clone 8.6125ms 6.4309ms 155.4988 Ops/s 100.2305 Ops/s $\textbf{\color{#35bf28}+55.14\%}$
test_squeeze 70.7730μs 12.8876μs 77.5940 KOps/s 82.3584 KOps/s $\textbf{\color{#d91a1a}-5.79\%}$
test_unsqueeze 0.1537ms 94.0566μs 10.6319 KOps/s 10.7513 KOps/s $\color{#d91a1a}-1.11\%$
test_split 0.6148ms 0.1993ms 5.0166 KOps/s 5.0330 KOps/s $\color{#d91a1a}-0.33\%$
test_permute 0.3685ms 0.2240ms 4.4634 KOps/s 4.3836 KOps/s $\color{#35bf28}+1.82\%$
test_stack 35.7369ms 28.0963ms 35.5919 Ops/s 37.7997 Ops/s $\textbf{\color{#d91a1a}-5.84\%}$
test_cat 31.9578ms 27.0063ms 37.0284 Ops/s 39.2863 Ops/s $\textbf{\color{#d91a1a}-5.75\%}$
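
The Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic under that assumption. The function names and the 5% significance threshold are hypothetical: the threshold is inferred from which rows are bolded in the table, not taken from the benchmark workflow itself.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr / ops_head - 1.0) * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the bolded rows."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# Rows taken from the CPU table above: (name, Ops, Ops on Repo HEAD).
# Units cancel in the ratio, so KOps/s and Ops/s rows can be mixed.
rows = [
    ("test_plain_set_nested", 39.6139, 40.6596),
    ("test_split", 505.5108, 446.2051),
    ("test_zeros_like", 300.2085, 148.6452),
]

for name, ops_pr, ops_head in rows:
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table reports rounded throughput values, recomputed percentages can differ from the reported ones in the last digit.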


github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1574ms 16.5035μs 60.5933 KOps/s 60.4300 KOps/s $\color{#35bf28}+0.27\%$
test_plain_set_stack_nested 51.2330μs 16.5163μs 60.5462 KOps/s 60.1937 KOps/s $\color{#35bf28}+0.59\%$
test_plain_set_nested_inplace 58.5630μs 18.0171μs 55.5029 KOps/s 57.2009 KOps/s $\color{#d91a1a}-2.97\%$
test_plain_set_stack_nested_inplace 55.2740μs 17.6303μs 56.7206 KOps/s 56.8657 KOps/s $\color{#d91a1a}-0.26\%$
test_items 38.3720μs 2.8842μs 346.7115 KOps/s 346.6003 KOps/s $\color{#35bf28}+0.03\%$
test_items_nested 0.3927ms 0.3427ms 2.9180 KOps/s 2.9787 KOps/s $\color{#d91a1a}-2.04\%$
test_items_nested_locked 0.4399ms 0.3436ms 2.9106 KOps/s 2.9522 KOps/s $\color{#d91a1a}-1.41\%$
test_items_nested_leaf 92.5360μs 62.8090μs 15.9213 KOps/s 15.9128 KOps/s $\color{#35bf28}+0.05\%$
test_items_stack_nested 0.5212ms 0.3443ms 2.9047 KOps/s 2.9635 KOps/s $\color{#d91a1a}-1.98\%$
test_items_stack_nested_leaf 0.1168ms 63.8798μs 15.6544 KOps/s 15.7399 KOps/s $\color{#d91a1a}-0.54\%$
test_items_stack_nested_locked 0.3969ms 0.3423ms 2.9210 KOps/s 2.9486 KOps/s $\color{#d91a1a}-0.94\%$
test_keys 36.2820μs 3.4704μs 288.1492 KOps/s 290.8778 KOps/s $\color{#d91a1a}-0.94\%$
test_keys_nested 0.1586ms 70.6595μs 14.1524 KOps/s 14.0699 KOps/s $\color{#35bf28}+0.59\%$
test_keys_nested_locked 2.2858ms 77.1024μs 12.9698 KOps/s 12.9642 KOps/s $\color{#35bf28}+0.04\%$
test_keys_nested_leaf 98.2160μs 61.6988μs 16.2078 KOps/s 15.9574 KOps/s $\color{#35bf28}+1.57\%$
test_keys_stack_nested 0.1082ms 71.7801μs 13.9314 KOps/s 13.8388 KOps/s $\color{#35bf28}+0.67\%$
test_keys_stack_nested_leaf 0.1001ms 62.5616μs 15.9842 KOps/s 15.7766 KOps/s $\color{#35bf28}+1.32\%$
test_keys_stack_nested_locked 0.1257ms 78.4122μs 12.7531 KOps/s 12.8192 KOps/s $\color{#d91a1a}-0.52\%$
test_values 7.3822μs 0.8732μs 1.1452 MOps/s 1.1690 MOps/s $\color{#d91a1a}-2.04\%$
test_values_nested 0.1084ms 48.6051μs 20.5740 KOps/s 20.6216 KOps/s $\color{#d91a1a}-0.23\%$
test_values_nested_locked 0.1313ms 50.9203μs 19.6385 KOps/s 19.9324 KOps/s $\color{#d91a1a}-1.47\%$
test_values_nested_leaf 72.8650μs 42.7116μs 23.4129 KOps/s 23.5625 KOps/s $\color{#d91a1a}-0.64\%$
test_values_stack_nested 75.5950μs 49.5490μs 20.1820 KOps/s 20.2433 KOps/s $\color{#d91a1a}-0.30\%$
test_values_stack_nested_leaf 0.1852ms 43.4629μs 23.0082 KOps/s 22.8352 KOps/s $\color{#35bf28}+0.76\%$
test_values_stack_nested_locked 0.2081ms 51.5410μs 19.4020 KOps/s 19.4863 KOps/s $\color{#d91a1a}-0.43\%$
test_membership 1.7511μs 0.5005μs 1.9979 MOps/s 1.9881 MOps/s $\color{#35bf28}+0.49\%$
test_membership_nested 15.5660μs 1.9034μs 525.3891 KOps/s 534.3527 KOps/s $\color{#d91a1a}-1.68\%$
test_membership_nested_leaf 15.0810μs 1.8817μs 531.4382 KOps/s 531.3718 KOps/s $\color{#35bf28}+0.01\%$
test_membership_stacked_nested 27.8720μs 1.9361μs 516.5069 KOps/s 523.3216 KOps/s $\color{#d91a1a}-1.30\%$
test_membership_stacked_nested_leaf 28.0320μs 1.9258μs 519.2764 KOps/s 525.4838 KOps/s $\color{#d91a1a}-1.18\%$
test_membership_nested_last 36.2420μs 3.0027μs 333.0342 KOps/s 336.7693 KOps/s $\color{#d91a1a}-1.11\%$
test_membership_nested_leaf_last 27.3720μs 3.0093μs 332.3084 KOps/s 336.5411 KOps/s $\color{#d91a1a}-1.26\%$
test_membership_stacked_nested_last 27.4420μs 3.0045μs 332.8362 KOps/s 335.8854 KOps/s $\color{#d91a1a}-0.91\%$
test_membership_stacked_nested_leaf_last 30.3620μs 2.9736μs 336.2918 KOps/s 337.8819 KOps/s $\color{#d91a1a}-0.47\%$
test_nested_getleaf 40.6830μs 6.0761μs 164.5779 KOps/s 164.5934 KOps/s $-0.01\%$
test_nested_get 52.4530μs 5.8024μs 172.3417 KOps/s 175.8022 KOps/s $\color{#d91a1a}-1.97\%$
test_stacked_getleaf 32.3820μs 6.0391μs 165.5867 KOps/s 165.1358 KOps/s $\color{#35bf28}+0.27\%$
test_stacked_get 38.8320μs 5.7046μs 175.2985 KOps/s 176.6232 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_getitemleaf 71.4350μs 6.1201μs 163.3952 KOps/s 161.7896 KOps/s $\color{#35bf28}+0.99\%$
test_nested_getitem 36.3820μs 5.8524μs 170.8700 KOps/s 173.6520 KOps/s $\color{#d91a1a}-1.60\%$
test_stacked_getitemleaf 43.6430μs 6.1477μs 162.6615 KOps/s 164.6066 KOps/s $\color{#d91a1a}-1.18\%$
test_stacked_getitem 31.5120μs 5.7719μs 173.2518 KOps/s 175.1236 KOps/s $\color{#d91a1a}-1.07\%$
test_lock_nested 4.6138ms 0.4323ms 2.3132 KOps/s 2.3310 KOps/s $\color{#d91a1a}-0.76\%$
test_lock_stack_nested 0.5956ms 0.3975ms 2.5154 KOps/s 2.5384 KOps/s $\color{#d91a1a}-0.91\%$
test_unlock_nested 0.7655ms 0.3655ms 2.7362 KOps/s 2.7637 KOps/s $\color{#d91a1a}-0.99\%$
test_unlock_stack_nested 0.3783ms 0.3335ms 2.9982 KOps/s 3.0008 KOps/s $\color{#d91a1a}-0.09\%$
test_flatten_speed 0.1298ms 76.9179μs 13.0009 KOps/s 13.0597 KOps/s $\color{#d91a1a}-0.45\%$
test_unflatten_speed 0.3671ms 0.3204ms 3.1210 KOps/s 3.1084 KOps/s $\color{#35bf28}+0.41\%$
test_common_ops 1.5268ms 1.2389ms 807.1687 Ops/s 798.6372 Ops/s $\color{#35bf28}+1.07\%$
test_creation 33.6120μs 1.4831μs 674.2700 KOps/s 666.8217 KOps/s $\color{#35bf28}+1.12\%$
test_creation_empty 45.7330μs 15.0754μs 66.3332 KOps/s 65.5202 KOps/s $\color{#35bf28}+1.24\%$
test_creation_nested_1 44.9730μs 17.4614μs 57.2692 KOps/s 59.1297 KOps/s $\color{#d91a1a}-3.15\%$
test_creation_nested_2 81.8050μs 19.5703μs 51.0978 KOps/s 51.1729 KOps/s $\color{#d91a1a}-0.15\%$
test_clone 0.1990ms 28.2926μs 35.3449 KOps/s 35.9490 KOps/s $\color{#d91a1a}-1.68\%$
test_getitem[int] 96.4474ms 24.3319μs 41.0984 KOps/s 59.7055 KOps/s $\textbf{\color{#d91a1a}-31.16\%}$
test_getitem[slice_int] 0.1171ms 27.5460μs 36.3029 KOps/s 35.4500 KOps/s $\color{#35bf28}+2.41\%$
test_getitem[range] 0.2239ms 0.1075ms 9.3010 KOps/s 9.0819 KOps/s $\color{#35bf28}+2.41\%$
test_getitem[tuple] 0.1988ms 23.9279μs 41.7922 KOps/s 41.3427 KOps/s $\color{#35bf28}+1.09\%$
test_getitem[list] 0.2533ms 96.9149μs 10.3183 KOps/s 10.3381 KOps/s $\color{#d91a1a}-0.19\%$
test_setitem_dim[int] 60.3740μs 43.8625μs 22.7985 KOps/s 22.2879 KOps/s $\color{#35bf28}+2.29\%$
test_setitem_dim[slice_int] 94.6060μs 66.6385μs 15.0063 KOps/s 14.7300 KOps/s $\color{#35bf28}+1.88\%$
test_setitem_dim[range] 0.2295ms 0.1257ms 7.9575 KOps/s 7.8848 KOps/s $\color{#35bf28}+0.92\%$
test_setitem_dim[tuple] 84.0850μs 60.8776μs 16.4264 KOps/s 16.2911 KOps/s $\color{#35bf28}+0.83\%$
test_setitem 0.1882ms 41.2275μs 24.2557 KOps/s 24.1939 KOps/s $\color{#35bf28}+0.26\%$
test_set 0.1915ms 40.0987μs 24.9385 KOps/s 25.0719 KOps/s $\color{#d91a1a}-0.53\%$
test_set_shared 0.3568ms 53.0335μs 18.8560 KOps/s 19.1163 KOps/s $\color{#d91a1a}-1.36\%$
test_update 0.2477ms 49.4928μs 20.2050 KOps/s 20.2311 KOps/s $\color{#d91a1a}-0.13\%$
test_update_nested 0.2064ms 56.2155μs 17.7887 KOps/s 17.5493 KOps/s $\color{#35bf28}+1.36\%$
test_update__nested 0.2101ms 59.8458μs 16.7096 KOps/s 16.4026 KOps/s $\color{#35bf28}+1.87\%$
test_set_nested 0.1987ms 42.3409μs 23.6178 KOps/s 23.2307 KOps/s $\color{#35bf28}+1.67\%$
test_set_nested_new 0.1974ms 45.8563μs 21.8072 KOps/s 21.3626 KOps/s $\color{#35bf28}+2.08\%$
test_select 0.2445ms 59.1663μs 16.9015 KOps/s 16.6819 KOps/s $\color{#35bf28}+1.32\%$
test_select_nested 0.3146ms 41.9903μs 23.8150 KOps/s 24.2433 KOps/s $\color{#d91a1a}-1.77\%$
test_exclude_nested 0.2142ms 58.5848μs 17.0693 KOps/s 16.9499 KOps/s $\color{#35bf28}+0.70\%$
test_empty[True] 0.2947ms 0.2576ms 3.8813 KOps/s 3.8257 KOps/s $\color{#35bf28}+1.45\%$
test_empty[False] 3.2072μs 0.7463μs 1.3400 MOps/s 1.3363 MOps/s $\color{#35bf28}+0.28\%$
test_to 0.1362ms 26.9491μs 37.1070 KOps/s 38.4378 KOps/s $\color{#d91a1a}-3.46\%$
test_to_nonblocking 0.1627ms 25.7806μs 38.7889 KOps/s 39.5012 KOps/s $\color{#d91a1a}-1.80\%$
test_unbind_speed 1.6495ms 0.2843ms 3.5175 KOps/s 3.5366 KOps/s $\color{#d91a1a}-0.54\%$
test_unbind_speed_stack0 0.3449ms 0.2814ms 3.5537 KOps/s 3.5675 KOps/s $\color{#d91a1a}-0.39\%$
test_unbind_speed_stack1 95.3655ms 0.7145ms 1.3995 KOps/s 1.3949 KOps/s $\color{#35bf28}+0.33\%$
test_split 97.8299ms 2.1675ms 461.3667 Ops/s 446.7551 Ops/s $\color{#35bf28}+3.27\%$
test_chunk 96.9350ms 2.1671ms 461.4559 Ops/s 447.1218 Ops/s $\color{#35bf28}+3.21\%$
test_creation[device0] 0.2829ms 0.1280ms 7.8105 KOps/s 7.8200 KOps/s $\color{#d91a1a}-0.12\%$
test_creation_from_tensor 0.3916ms 0.1301ms 7.6860 KOps/s 7.7163 KOps/s $\color{#d91a1a}-0.39\%$
test_add_one[memmap_tensor0] 0.2387ms 9.1080μs 109.7933 KOps/s 110.5957 KOps/s $\color{#d91a1a}-0.73\%$
test_contiguous[memmap_tensor0] 39.1820μs 2.2408μs 446.2677 KOps/s 449.2236 KOps/s $\color{#d91a1a}-0.66\%$
test_stack[memmap_tensor0] 35.5920μs 6.6982μs 149.2938 KOps/s 148.2179 KOps/s $\color{#35bf28}+0.73\%$
test_memmaptd_index 1.0948ms 0.4267ms 2.3437 KOps/s 2.2946 KOps/s $\color{#35bf28}+2.14\%$
test_memmaptd_index_astensor 1.0387ms 0.4976ms 2.0095 KOps/s 2.0061 KOps/s $\color{#35bf28}+0.17\%$
test_memmaptd_index_op 1.4528ms 1.0333ms 967.7491 Ops/s 958.6338 Ops/s $\color{#35bf28}+0.95\%$
test_serialize_model 0.1315s 0.1302s 7.6781 Ops/s 7.6437 Ops/s $\color{#35bf28}+0.45\%$
test_serialize_model_pickle 1.3596s 1.2166s 0.8219 Ops/s 0.8217 Ops/s $\color{#35bf28}+0.03\%$
test_serialize_weights 0.2293s 0.1442s 6.9349 Ops/s 6.9388 Ops/s $\color{#d91a1a}-0.06\%$
test_serialize_weights_returnearly 0.2300s 56.3455ms 17.7476 Ops/s 17.7640 Ops/s $\color{#d91a1a}-0.09\%$
test_serialize_weights_pickle 1.3740s 1.2174s 0.8214 Ops/s 0.8203 Ops/s $\color{#35bf28}+0.13\%$
test_reshape_pytree 0.1723ms 35.7485μs 27.9732 KOps/s 28.7097 KOps/s $\color{#d91a1a}-2.57\%$
test_reshape_td 0.1610ms 41.9335μs 23.8473 KOps/s 24.0061 KOps/s $\color{#d91a1a}-0.66\%$
test_view_pytree 0.1664ms 34.6919μs 28.8252 KOps/s 28.8179 KOps/s $\color{#35bf28}+0.03\%$
test_view_td 0.1808ms 45.4267μs 22.0135 KOps/s 21.7021 KOps/s $\color{#35bf28}+1.43\%$
test_unbind_pytree 0.1018ms 34.3050μs 29.1503 KOps/s 28.9452 KOps/s $\color{#35bf28}+0.71\%$
test_unbind_td 0.5295ms 43.2058μs 23.1451 KOps/s 23.4760 KOps/s $\color{#d91a1a}-1.41\%$
test_split_pytree 0.5282ms 45.4002μs 22.0263 KOps/s 22.0306 KOps/s $\color{#d91a1a}-0.02\%$
test_split_td 0.1996ms 55.2410μs 18.1025 KOps/s 14.9312 KOps/s $\textbf{\color{#35bf28}+21.24\%}$
test_add_pytree 0.2137ms 56.4759μs 17.7067 KOps/s 17.0981 KOps/s $\color{#35bf28}+3.56\%$
test_add_td 0.2538ms 94.4780μs 10.5845 KOps/s 10.8259 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_one_nested[tensordict-compile] 0.3043ms 0.1594ms 6.2748 KOps/s 6.2506 KOps/s $\color{#35bf28}+0.39\%$
test_compile_add_one_nested[tensordict-eager] 0.3961ms 0.1638ms 6.1049 KOps/s 6.2875 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_add_one_nested[pytree-compile] 0.3106ms 0.1431ms 6.9889 KOps/s 6.9319 KOps/s $\color{#35bf28}+0.82\%$
test_compile_add_one_nested[pytree-eager] 0.5844ms 0.1813ms 5.5150 KOps/s 5.5300 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_copy_nested[tensordict-compile] 0.1370ms 21.4800μs 46.5550 KOps/s 46.4625 KOps/s $\color{#35bf28}+0.20\%$
test_compile_copy_nested[tensordict-eager] 0.4504ms 49.2004μs 20.3251 KOps/s 19.8286 KOps/s $\color{#35bf28}+2.50\%$
test_compile_copy_nested[pytree-compile] 0.4618ms 63.9861μs 15.6284 KOps/s 15.5711 KOps/s $\color{#35bf28}+0.37\%$
test_compile_copy_nested[pytree-eager] 0.4558ms 49.2593μs 20.3007 KOps/s 20.3491 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_add_one_flat[tensordict-compile] 0.4424ms 0.3209ms 3.1158 KOps/s 3.0514 KOps/s $\color{#35bf28}+2.11\%$
test_compile_add_one_flat[tensordict-eager] 0.3656ms 0.2349ms 4.2579 KOps/s 4.2334 KOps/s $\color{#35bf28}+0.58\%$
test_compile_add_one_flat[tensorclass-compile] 0.2733ms 0.1267ms 7.8913 KOps/s 7.7536 KOps/s $\color{#35bf28}+1.78\%$
test_compile_add_one_flat[tensorclass-eager] 0.2357ms 66.4283μs 15.0538 KOps/s 15.2105 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_add_one_flat[pytree-compile] 0.4554ms 0.3240ms 3.0860 KOps/s 3.1383 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_add_one_flat[pytree-eager] 0.8190ms 0.6151ms 1.6258 KOps/s 1.6521 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_add_self_flat[tensordict-eager] 0.4422ms 0.2883ms 3.4684 KOps/s 3.5769 KOps/s $\color{#d91a1a}-3.03\%$
test_compile_add_self_flat[tensordict-compile] 0.5273ms 0.3293ms 3.0366 KOps/s 3.1205 KOps/s $\color{#d91a1a}-2.69\%$
test_compile_add_self_flat[tensorclass-eager] 0.2527ms 80.7766μs 12.3798 KOps/s 13.3075 KOps/s $\textbf{\color{#d91a1a}-6.97\%}$
test_compile_add_self_flat[tensorclass-compile] 0.3008ms 0.1331ms 7.5158 KOps/s 7.7012 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_add_self_flat[pytree-eager] 0.7042ms 0.5156ms 1.9394 KOps/s 1.9768 KOps/s $\color{#d91a1a}-1.89\%$
test_compile_add_self_flat[pytree-compile] 0.4727ms 0.3238ms 3.0887 KOps/s 3.1158 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_copy_flat[tensordict-compile] 0.1563ms 19.0449μs 52.5076 KOps/s 51.6295 KOps/s $\color{#35bf28}+1.70\%$
test_compile_copy_flat[tensordict-eager] 0.1411ms 39.0448μs 25.6116 KOps/s 24.5470 KOps/s $\color{#35bf28}+4.34\%$
test_compile_copy_flat[pytree-compile] 0.2104ms 69.2620μs 14.4379 KOps/s 14.3039 KOps/s $\color{#35bf28}+0.94\%$
test_compile_copy_flat[pytree-eager] 0.1723ms 51.4381μs 19.4408 KOps/s 19.0918 KOps/s $\color{#35bf28}+1.83\%$
test_compile_assign_and_add[tensordict-compile] 2.3352ms 0.8202ms 1.2193 KOps/s 1.1167 KOps/s $\textbf{\color{#35bf28}+9.19\%}$
test_compile_assign_and_add[tensordict-eager] 3.4187ms 3.1614ms 316.3179 Ops/s 317.4796 Ops/s $\color{#d91a1a}-0.37\%$
test_compile_assign_and_add[pytree-compile] 2.3304ms 0.8220ms 1.2166 KOps/s 1.1259 KOps/s $\textbf{\color{#35bf28}+8.05\%}$
test_compile_assign_and_add[pytree-eager] 3.5491ms 3.1099ms 321.5580 Ops/s 320.5473 Ops/s $\color{#35bf28}+0.32\%$
test_compile_indexing[tensor-tensordict-compile] 0.2620ms 0.1068ms 9.3595 KOps/s 9.0542 KOps/s $\color{#35bf28}+3.37\%$
test_compile_indexing[tensor-tensordict-eager] 0.2413ms 58.9026μs 16.9772 KOps/s 15.8150 KOps/s $\textbf{\color{#35bf28}+7.35\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.2831ms 0.1053ms 9.4934 KOps/s 9.6735 KOps/s $\color{#d91a1a}-1.86\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2396ms 44.7485μs 22.3471 KOps/s 23.5360 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_compile_indexing[tensor-pytree-compile] 0.3081ms 0.1073ms 9.3214 KOps/s 9.6390 KOps/s $\color{#d91a1a}-3.30\%$
test_compile_indexing[tensor-pytree-eager] 0.2271ms 42.3445μs 23.6158 KOps/s 23.3859 KOps/s $\color{#35bf28}+0.98\%$
test_compile_indexing[slice-tensordict-compile] 0.3026ms 0.1406ms 7.1123 KOps/s 7.3184 KOps/s $\color{#d91a1a}-2.82\%$
test_compile_indexing[slice-tensordict-eager] 0.1718ms 24.6355μs 40.5918 KOps/s 39.0941 KOps/s $\color{#35bf28}+3.83\%$
test_compile_indexing[slice-tensorclass-compile] 0.2835ms 0.1341ms 7.4555 KOps/s 7.6218 KOps/s $\color{#d91a1a}-2.18\%$
test_compile_indexing[slice-tensorclass-eager] 0.1216ms 20.5639μs 48.6290 KOps/s 46.9424 KOps/s $\color{#35bf28}+3.59\%$
test_compile_indexing[slice-pytree-compile] 0.2946ms 0.1345ms 7.4375 KOps/s 7.5935 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_indexing[slice-pytree-eager] 0.1192ms 22.0205μs 45.4122 KOps/s 46.8344 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_indexing[int-tensordict-compile] 0.2904ms 0.1421ms 7.0394 KOps/s 7.1375 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_indexing[int-tensordict-eager] 0.5046ms 24.1286μs 41.4446 KOps/s 39.3191 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_compile_indexing[int-tensorclass-compile] 0.2788ms 0.1307ms 7.6522 KOps/s 7.3059 KOps/s $\color{#35bf28}+4.74\%$
test_compile_indexing[int-tensorclass-eager] 0.1577ms 21.2602μs 47.0362 KOps/s 47.1006 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_indexing[int-pytree-compile] 0.2759ms 0.1361ms 7.3490 KOps/s 7.5783 KOps/s $\color{#d91a1a}-3.02\%$
test_compile_indexing[int-pytree-eager] 57.6140μs 20.8082μs 48.0579 KOps/s 46.8269 KOps/s $\color{#35bf28}+2.63\%$
test_mod_add[eager] 0.1935ms 32.6272μs 30.6493 KOps/s 31.4792 KOps/s $\color{#d91a1a}-2.64\%$
test_mod_add[compile] 0.3206ms 69.5365μs 14.3809 KOps/s 14.2451 KOps/s $\color{#35bf28}+0.95\%$
test_mod_add[compile-overhead] 0.2609ms 0.1345ms 7.4370 KOps/s 6.7847 KOps/s $\textbf{\color{#35bf28}+9.61\%}$
test_mod_wrap[eager] 0.9734ms 0.7689ms 1.3005 KOps/s 1.2862 KOps/s $\color{#35bf28}+1.11\%$
test_mod_wrap[compile] 2.2001ms 0.8294ms 1.2057 KOps/s 1.2045 KOps/s $\color{#35bf28}+0.09\%$
test_mod_wrap[compile-overhead] 4.8641ms 3.0801ms 324.6597 Ops/s 327.3484 Ops/s $\color{#d91a1a}-0.82\%$
test_mod_wrap_and_backward[eager] 4.1617ms 4.0242ms 248.4960 Ops/s 244.9290 Ops/s $\color{#35bf28}+1.46\%$
test_mod_wrap_and_backward[compile] 4.4421ms 3.9888ms 250.7043 Ops/s 245.5426 Ops/s $\color{#35bf28}+2.10\%$
test_mod_wrap_and_backward[compile-overhead] 1.3599ms 0.9014ms 1.1094 KOps/s 993.0840 Ops/s $\textbf{\color{#35bf28}+11.72\%}$
test_seq_add[eager] 0.4813ms 99.0435μs 10.0966 KOps/s 10.1996 KOps/s $\color{#d91a1a}-1.01\%$
test_seq_add[compile] 0.2283ms 82.3332μs 12.1458 KOps/s 12.2373 KOps/s $\color{#d91a1a}-0.75\%$
test_seq_add[compile-overhead] 0.2844ms 0.1121ms 8.9173 KOps/s 8.7710 KOps/s $\color{#35bf28}+1.67\%$
test_seq_wrap[eager] 1.0799ms 0.9208ms 1.0860 KOps/s 1.0753 KOps/s $\color{#35bf28}+0.99\%$
test_seq_wrap[compile] 1.2295ms 0.8468ms 1.1809 KOps/s 1.1790 KOps/s $\color{#35bf28}+0.16\%$
test_seq_wrap[compile-overhead] 0.3706ms 0.2183ms 4.5799 KOps/s 4.5001 KOps/s $\color{#35bf28}+1.77\%$
test_func_call_runtime[False-eager] 2.7477ms 2.3741ms 421.2076 Ops/s 432.2579 Ops/s $\color{#d91a1a}-2.56\%$
test_func_call_runtime[False-compile] 2.7935ms 2.3813ms 419.9449 Ops/s 422.2358 Ops/s $\color{#d91a1a}-0.54\%$
test_func_call_runtime[False-compile-overhead] 0.5202ms 0.3591ms 2.7850 KOps/s 2.7787 KOps/s $\color{#35bf28}+0.23\%$
test_func_call_runtime[True-eager] 2.9741ms 2.5399ms 393.7215 Ops/s 402.2854 Ops/s $\color{#d91a1a}-2.13\%$
test_func_call_runtime[True-compile] 2.8117ms 2.4081ms 415.2723 Ops/s 417.1377 Ops/s $\color{#d91a1a}-0.45\%$
test_func_call_runtime[True-compile-overhead] 0.7983ms 0.3835ms 2.6075 KOps/s 2.6350 KOps/s $\color{#d91a1a}-1.04\%$
test_func_call_cm_runtime[False-eager] 2.7552ms 2.3590ms 423.9024 Ops/s 433.9498 Ops/s $\color{#d91a1a}-2.32\%$
test_func_call_cm_runtime[False-compile] 2.8830ms 2.3950ms 417.5323 Ops/s 423.9553 Ops/s $\color{#d91a1a}-1.52\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5115ms 0.3633ms 2.7523 KOps/s 2.7732 KOps/s $\color{#d91a1a}-0.75\%$
test_func_call_cm_runtime[True-eager] 2.8431ms 2.6399ms 378.7979 Ops/s 383.6147 Ops/s $\color{#d91a1a}-1.26\%$
test_func_call_cm_runtime[True-compile] 2.6854ms 2.4431ms 409.3128 Ops/s 411.8526 Ops/s $\color{#d91a1a}-0.62\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5131ms 0.4090ms 2.4451 KOps/s 2.4597 KOps/s $\color{#d91a1a}-0.60\%$
test_vmap_func_call_cm_runtime[eager] 4.1812ms 3.7514ms 266.5670 Ops/s 267.3659 Ops/s $\color{#d91a1a}-0.30\%$
test_vmap_func_call_cm_runtime[compile] 2.5868ms 2.4454ms 408.9283 Ops/s 413.8238 Ops/s $\color{#d91a1a}-1.18\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5305ms 0.4116ms 2.4296 KOps/s 2.4442 KOps/s $\color{#d91a1a}-0.60\%$
test_distributed 1.9612ms 0.2202ms 4.5422 KOps/s 8.5399 KOps/s $\textbf{\color{#d91a1a}-46.81\%}$
test_tdmodule 0.2365ms 14.7124μs 67.9701 KOps/s 67.3288 KOps/s $\color{#35bf28}+0.95\%$
test_tdmodule_dispatch 50.7730μs 28.4415μs 35.1599 KOps/s 34.8380 KOps/s $\color{#35bf28}+0.92\%$
test_tdseq 44.9930μs 15.5117μs 64.4674 KOps/s 63.8603 KOps/s $\color{#35bf28}+0.95\%$
test_tdseq_dispatch 53.5740μs 31.0811μs 32.1739 KOps/s 31.3776 KOps/s $\color{#35bf28}+2.54\%$
test_instantiation_functorch 1.9983ms 1.8184ms 549.9437 Ops/s 536.4621 Ops/s $\color{#35bf28}+2.51\%$
test_instantiation_td 1.8349ms 1.1972ms 835.2845 Ops/s 821.8796 Ops/s $\color{#35bf28}+1.63\%$
test_exec_functorch 1.1604ms 0.9823ms 1.0180 KOps/s 1.0121 KOps/s $\color{#35bf28}+0.59\%$
test_exec_functional_call 1.1255ms 0.9968ms 1.0032 KOps/s 1.0085 KOps/s $\color{#d91a1a}-0.53\%$
test_exec_td 1.1633ms 1.0140ms 986.2229 Ops/s 974.4189 Ops/s $\color{#35bf28}+1.21\%$
test_exec_td_decorator 1.1834ms 1.0466ms 955.4635 Ops/s 955.7548 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed[True-True] 1.9655ms 1.2495ms 800.2906 Ops/s 796.0017 Ops/s $\color{#35bf28}+0.54\%$
test_vmap_mlp_speed[True-False] 1.4033ms 1.2444ms 803.5800 Ops/s 798.8533 Ops/s $\color{#35bf28}+0.59\%$
test_vmap_mlp_speed[False-True] 1.3402ms 1.1433ms 874.6630 Ops/s 873.0842 Ops/s $\color{#35bf28}+0.18\%$
test_vmap_mlp_speed[False-False] 1.2887ms 1.1417ms 875.8688 Ops/s 876.8408 Ops/s $\color{#d91a1a}-0.11\%$
test_vmap_mlp_speed_decorator[True-True] 1.4189ms 1.2243ms 816.7794 Ops/s 820.6733 Ops/s $\color{#d91a1a}-0.47\%$
test_vmap_mlp_speed_decorator[True-False] 1.7938ms 1.2299ms 813.0665 Ops/s 819.2043 Ops/s $\color{#d91a1a}-0.75\%$
test_vmap_mlp_speed_decorator[False-True] 1.2832ms 1.1425ms 875.2619 Ops/s 878.0181 Ops/s $\color{#d91a1a}-0.31\%$
test_vmap_mlp_speed_decorator[False-False] 1.3172ms 1.1417ms 875.9105 Ops/s 877.6248 Ops/s $\color{#d91a1a}-0.20\%$
test_vmap_transformer_speed[True-True] 13.3198ms 12.9795ms 77.0448 Ops/s 76.6919 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_transformer_speed[True-False] 13.1228ms 12.9584ms 77.1698 Ops/s 76.8804 Ops/s $\color{#35bf28}+0.38\%$
test_vmap_transformer_speed[False-True] 12.9937ms 12.7863ms 78.2089 Ops/s 77.9203 Ops/s $\color{#35bf28}+0.37\%$
test_vmap_transformer_speed[False-False] 12.9204ms 12.7315ms 78.5451 Ops/s 77.8667 Ops/s $\color{#35bf28}+0.87\%$
test_vmap_transformer_speed_decorator[True-True] 33.6697ms 33.4591ms 29.8872 Ops/s 29.9268 Ops/s $\color{#d91a1a}-0.13\%$
test_vmap_transformer_speed_decorator[True-False] 33.7920ms 33.5517ms 29.8048 Ops/s 29.9202 Ops/s $\color{#d91a1a}-0.39\%$
test_vmap_transformer_speed_decorator[False-True] 33.4846ms 33.2878ms 30.0411 Ops/s 30.0079 Ops/s $\color{#35bf28}+0.11\%$
test_vmap_transformer_speed_decorator[False-False] 33.2702ms 33.1038ms 30.2080 Ops/s 29.9991 Ops/s $\color{#35bf28}+0.70\%$
test_to_module_speed[True] 1.3706ms 1.0045ms 995.5324 Ops/s 998.3117 Ops/s $\color{#d91a1a}-0.28\%$
test_to_module_speed[False] 1.3828ms 0.9762ms 1.0244 KOps/s 1.0297 KOps/s $\color{#d91a1a}-0.51\%$
test_tc_init 57.0540μs 33.1318μs 30.1824 KOps/s 30.2185 KOps/s $\color{#d91a1a}-0.12\%$
test_tc_init_nested 0.1083ms 71.5052μs 13.9850 KOps/s 14.5181 KOps/s $\color{#d91a1a}-3.67\%$
test_tc_first_layer_tensor 4.2603μs 0.6771μs 1.4768 MOps/s 1.4728 MOps/s $\color{#35bf28}+0.27\%$
test_tc_first_layer_nontensor 23.0910μs 2.2320μs 448.0270 KOps/s 441.9764 KOps/s $\color{#35bf28}+1.37\%$
test_tc_second_layer_tensor 33.0245μs 1.3831μs 723.0055 KOps/s 731.5392 KOps/s $\color{#d91a1a}-1.17\%$
test_tc_second_layer_nontensor 31.6020μs 2.9691μs 336.8074 KOps/s 342.1341 KOps/s $\color{#d91a1a}-1.56\%$
test_unbind 0.2046s 12.7220ms 78.6041 Ops/s 88.2514 Ops/s $\textbf{\color{#d91a1a}-10.93\%}$
test_full_like 0.7747ms 0.5771ms 1.7328 KOps/s 1.7384 KOps/s $\color{#d91a1a}-0.32\%$
test_zeros_like 0.3881ms 0.1981ms 5.0471 KOps/s 5.0434 KOps/s $\color{#35bf28}+0.07\%$
test_ones_like 0.3589ms 0.1979ms 5.0519 KOps/s 5.0503 KOps/s $\color{#35bf28}+0.03\%$
test_clone 0.5760ms 0.4140ms 2.4152 KOps/s 2.4146 KOps/s $\color{#35bf28}+0.03\%$
test_squeeze 0.1232ms 9.9328μs 100.6763 KOps/s 102.1743 KOps/s $\color{#d91a1a}-1.47\%$
test_unsqueeze 0.2525ms 73.7529μs 13.5588 KOps/s 13.4526 KOps/s $\color{#35bf28}+0.79\%$
test_split 0.4186ms 0.1600ms 6.2519 KOps/s 6.4251 KOps/s $\color{#d91a1a}-2.70\%$
test_permute 0.2990ms 0.1759ms 5.6865 KOps/s 5.6218 KOps/s $\color{#35bf28}+1.15\%$
test_stack 1.3734ms 0.8738ms 1.1444 KOps/s 1.1713 KOps/s $\color{#d91a1a}-2.30\%$
test_cat 1.3833ms 1.2314ms 812.0840 Ops/s 811.7786 Ops/s $\color{#35bf28}+0.04\%$
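
The Change column above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The benchmark bot's exact formula is not shown on this page, so the sketch below is an assumption that happens to reproduce the reported values; `percent_change` is a hypothetical helper name, not part of the benchmark tooling.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of PR throughput against repo HEAD, in percent.

    Assumed formula (inferred from the table, not taken from the bot's source).
    Both arguments must be in the same unit (e.g. both KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the table above.
rows = {
    "test_split_td": (18.1025, 14.9312),
    "test_distributed": (4.5422, 8.5399),
    "test_unbind": (78.6041, 88.2514),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

For these three rows the computed values match the table (+21.24%, -46.81%, -10.93%). Rows that mix units, such as KOps/s against Ops/s, need converting to a common unit first, and may differ in the last digit because the table shows rounded throughputs.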

[ghstack-poisoned]
[ghstack-poisoned]
@vmoens vmoens merged commit 5027496 into gh/vmoens/20/base Oct 3, 2024
7 of 8 checks passed
vmoens added a commit that referenced this pull request Oct 3, 2024
ghstack-source-id: a5fc71cfb1366db7cbb5f4353d70f3da90be814d
Pull Request resolved: #1019
@vmoens vmoens deleted the gh/vmoens/20/head branch October 3, 2024 15:30