[Feature] TensorDict.clamp #1165

Merged
merged 1 commit into from
Jan 7, 2025

vmoens (Contributor) commented Jan 7, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 44f0937c195d969055de10709402af7c4473df32
Pull Request resolved: #1165
facebook-github-bot added the CLA Signed label Jan 7, 2025

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}22$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 54.8520μs 22.6376μs 44.1742 KOps/s 45.9021 KOps/s $\color{#d91a1a}-3.76\%$
test_plain_set_stack_nested 50.7150μs 23.0366μs 43.4092 KOps/s 45.0333 KOps/s $\color{#d91a1a}-3.61\%$
test_plain_set_nested_inplace 73.4060μs 24.5438μs 40.7435 KOps/s 41.0886 KOps/s $\color{#d91a1a}-0.84\%$
test_plain_set_stack_nested_inplace 70.6220μs 24.5562μs 40.7229 KOps/s 41.1509 KOps/s $\color{#d91a1a}-1.04\%$
test_items 39.4530μs 4.1725μs 239.6636 KOps/s 237.6285 KOps/s $\color{#35bf28}+0.86\%$
test_items_nested 0.7233ms 0.4045ms 2.4720 KOps/s 2.4687 KOps/s $\color{#35bf28}+0.13\%$
test_items_nested_locked 0.5849ms 0.4030ms 2.4811 KOps/s 2.4735 KOps/s $\color{#35bf28}+0.31\%$
test_items_nested_leaf 0.1353ms 77.8907μs 12.8385 KOps/s 12.9253 KOps/s $\color{#d91a1a}-0.67\%$
test_items_stack_nested 0.7406ms 0.4066ms 2.4596 KOps/s 2.4551 KOps/s $\color{#35bf28}+0.18\%$
test_items_stack_nested_leaf 0.1384ms 80.2497μs 12.4611 KOps/s 12.4964 KOps/s $\color{#d91a1a}-0.28\%$
test_items_stack_nested_locked 0.6059ms 0.4029ms 2.4821 KOps/s 2.4283 KOps/s $\color{#35bf28}+2.22\%$
test_keys 47.6990μs 3.4846μs 286.9778 KOps/s 282.5891 KOps/s $\color{#35bf28}+1.55\%$
test_keys_nested 0.2550ms 0.1691ms 5.9149 KOps/s 5.8073 KOps/s $\color{#35bf28}+1.85\%$
test_keys_nested_locked 1.7928ms 0.1795ms 5.5697 KOps/s 5.6335 KOps/s $\color{#d91a1a}-1.13\%$
test_keys_nested_leaf 0.2355ms 0.1485ms 6.7320 KOps/s 6.6681 KOps/s $\color{#35bf28}+0.96\%$
test_keys_stack_nested 0.3331ms 0.1675ms 5.9715 KOps/s 5.9650 KOps/s $\color{#35bf28}+0.11\%$
test_keys_stack_nested_leaf 0.5040ms 0.1438ms 6.9561 KOps/s 6.9300 KOps/s $\color{#35bf28}+0.38\%$
test_keys_stack_nested_locked 0.3705ms 0.1718ms 5.8204 KOps/s 5.7393 KOps/s $\color{#35bf28}+1.41\%$
test_values 10.2872μs 1.0729μs 932.0876 KOps/s 969.6323 KOps/s $\color{#d91a1a}-3.87\%$
test_values_nested 0.1223ms 66.9701μs 14.9320 KOps/s 15.3467 KOps/s $\color{#d91a1a}-2.70\%$
test_values_nested_locked 0.1196ms 66.8310μs 14.9631 KOps/s 15.3422 KOps/s $\color{#d91a1a}-2.47\%$
test_values_nested_leaf 0.1254ms 75.4501μs 13.2538 KOps/s 13.2948 KOps/s $\color{#d91a1a}-0.31\%$
test_values_stack_nested 0.1233ms 67.0847μs 14.9065 KOps/s 14.2282 KOps/s $\color{#35bf28}+4.77\%$
test_values_stack_nested_leaf 0.1346ms 75.9345μs 13.1693 KOps/s 13.4636 KOps/s $\color{#d91a1a}-2.19\%$
test_values_stack_nested_locked 0.1306ms 67.1668μs 14.8883 KOps/s 15.2652 KOps/s $\color{#d91a1a}-2.47\%$
test_membership 30.3370μs 0.8602μs 1.1626 MOps/s 1.1341 MOps/s $\color{#35bf28}+2.52\%$
test_membership_nested 26.5190μs 2.9636μs 337.4267 KOps/s 333.1692 KOps/s $\color{#35bf28}+1.28\%$
test_membership_nested_leaf 48.0090μs 2.9815μs 335.4041 KOps/s 332.1110 KOps/s $\color{#35bf28}+0.99\%$
test_membership_stacked_nested 26.2690μs 2.9591μs 337.9406 KOps/s 332.9517 KOps/s $\color{#35bf28}+1.50\%$
test_membership_stacked_nested_leaf 46.6780μs 2.9362μs 340.5747 KOps/s 328.4625 KOps/s $\color{#35bf28}+3.69\%$
test_membership_nested_last 19.4260μs 4.5173μs 221.3732 KOps/s 219.7243 KOps/s $\color{#35bf28}+0.75\%$
test_membership_nested_leaf_last 53.5100μs 4.5266μs 220.9178 KOps/s 216.5874 KOps/s $\color{#35bf28}+2.00\%$
test_membership_stacked_nested_last 43.0300μs 13.8436μs 72.2355 KOps/s 71.6465 KOps/s $\color{#35bf28}+0.82\%$
test_membership_stacked_nested_leaf_last 40.3250μs 13.7592μs 72.6788 KOps/s 72.2087 KOps/s $\color{#35bf28}+0.65\%$
test_nested_getleaf 63.6490μs 10.8600μs 92.0812 KOps/s 92.3742 KOps/s $\color{#d91a1a}-0.32\%$
test_nested_get 59.3810μs 10.2596μs 97.4697 KOps/s 97.6586 KOps/s $\color{#d91a1a}-0.19\%$
test_stacked_getleaf 38.7520μs 10.7212μs 93.2727 KOps/s 92.5403 KOps/s $\color{#35bf28}+0.79\%$
test_stacked_get 56.4550μs 10.2912μs 97.1702 KOps/s 97.9260 KOps/s $\color{#d91a1a}-0.77\%$
test_nested_getitemleaf 48.3000μs 11.3819μs 87.8586 KOps/s 88.2396 KOps/s $\color{#d91a1a}-0.43\%$
test_nested_getitem 59.7420μs 10.5682μs 94.6237 KOps/s 95.0260 KOps/s $\color{#d91a1a}-0.42\%$
test_stacked_getitemleaf 37.4990μs 11.2283μs 89.0606 KOps/s 88.8484 KOps/s $\color{#35bf28}+0.24\%$
test_stacked_getitem 58.0280μs 10.3767μs 96.3700 KOps/s 96.8315 KOps/s $\color{#d91a1a}-0.48\%$
test_lock_nested 2.0126ms 0.4616ms 2.1662 KOps/s 1.7569 KOps/s $\textbf{\color{#35bf28}+23.30\%}$
test_lock_stack_nested 0.6975ms 0.4211ms 2.3745 KOps/s 2.3787 KOps/s $\color{#d91a1a}-0.18\%$
test_unlock_nested 0.8459ms 0.3760ms 2.6594 KOps/s 2.6348 KOps/s $\color{#35bf28}+0.94\%$
test_unlock_stack_nested 0.6408ms 0.3365ms 2.9719 KOps/s 2.9618 KOps/s $\color{#35bf28}+0.34\%$
test_flatten_speed 0.1886ms 0.1014ms 9.8582 KOps/s 9.8511 KOps/s $\color{#35bf28}+0.07\%$
test_unflatten_speed 0.6902ms 0.5318ms 1.8805 KOps/s 1.8502 KOps/s $\color{#35bf28}+1.64\%$
test_common_ops 1.5699ms 0.8126ms 1.2306 KOps/s 1.2787 KOps/s $\color{#d91a1a}-3.77\%$
test_creation 32.3300μs 2.5550μs 391.3920 KOps/s 400.9198 KOps/s $\color{#d91a1a}-2.38\%$
test_creation_empty 40.9060μs 12.7549μs 78.4011 KOps/s 90.5213 KOps/s $\textbf{\color{#d91a1a}-13.39\%}$
test_creation_nested_1 63.2880μs 16.1093μs 62.0760 KOps/s 71.1110 KOps/s $\textbf{\color{#d91a1a}-12.71\%}$
test_creation_nested_2 49.9640μs 21.1762μs 47.2228 KOps/s 52.6206 KOps/s $\textbf{\color{#d91a1a}-10.26\%}$
test_clone 0.1010ms 13.4895μs 74.1318 KOps/s 74.2176 KOps/s $\color{#d91a1a}-0.12\%$
test_getitem[int] 1.0989ms 13.0353μs 76.7148 KOps/s 76.8122 KOps/s $\color{#d91a1a}-0.13\%$
test_getitem[slice_int] 0.1422ms 24.6834μs 40.5131 KOps/s 40.6257 KOps/s $\color{#d91a1a}-0.28\%$
test_getitem[range] 0.1696ms 48.3337μs 20.6895 KOps/s 19.9322 KOps/s $\color{#35bf28}+3.80\%$
test_getitem[tuple] 0.1399ms 20.6268μs 48.4807 KOps/s 50.0095 KOps/s $\color{#d91a1a}-3.06\%$
test_getitem[list] 0.1911ms 43.7878μs 22.8374 KOps/s 22.6488 KOps/s $\color{#35bf28}+0.83\%$
test_setitem_dim[int] 43.3610μs 25.2059μs 39.6732 KOps/s 39.5766 KOps/s $\color{#35bf28}+0.24\%$
test_setitem_dim[slice_int] 78.9270μs 50.9952μs 19.6097 KOps/s 19.5911 KOps/s $\color{#35bf28}+0.09\%$
test_setitem_dim[range] 0.1026ms 72.2654μs 13.8379 KOps/s 13.5369 KOps/s $\color{#35bf28}+2.22\%$
test_setitem_dim[tuple] 98.9750μs 42.7163μs 23.4103 KOps/s 24.5872 KOps/s $\color{#d91a1a}-4.79\%$
test_setitem 0.3827ms 21.6429μs 46.2046 KOps/s 49.9020 KOps/s $\textbf{\color{#d91a1a}-7.41\%}$
test_set 96.5700μs 21.1053μs 47.3814 KOps/s 51.0842 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_set_shared 4.2304ms 0.1727ms 5.7896 KOps/s 5.9137 KOps/s $\color{#d91a1a}-2.10\%$
test_update 0.2169ms 24.2084μs 41.3080 KOps/s 44.0005 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_update_nested 0.1073ms 35.4992μs 28.1697 KOps/s 30.4417 KOps/s $\textbf{\color{#d91a1a}-7.46\%}$
test_update__nested 1.0745ms 34.4493μs 29.0282 KOps/s 29.7765 KOps/s $\color{#d91a1a}-2.51\%$
test_set_nested 0.3200ms 23.1418μs 43.2118 KOps/s 45.5489 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_set_nested_new 85.9610μs 27.6008μs 36.2309 KOps/s 37.6710 KOps/s $\color{#d91a1a}-3.82\%$
test_select 0.3328ms 44.1061μs 22.6726 KOps/s 23.3261 KOps/s $\color{#d91a1a}-2.80\%$
test_select_nested 0.1283ms 63.8051μs 15.6727 KOps/s 15.6102 KOps/s $\color{#35bf28}+0.40\%$
test_exclude_nested 0.3463ms 83.2085μs 12.0180 KOps/s 12.1365 KOps/s $\color{#d91a1a}-0.98\%$
test_empty[True] 0.7439ms 0.4163ms 2.4019 KOps/s 2.4162 KOps/s $\color{#d91a1a}-0.59\%$
test_empty[False] 11.0380μs 1.4351μs 696.8110 KOps/s 701.8006 KOps/s $\color{#d91a1a}-0.71\%$
test_unbind_speed 0.3617ms 0.2694ms 3.7119 KOps/s 3.6245 KOps/s $\color{#35bf28}+2.41\%$
test_unbind_speed_stack0 0.4865ms 0.2612ms 3.8279 KOps/s 3.8275 KOps/s $\color{#35bf28}+0.01\%$
test_unbind_speed_stack1 0.1199s 0.8060ms 1.2407 KOps/s 1.3726 KOps/s $\textbf{\color{#d91a1a}-9.61\%}$
test_split 2.5180ms 1.6100ms 621.1226 Ops/s 565.7713 Ops/s $\textbf{\color{#35bf28}+9.78\%}$
test_chunk 0.1230s 2.0165ms 495.9179 Ops/s 563.6976 Ops/s $\textbf{\color{#d91a1a}-12.02\%}$
test_consolidate_njt[False-None] 10.2302ms 8.3290ms 120.0619 Ops/s 124.9070 Ops/s $\color{#d91a1a}-3.88\%$
test_creation[device0] 5.0040ms 94.1620μs 10.6200 KOps/s 10.9339 KOps/s $\color{#d91a1a}-2.87\%$
test_creation_from_tensor 0.2812ms 93.7184μs 10.6703 KOps/s 10.4540 KOps/s $\color{#35bf28}+2.07\%$
test_add_one[memmap_tensor0] 0.1127ms 4.6963μs 212.9324 KOps/s 205.0923 KOps/s $\color{#35bf28}+3.82\%$
test_contiguous[memmap_tensor0] 11.9830μs 0.5276μs 1.8952 MOps/s 1.9617 MOps/s $\color{#d91a1a}-3.39\%$
test_stack[memmap_tensor0] 38.3210μs 3.3374μs 299.6364 KOps/s 288.9800 KOps/s $\color{#35bf28}+3.69\%$
test_memmaptd_index 1.0492ms 0.2431ms 4.1132 KOps/s 4.2178 KOps/s $\color{#d91a1a}-2.48\%$
test_memmaptd_index_astensor 0.6360ms 0.3309ms 3.0220 KOps/s 3.0505 KOps/s $\color{#d91a1a}-0.93\%$
test_memmaptd_index_op 1.0498ms 0.6115ms 1.6353 KOps/s 1.7009 KOps/s $\color{#d91a1a}-3.86\%$
test_serialize_model 0.1257s 0.1185s 8.4354 Ops/s 8.2753 Ops/s $\color{#35bf28}+1.93\%$
test_serialize_model_pickle 0.5000s 0.3990s 2.5064 Ops/s 2.5863 Ops/s $\color{#d91a1a}-3.09\%$
test_serialize_weights 0.1238s 0.1155s 8.6552 Ops/s 7.5740 Ops/s $\textbf{\color{#35bf28}+14.27\%}$
test_serialize_weights_returnearly 0.2576s 0.1814s 5.5112 Ops/s 6.2349 Ops/s $\textbf{\color{#d91a1a}-11.61\%}$
test_serialize_weights_pickle 0.4608s 0.4103s 2.4375 Ops/s 2.4533 Ops/s $\color{#d91a1a}-0.64\%$
test_serialize_weights_filesystem 0.1565s 0.1459s 6.8544 Ops/s 6.9204 Ops/s $\color{#d91a1a}-0.95\%$
test_serialize_model_filesystem 0.1547s 0.1516s 6.5980 Ops/s 5.9262 Ops/s $\textbf{\color{#35bf28}+11.34\%}$
test_reshape_pytree 80.0600μs 26.8810μs 37.2009 KOps/s 36.2051 KOps/s $\color{#35bf28}+2.75\%$
test_reshape_td 70.0110μs 33.2153μs 30.1067 KOps/s 28.5437 KOps/s $\textbf{\color{#35bf28}+5.48\%}$
test_view_pytree 81.1410μs 26.9371μs 37.1236 KOps/s 37.3162 KOps/s $\color{#d91a1a}-0.52\%$
test_view_td 94.7770μs 38.9042μs 25.7042 KOps/s 25.5433 KOps/s $\color{#35bf28}+0.63\%$
test_unbind_pytree 69.5500μs 29.8166μs 33.5384 KOps/s 33.2958 KOps/s $\color{#35bf28}+0.73\%$
test_unbind_td 0.3321ms 39.9589μs 25.0257 KOps/s 25.0127 KOps/s $\color{#35bf28}+0.05\%$
test_split_pytree 70.1210μs 29.6357μs 33.7431 KOps/s 33.5612 KOps/s $\color{#35bf28}+0.54\%$
test_split_td 0.5586ms 45.6737μs 21.8945 KOps/s 22.2475 KOps/s $\color{#d91a1a}-1.59\%$
test_add_pytree 88.3250μs 35.5839μs 28.1026 KOps/s 27.5323 KOps/s $\color{#35bf28}+2.07\%$
test_add_td 0.1196ms 60.0852μs 16.6430 KOps/s 17.5472 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_compile_add_one_nested[tensordict-compile] 0.1426ms 64.2531μs 15.5635 KOps/s 15.7214 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_one_nested[tensordict-eager] 0.4368ms 0.1744ms 5.7339 KOps/s 5.7167 KOps/s $\color{#35bf28}+0.30\%$
test_compile_add_one_nested[pytree-compile] 0.1434ms 46.7603μs 21.3857 KOps/s 21.9558 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_add_one_nested[pytree-eager] 0.2781ms 0.1204ms 8.3034 KOps/s 8.4614 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_copy_nested[tensordict-compile] 64.2200μs 27.2907μs 36.6425 KOps/s 39.2591 KOps/s $\textbf{\color{#d91a1a}-6.66\%}$
test_compile_copy_nested[tensordict-eager] 0.1356ms 59.5854μs 16.7826 KOps/s 16.8004 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_copy_nested[pytree-compile] 0.1626ms 80.9350μs 12.3556 KOps/s 12.5812 KOps/s $\color{#d91a1a}-1.79\%$
test_compile_copy_nested[pytree-eager] 0.1417ms 68.4643μs 14.6061 KOps/s 14.6018 KOps/s $\color{#35bf28}+0.03\%$
test_compile_add_one_flat[tensordict-compile] 0.2183ms 0.1046ms 9.5618 KOps/s 9.4235 KOps/s $\color{#35bf28}+1.47\%$
test_compile_add_one_flat[tensordict-eager] 1.5740ms 0.2173ms 4.6012 KOps/s 4.5810 KOps/s $\color{#35bf28}+0.44\%$
test_compile_add_one_flat[tensorclass-compile] 96.5500μs 44.1732μs 22.6382 KOps/s 22.3041 KOps/s $\color{#35bf28}+1.50\%$
test_compile_add_one_flat[tensorclass-eager] 1.2644ms 64.2104μs 15.5738 KOps/s 14.9467 KOps/s $\color{#35bf28}+4.20\%$
test_compile_add_one_flat[pytree-compile] 0.2204ms 0.1036ms 9.6523 KOps/s 9.6658 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[pytree-eager] 0.3444ms 0.2027ms 4.9345 KOps/s 4.9722 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_add_self_flat[tensordict-eager] 0.3473ms 0.2344ms 4.2662 KOps/s 4.2749 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_self_flat[tensordict-compile] 0.2026ms 0.1056ms 9.4700 KOps/s 9.4296 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_self_flat[tensorclass-eager] 0.1286ms 60.3663μs 16.5655 KOps/s 16.7302 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_add_self_flat[tensorclass-compile] 0.7890ms 46.8321μs 21.3529 KOps/s 21.4853 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_add_self_flat[pytree-eager] 0.6518ms 0.1602ms 6.2414 KOps/s 6.2867 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_add_self_flat[pytree-compile] 0.2003ms 0.1036ms 9.6527 KOps/s 9.5018 KOps/s $\color{#35bf28}+1.59\%$
test_compile_copy_flat[tensordict-compile] 83.0050μs 20.8815μs 47.8893 KOps/s 45.1711 KOps/s $\textbf{\color{#35bf28}+6.02\%}$
test_compile_copy_flat[tensordict-eager] 0.1393ms 65.8479μs 15.1865 KOps/s 14.9804 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_flat[pytree-compile] 0.1624ms 82.8753μs 12.0663 KOps/s 12.0121 KOps/s $\color{#35bf28}+0.45\%$
test_compile_copy_flat[pytree-eager] 0.1369ms 67.8303μs 14.7427 KOps/s 14.4171 KOps/s $\color{#35bf28}+2.26\%$
test_compile_assign_and_add[tensordict-compile] 0.3842ms 0.2062ms 4.8496 KOps/s 4.6882 KOps/s $\color{#35bf28}+3.44\%$
test_compile_assign_and_add[tensordict-eager] 1.5497ms 1.3318ms 750.8845 Ops/s 747.1302 Ops/s $\color{#35bf28}+0.50\%$
test_compile_assign_and_add[pytree-compile] 0.3930ms 0.2031ms 4.9238 KOps/s 4.7124 KOps/s $\color{#35bf28}+4.49\%$
test_compile_assign_and_add[pytree-eager] 1.3456ms 0.7702ms 1.2984 KOps/s 1.2297 KOps/s $\textbf{\color{#35bf28}+5.59\%}$
test_compile_assign_and_add_stack[compile] 0.8202ms 0.4543ms 2.2014 KOps/s 2.1259 KOps/s $\color{#35bf28}+3.55\%$
test_compile_assign_and_add_stack[eager] 4.3499ms 2.7762ms 360.2080 Ops/s 371.1989 Ops/s $\color{#d91a1a}-2.96\%$
test_compile_indexing[tensor-tensordict-compile] 0.1035ms 35.8966μs 27.8578 KOps/s 27.3310 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[tensor-tensordict-eager] 0.5990ms 34.3970μs 29.0723 KOps/s 29.8729 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_indexing[tensor-tensorclass-compile] 91.8020μs 29.4911μs 33.9085 KOps/s 34.9022 KOps/s $\color{#d91a1a}-2.85\%$
test_compile_indexing[tensor-tensorclass-eager] 87.2230μs 23.0849μs 43.3184 KOps/s 42.9632 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[tensor-pytree-compile] 0.1191ms 30.6420μs 32.6349 KOps/s 33.2025 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[tensor-pytree-eager] 0.8271ms 23.1135μs 43.2649 KOps/s 42.3135 KOps/s $\color{#35bf28}+2.25\%$
test_compile_indexing[slice-tensordict-compile] 0.1064ms 51.4405μs 19.4399 KOps/s 19.3058 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[slice-tensordict-eager] 0.7257ms 20.1508μs 49.6258 KOps/s 49.3500 KOps/s $\color{#35bf28}+0.56\%$
test_compile_indexing[slice-tensorclass-compile] 0.1230ms 45.1278μs 22.1593 KOps/s 22.7403 KOps/s $\color{#d91a1a}-2.56\%$
test_compile_indexing[slice-tensorclass-eager] 0.1185ms 18.8713μs 52.9905 KOps/s 52.4299 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[slice-pytree-compile] 0.1160ms 44.6978μs 22.3725 KOps/s 22.5597 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[slice-pytree-eager] 80.4100μs 18.9445μs 52.7857 KOps/s 51.8014 KOps/s $\color{#35bf28}+1.90\%$
test_compile_indexing[int-tensordict-compile] 0.1180ms 52.7969μs 18.9405 KOps/s 19.1936 KOps/s $\color{#d91a1a}-1.32\%$
test_compile_indexing[int-tensordict-eager] 1.0571ms 19.9281μs 50.1804 KOps/s 49.3842 KOps/s $\color{#35bf28}+1.61\%$
test_compile_indexing[int-tensorclass-compile] 0.1119ms 44.9126μs 22.2655 KOps/s 22.3279 KOps/s $\color{#d91a1a}-0.28\%$
test_compile_indexing[int-tensorclass-eager] 66.9950μs 18.6937μs 53.4939 KOps/s 52.5830 KOps/s $\color{#35bf28}+1.73\%$
test_compile_indexing[int-pytree-compile] 0.1107ms 44.8437μs 22.2997 KOps/s 22.5322 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_indexing[int-pytree-eager] 77.6140μs 19.0120μs 52.5984 KOps/s 52.8526 KOps/s $\color{#d91a1a}-0.48\%$
test_mod_add[eager] 0.1727ms 36.9092μs 27.0935 KOps/s 29.1360 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_mod_add[compile] 0.1326ms 50.5984μs 19.7635 KOps/s 21.0480 KOps/s $\textbf{\color{#d91a1a}-6.10\%}$
test_mod_add[compile-overhead] 0.1181ms 49.1353μs 20.3520 KOps/s 20.8824 KOps/s $\color{#d91a1a}-2.54\%$
test_mod_wrap[eager] 0.3957ms 0.2309ms 4.3315 KOps/s 4.5016 KOps/s $\color{#d91a1a}-3.78\%$
test_mod_wrap[compile] 0.3830ms 0.2092ms 4.7794 KOps/s 4.7927 KOps/s $\color{#d91a1a}-0.28\%$
test_mod_wrap[compile-overhead] 0.3500ms 0.2071ms 4.8287 KOps/s 4.7908 KOps/s $\color{#35bf28}+0.79\%$
test_mod_wrap_and_backward[eager] 22.6538ms 12.6512ms 79.0441 Ops/s 75.3084 Ops/s $\color{#35bf28}+4.96\%$
test_mod_wrap_and_backward[compile] 18.6268ms 13.1479ms 76.0580 Ops/s 70.4928 Ops/s $\textbf{\color{#35bf28}+7.89\%}$
test_mod_wrap_and_backward[compile-overhead] 14.0521ms 12.0034ms 83.3095 Ops/s 70.6661 Ops/s $\textbf{\color{#35bf28}+17.89\%}$
test_seq_add[eager] 0.2198ms 0.1221ms 8.1926 KOps/s 8.7553 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_seq_add[compile] 0.1623ms 63.4657μs 15.7565 KOps/s 16.2264 KOps/s $\color{#d91a1a}-2.90\%$
test_seq_add[compile-overhead] 0.4616ms 63.4982μs 15.7485 KOps/s 16.5376 KOps/s $\color{#d91a1a}-4.77\%$
test_seq_wrap[eager] 0.6745ms 0.4632ms 2.1588 KOps/s 2.0361 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_seq_wrap[compile] 0.4284ms 0.2337ms 4.2784 KOps/s 4.2606 KOps/s $\color{#35bf28}+0.42\%$
test_seq_wrap[compile-overhead] 0.3333ms 0.2302ms 4.3444 KOps/s 4.3335 KOps/s $\color{#35bf28}+0.25\%$
test_func_call_runtime[False-eager] 0.7831ms 0.5502ms 1.8176 KOps/s 1.8550 KOps/s $\color{#d91a1a}-2.01\%$
test_func_call_runtime[False-compile] 0.6202ms 0.4326ms 2.3118 KOps/s 2.3250 KOps/s $\color{#d91a1a}-0.57\%$
test_func_call_runtime[False-compile-overhead] 0.5808ms 0.4362ms 2.2924 KOps/s 2.3216 KOps/s $\color{#d91a1a}-1.26\%$
test_func_call_runtime[True-eager] 1.0796ms 0.7788ms 1.2840 KOps/s 1.3163 KOps/s $\color{#d91a1a}-2.46\%$
test_func_call_runtime[True-compile] 0.6517ms 0.4737ms 2.1110 KOps/s 2.1267 KOps/s $\color{#d91a1a}-0.73\%$
test_func_call_runtime[True-compile-overhead] 0.6506ms 0.4838ms 2.0672 KOps/s 2.1463 KOps/s $\color{#d91a1a}-3.69\%$
test_func_call_cm_runtime[False-eager] 1.1967ms 0.5697ms 1.7554 KOps/s 1.8189 KOps/s $\color{#d91a1a}-3.49\%$
test_func_call_cm_runtime[False-compile] 0.5896ms 0.4343ms 2.3023 KOps/s 2.3186 KOps/s $\color{#d91a1a}-0.70\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6089ms 0.4340ms 2.3041 KOps/s 2.3303 KOps/s $\color{#d91a1a}-1.12\%$
test_func_call_cm_runtime[True-eager] 1.3316ms 0.9342ms 1.0704 KOps/s 1.1057 KOps/s $\color{#d91a1a}-3.19\%$
test_func_call_cm_runtime[True-compile] 0.6731ms 0.4987ms 2.0050 KOps/s 2.0126 KOps/s $\color{#d91a1a}-0.38\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6317ms 0.4989ms 2.0045 KOps/s 2.0065 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_func_call_cm_runtime[eager] 3.6250ms 1.9952ms 501.2071 Ops/s 513.9526 Ops/s $\color{#d91a1a}-2.48\%$
test_vmap_func_call_cm_runtime[compile] 0.7885ms 0.5300ms 1.8867 KOps/s 1.8766 KOps/s $\color{#35bf28}+0.54\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6606ms 0.5324ms 1.8784 KOps/s 1.8944 KOps/s $\color{#d91a1a}-0.84\%$
test_distributed 0.2885ms 0.1272ms 7.8636 KOps/s 7.6279 KOps/s $\color{#35bf28}+3.09\%$
test_tdmodule 55.7940μs 28.5766μs 34.9937 KOps/s 39.7837 KOps/s $\textbf{\color{#d91a1a}-12.04\%}$
test_tdmodule_dispatch 96.5700μs 51.4664μs 19.4302 KOps/s 21.5395 KOps/s $\textbf{\color{#d91a1a}-9.79\%}$
test_tdseq 64.1900μs 31.5498μs 31.6959 KOps/s 35.7469 KOps/s $\textbf{\color{#d91a1a}-11.33\%}$
test_tdseq_dispatch 99.6860μs 58.4497μs 17.1087 KOps/s 19.0807 KOps/s $\textbf{\color{#d91a1a}-10.33\%}$
test_instantiation_functorch 2.4137ms 1.6015ms 624.4226 Ops/s 656.1869 Ops/s $\color{#d91a1a}-4.84\%$
test_exec_functorch 0.3472ms 0.1824ms 5.4838 KOps/s 5.4396 KOps/s $\color{#35bf28}+0.81\%$
test_exec_functional_call 0.3419ms 0.1782ms 5.6102 KOps/s 5.8424 KOps/s $\color{#d91a1a}-3.97\%$
test_exec_td_decorator 0.6808ms 0.2408ms 4.1525 KOps/s 4.2329 KOps/s $\color{#d91a1a}-1.90\%$
test_vmap_mlp_speed_decorator[True-True] 0.9020ms 0.6713ms 1.4896 KOps/s 1.5285 KOps/s $\color{#d91a1a}-2.55\%$
test_vmap_mlp_speed_decorator[True-False] 0.9341ms 0.6695ms 1.4935 KOps/s 1.5259 KOps/s $\color{#d91a1a}-2.12\%$
test_vmap_mlp_speed_decorator[False-True] 0.8297ms 0.5348ms 1.8699 KOps/s 1.9032 KOps/s $\color{#d91a1a}-1.75\%$
test_vmap_mlp_speed_decorator[False-False] 0.7837ms 0.5313ms 1.8820 KOps/s 1.8883 KOps/s $\color{#d91a1a}-0.33\%$
test_to_module_speed[True] 2.2334ms 1.3782ms 725.5985 Ops/s 736.4505 Ops/s $\color{#d91a1a}-1.47\%$
test_to_module_speed[False] 2.2061ms 1.3509ms 740.2588 Ops/s 745.8348 Ops/s $\color{#d91a1a}-0.75\%$
test_tc_init 0.1066ms 48.9274μs 20.4385 KOps/s 20.7925 KOps/s $\color{#d91a1a}-1.70\%$
test_tc_init_nested 0.1981ms 98.0112μs 10.2029 KOps/s 10.3746 KOps/s $\color{#d91a1a}-1.66\%$
test_tc_first_layer_tensor 30.4470μs 1.5683μs 637.6427 KOps/s 629.2118 KOps/s $\color{#35bf28}+1.34\%$
test_tc_first_layer_nontensor 35.9670μs 4.8312μs 206.9882 KOps/s 205.8341 KOps/s $\color{#35bf28}+0.56\%$
test_tc_second_layer_tensor 41.0770μs 2.9252μs 341.8550 KOps/s 337.3694 KOps/s $\color{#35bf28}+1.33\%$
test_tc_second_layer_nontensor 38.6730μs 6.1983μs 161.3351 KOps/s 159.1127 KOps/s $\color{#35bf28}+1.40\%$
test_unbind 0.2636s 16.1920ms 61.7590 Ops/s 75.0737 Ops/s $\textbf{\color{#d91a1a}-17.74\%}$
test_full_like 12.9785ms 10.8933ms 91.7992 Ops/s 72.0770 Ops/s $\textbf{\color{#35bf28}+27.36\%}$
test_zeros_like 4.2893ms 3.8999ms 256.4169 Ops/s 120.4092 Ops/s $\textbf{\color{#35bf28}+112.95\%}$
test_ones_like 4.9273ms 4.5737ms 218.6418 Ops/s 122.5138 Ops/s $\textbf{\color{#35bf28}+78.46\%}$
test_clone 7.2744ms 6.7138ms 148.9475 Ops/s 102.2662 Ops/s $\textbf{\color{#35bf28}+45.65\%}$
test_squeeze 77.3640μs 12.4782μs 80.1400 KOps/s 81.0021 KOps/s $\color{#d91a1a}-1.06\%$
test_unsqueeze 0.2093ms 97.8370μs 10.2211 KOps/s 10.4457 KOps/s $\color{#d91a1a}-2.15\%$
test_split 0.5434ms 0.2007ms 4.9830 KOps/s 5.0505 KOps/s $\color{#d91a1a}-1.34\%$
test_permute 0.4346ms 0.2116ms 4.7265 KOps/s 4.7305 KOps/s $\color{#d91a1a}-0.08\%$
test_stack 34.9112ms 28.4459ms 35.1544 Ops/s 37.1514 Ops/s $\textbf{\color{#d91a1a}-5.38\%}$
test_cat 34.3208ms 28.3540ms 35.2684 Ops/s 36.9050 Ops/s $\color{#d91a1a}-4.43\%$
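
For readers checking the table, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The sketch below reproduces that arithmetic for three rows taken from the CPU table. The function names `percent_change` and `classify` are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bolded; neither comes from the benchmark bot itself.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus repo HEAD, in percent.

    Both arguments must use the same unit (e.g. both KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row. The 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table, in KOps/s.
rows = {
    "test_plain_set_nested": (44.1742, 45.9021),
    "test_lock_nested": (2.1662, 1.7569),
    "test_creation_empty": (78.4011, 90.5213),
}

for name, (ops, head) in rows.items():
    change = percent_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, small swings of a few percent fall below the cutoff and are not counted in the Improved or Worsened totals.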


github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}53$. Worsened: $\large\color{#d91a1a}7$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 33.3000μs 11.3633μs 88.0024 KOps/s 75.2640 KOps/s $\textbf{\color{#35bf28}+16.93\%}$
test_plain_set_stack_nested 36.3000μs 11.4271μs 87.5111 KOps/s 74.4850 KOps/s $\textbf{\color{#35bf28}+17.49\%}$
test_plain_set_nested_inplace 46.0500μs 12.3618μs 80.8944 KOps/s 69.2856 KOps/s $\textbf{\color{#35bf28}+16.75\%}$
test_plain_set_stack_nested_inplace 60.2710μs 12.4437μs 80.3620 KOps/s 69.4986 KOps/s $\textbf{\color{#35bf28}+15.63\%}$
test_items 69.5810μs 2.8923μs 345.7470 KOps/s 339.4108 KOps/s $\color{#35bf28}+1.87\%$
test_items_nested 0.4038ms 0.3557ms 2.8110 KOps/s 2.7626 KOps/s $\color{#35bf28}+1.75\%$
test_items_nested_locked 0.4208ms 0.3577ms 2.7953 KOps/s 2.7773 KOps/s $\color{#35bf28}+0.65\%$
test_items_nested_leaf 93.0810μs 58.0691μs 17.2209 KOps/s 17.1529 KOps/s $\color{#35bf28}+0.40\%$
test_items_stack_nested 0.3942ms 0.3565ms 2.8050 KOps/s 2.7723 KOps/s $\color{#35bf28}+1.18\%$
test_items_stack_nested_leaf 84.4510μs 60.1466μs 16.6260 KOps/s 17.0248 KOps/s $\color{#d91a1a}-2.34\%$
test_items_stack_nested_locked 0.4707ms 0.3563ms 2.8064 KOps/s 2.7752 KOps/s $\color{#35bf28}+1.12\%$
test_keys 31.3000μs 3.4319μs 291.3819 KOps/s 288.1301 KOps/s $\color{#35bf28}+1.13\%$
test_keys_nested 0.1082ms 82.4754μs 12.1248 KOps/s 12.2039 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_nested_locked 0.7657ms 88.2032μs 11.3375 KOps/s 11.4096 KOps/s $\color{#d91a1a}-0.63\%$
test_keys_nested_leaf 2.5998ms 74.3649μs 13.4472 KOps/s 13.8206 KOps/s $\color{#d91a1a}-2.70\%$
test_keys_stack_nested 0.1418ms 84.8440μs 11.7863 KOps/s 12.1211 KOps/s $\color{#d91a1a}-2.76\%$
test_keys_stack_nested_leaf 0.1130ms 75.6827μs 13.2131 KOps/s 13.7155 KOps/s $\color{#d91a1a}-3.66\%$
test_keys_stack_nested_locked 0.5436ms 89.7696μs 11.1396 KOps/s 11.5460 KOps/s $\color{#d91a1a}-3.52\%$
test_values 11.9352μs 0.8553μs 1.1691 MOps/s 1.1629 MOps/s $\color{#35bf28}+0.54\%$
test_values_nested 61.2700μs 34.9226μs 28.6348 KOps/s 29.1208 KOps/s $\color{#d91a1a}-1.67\%$
test_values_nested_locked 68.1510μs 36.7336μs 27.2230 KOps/s 27.9047 KOps/s $\color{#d91a1a}-2.44\%$
test_values_nested_leaf 69.9510μs 39.8347μs 25.1037 KOps/s 25.7784 KOps/s $\color{#d91a1a}-2.62\%$
test_values_stack_nested 69.2200μs 35.2317μs 28.3835 KOps/s 28.7541 KOps/s $\color{#d91a1a}-1.29\%$
test_values_stack_nested_leaf 72.4610μs 40.0579μs 24.9639 KOps/s 25.7002 KOps/s $\color{#d91a1a}-2.87\%$
test_values_stack_nested_locked 64.3210μs 37.3817μs 26.7511 KOps/s 27.4776 KOps/s $\color{#d91a1a}-2.64\%$
test_membership 1.9545μs 0.5119μs 1.9535 MOps/s 1.9644 MOps/s $\color{#d91a1a}-0.55\%$
test_membership_nested 20.5200μs 2.0240μs 494.0764 KOps/s 505.2648 KOps/s $\color{#d91a1a}-2.21\%$
test_membership_nested_leaf 17.5300μs 1.9932μs 501.7107 KOps/s 497.7890 KOps/s $\color{#35bf28}+0.79\%$
test_membership_stacked_nested 38.8200μs 2.0578μs 485.9600 KOps/s 482.6920 KOps/s $\color{#35bf28}+0.68\%$
test_membership_stacked_nested_leaf 22.5600μs 2.0438μs 489.2810 KOps/s 489.1490 KOps/s $\color{#35bf28}+0.03\%$
test_membership_nested_last 24.8700μs 3.0927μs 323.3447 KOps/s 326.5184 KOps/s $\color{#d91a1a}-0.97\%$
test_membership_nested_leaf_last 26.0600μs 3.0735μs 325.3637 KOps/s 324.7407 KOps/s $\color{#35bf28}+0.19\%$
test_membership_stacked_nested_last 20.7300μs 3.1379μs 318.6830 KOps/s 121.8870 KOps/s $\textbf{\color{#35bf28}+161.46\%}$
test_membership_stacked_nested_leaf_last 32.7200μs 3.1052μs 322.0454 KOps/s 122.6076 KOps/s $\textbf{\color{#35bf28}+162.66\%}$
test_nested_getleaf 28.2600μs 6.2487μs 160.0340 KOps/s 163.2573 KOps/s $\color{#d91a1a}-1.97\%$
test_nested_get 43.8110μs 5.8754μs 170.2014 KOps/s 174.9593 KOps/s $\color{#d91a1a}-2.72\%$
test_stacked_getleaf 32.2500μs 6.1961μs 161.3929 KOps/s 163.4724 KOps/s $\color{#d91a1a}-1.27\%$
test_stacked_get 27.4200μs 5.8458μs 171.0618 KOps/s 172.3630 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_getitemleaf 38.9600μs 6.3244μs 158.1174 KOps/s 162.1917 KOps/s $\color{#d91a1a}-2.51\%$
test_nested_getitem 33.3200μs 6.0331μs 165.7526 KOps/s 170.7811 KOps/s $\color{#d91a1a}-2.94\%$
test_stacked_getitemleaf 34.7700μs 6.2547μs 159.8797 KOps/s 161.1184 KOps/s $\color{#d91a1a}-0.77\%$
test_stacked_getitem 34.5110μs 5.9036μs 169.3875 KOps/s 169.1588 KOps/s $\color{#35bf28}+0.14\%$
test_lock_nested 2.4426ms 0.3744ms 2.6708 KOps/s 2.6265 KOps/s $\color{#35bf28}+1.69\%$
test_lock_stack_nested 0.3865ms 0.3460ms 2.8904 KOps/s 2.8763 KOps/s $\color{#35bf28}+0.49\%$
test_unlock_nested 0.6434ms 0.3144ms 3.1807 KOps/s 3.0935 KOps/s $\color{#35bf28}+2.82\%$
test_unlock_stack_nested 0.3164ms 0.2840ms 3.5210 KOps/s 3.5029 KOps/s $\color{#35bf28}+0.52\%$
test_flatten_speed 0.1193ms 74.5185μs 13.4195 KOps/s 13.3378 KOps/s $\color{#35bf28}+0.61\%$
test_unflatten_speed 0.3752ms 0.3242ms 3.0845 KOps/s 3.1624 KOps/s $\color{#d91a1a}-2.46\%$
test_common_ops 1.5498ms 0.5747ms 1.7399 KOps/s 1.5066 KOps/s $\textbf{\color{#35bf28}+15.49\%}$
test_creation 93.4310μs 1.7777μs 562.5091 KOps/s 577.8049 KOps/s $\color{#d91a1a}-2.65\%$
test_creation_empty 30.0500μs 6.5056μs 153.7142 KOps/s 96.6963 KOps/s $\textbf{\color{#35bf28}+58.97\%}$
test_creation_nested_1 1.6219ms 8.0845μs 123.6931 KOps/s 83.3374 KOps/s $\textbf{\color{#35bf28}+48.42\%}$
test_creation_nested_2 37.5200μs 10.7792μs 92.7716 KOps/s 67.6008 KOps/s $\textbf{\color{#35bf28}+37.23\%}$
test_clone 49.3410μs 10.8569μs 92.1075 KOps/s 87.0055 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_getitem[int] 1.2210ms 10.5905μs 94.4241 KOps/s 90.4984 KOps/s $\color{#35bf28}+4.34\%$
test_getitem[slice_int] 0.1393ms 20.2009μs 49.5028 KOps/s 46.2026 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_getitem[range] 0.1248ms 37.5313μs 26.6444 KOps/s 25.4080 KOps/s $\color{#35bf28}+4.87\%$
test_getitem[tuple] 0.1059ms 18.3246μs 54.5716 KOps/s 52.9074 KOps/s $\color{#35bf28}+3.15\%$
test_getitem[list] 0.1271ms 33.6948μs 29.6782 KOps/s 28.2754 KOps/s $\color{#35bf28}+4.96\%$
test_setitem_dim[int] 47.2800μs 18.9065μs 52.8919 KOps/s 49.5011 KOps/s $\textbf{\color{#35bf28}+6.85\%}$
test_setitem_dim[slice_int] 75.0710μs 38.8742μs 25.7240 KOps/s 24.9798 KOps/s $\color{#35bf28}+2.98\%$
test_setitem_dim[range] 94.1010μs 53.5166μs 18.6858 KOps/s 17.9599 KOps/s $\color{#35bf28}+4.04\%$
test_setitem_dim[tuple] 54.4010μs 32.5005μs 30.7688 KOps/s 29.5378 KOps/s $\color{#35bf28}+4.17\%$
test_setitem 0.1240ms 14.4368μs 69.2676 KOps/s 58.8040 KOps/s $\textbf{\color{#35bf28}+17.79\%}$
test_set 0.1132ms 13.8783μs 72.0549 KOps/s 60.3603 KOps/s $\textbf{\color{#35bf28}+19.37\%}$
test_set_shared 1.9054ms 0.1548ms 6.4612 KOps/s 6.3546 KOps/s $\color{#35bf28}+1.68\%$
test_update 0.3465ms 15.8113μs 63.2459 KOps/s 48.8949 KOps/s $\textbf{\color{#35bf28}+29.35\%}$
test_update_nested 0.1222ms 20.8555μs 47.9491 KOps/s 38.6339 KOps/s $\textbf{\color{#35bf28}+24.11\%}$
test_update__nested 0.7770ms 26.1346μs 38.2635 KOps/s 38.7751 KOps/s $\color{#d91a1a}-1.32\%$
test_set_nested 35.6000μs 15.1060μs 66.1987 KOps/s 56.4731 KOps/s $\textbf{\color{#35bf28}+17.22\%}$
test_set_nested_new 0.1284ms 17.3557μs 57.6181 KOps/s 49.3256 KOps/s $\textbf{\color{#35bf28}+16.81\%}$
test_select 66.2010μs 30.3694μs 32.9278 KOps/s 31.5370 KOps/s $\color{#35bf28}+4.41\%$
test_select_nested 83.1610μs 43.2232μs 23.1357 KOps/s 22.8815 KOps/s $\color{#35bf28}+1.11\%$
test_exclude_nested 89.8810μs 63.0404μs 15.8628 KOps/s 15.8868 KOps/s $\color{#d91a1a}-0.15\%$
test_empty[True] 0.3589ms 0.2903ms 3.4452 KOps/s 3.4728 KOps/s $\color{#d91a1a}-0.79\%$
test_empty[False] 2.9371μs 0.8244μs 1.2130 MOps/s 1.2208 MOps/s $\color{#d91a1a}-0.64\%$
test_to 87.3810μs 56.9377μs 17.5631 KOps/s 16.9568 KOps/s $\color{#35bf28}+3.58\%$
test_to_nonblocking 85.4710μs 48.5608μs 20.5927 KOps/s 18.9389 KOps/s $\textbf{\color{#35bf28}+8.73\%}$
test_unbind_speed 0.2657ms 0.2368ms 4.2230 KOps/s 4.0137 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_unbind_speed_stack0 0.3079ms 0.2383ms 4.1971 KOps/s 4.1340 KOps/s $\color{#35bf28}+1.53\%$
test_unbind_speed_stack1 93.5319ms 0.6707ms 1.4910 KOps/s 1.4864 KOps/s $\color{#35bf28}+0.31\%$
test_split 93.3797ms 1.5800ms 632.8983 Ops/s 613.6553 Ops/s $\color{#35bf28}+3.14\%$
test_chunk 95.7540ms 1.5894ms 629.1717 Ops/s 606.6456 Ops/s $\color{#35bf28}+3.71\%$
test_consolidate[False-None] 96.6021ms 2.9189ms 342.5956 Ops/s 334.9664 Ops/s $\color{#35bf28}+2.28\%$
test_consolidate[default-None] 1.7719ms 1.7056ms 586.3149 Ops/s 580.0271 Ops/s $\color{#35bf28}+1.08\%$
test_consolidate[reduce-overhead-None] 1.8357ms 1.7412ms 574.3030 Ops/s 563.1216 Ops/s $\color{#35bf28}+1.99\%$
test_consolidate_njt[False-None] 6.7064ms 6.6069ms 151.3561 Ops/s 111.5554 Ops/s $\textbf{\color{#35bf28}+35.68\%}$
test_to[False-False-None] 1.8859ms 1.8059ms 553.7429 Ops/s 551.2749 Ops/s $\color{#35bf28}+0.45\%$
test_to[True-False-None] 1.5901ms 1.3481ms 741.7658 Ops/s 710.7992 Ops/s $\color{#35bf28}+4.36\%$
test_to[within-False-None] 0.2953s 5.3516ms 186.8599 Ops/s 232.4479 Ops/s $\textbf{\color{#d91a1a}-19.61\%}$
test_to[True-default-None] 5.5471ms 5.1986ms 192.3579 Ops/s 182.0097 Ops/s $\textbf{\color{#35bf28}+5.69\%}$
test_to_njt[False-False-None] 7.1218ms 6.9936ms 142.9883 Ops/s 138.9756 Ops/s $\color{#35bf28}+2.89\%$
test_to_njt[True-False-None] 5.8725ms 5.5501ms 180.1757 Ops/s 177.5313 Ops/s $\color{#35bf28}+1.49\%$
test_to_njt[within-False-None] 12.4188ms 12.2335ms 81.7424 Ops/s 80.1717 Ops/s $\color{#35bf28}+1.96\%$
test_creation[device0] 0.5232ms 81.7207μs 12.2368 KOps/s 11.9559 KOps/s $\color{#35bf28}+2.35\%$
test_creation_from_tensor 0.4404ms 85.4019μs 11.7093 KOps/s 11.2744 KOps/s $\color{#35bf28}+3.86\%$
test_add_one[memmap_tensor0] 0.4873ms 6.8549μs 145.8813 KOps/s 134.9652 KOps/s $\textbf{\color{#35bf28}+8.09\%}$
test_contiguous[memmap_tensor0] 2.1985μs 0.4285μs 2.3335 MOps/s 2.3119 MOps/s $\color{#35bf28}+0.93\%$
test_stack[memmap_tensor0] 42.6310μs 4.3582μs 229.4508 KOps/s 208.7174 KOps/s $\textbf{\color{#35bf28}+9.93\%}$
test_memmaptd_index 2.0679ms 0.2488ms 4.0193 KOps/s 3.7105 KOps/s $\textbf{\color{#35bf28}+8.32\%}$
test_memmaptd_index_astensor 0.9868ms 0.3100ms 3.2254 KOps/s 3.0104 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_memmaptd_index_op 0.9565ms 0.5604ms 1.7845 KOps/s 1.5146 KOps/s $\textbf{\color{#35bf28}+17.82\%}$
test_serialize_model 0.1325s 0.1311s 7.6260 Ops/s 7.6341 Ops/s $\color{#d91a1a}-0.11\%$
test_serialize_model_pickle 1.3495s 1.2137s 0.8239 Ops/s 0.8429 Ops/s $\color{#d91a1a}-2.25\%$
test_serialize_weights 0.2808s 0.1524s 6.5609 Ops/s 7.6969 Ops/s $\textbf{\color{#d91a1a}-14.76\%}$
test_serialize_weights_returnearly 0.3389s 53.6858ms 18.6269 Ops/s 14.7953 Ops/s $\textbf{\color{#35bf28}+25.90\%}$
test_serialize_weights_pickle 1.3752s 1.2153s 0.8229 Ops/s 0.8225 Ops/s $\color{#35bf28}+0.04\%$
test_reshape_pytree 0.1219ms 21.8178μs 45.8342 KOps/s 43.6144 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_reshape_td 64.5910μs 26.0131μs 38.4422 KOps/s 36.7051 KOps/s $\color{#35bf28}+4.73\%$
test_view_pytree 0.1122ms 21.9516μs 45.5548 KOps/s 44.6841 KOps/s $\color{#35bf28}+1.95\%$
test_view_td 0.1163ms 28.8557μs 34.6552 KOps/s 31.7181 KOps/s $\textbf{\color{#35bf28}+9.26\%}$
test_unbind_pytree 0.4437ms 27.6531μs 36.1624 KOps/s 35.2456 KOps/s $\color{#35bf28}+2.60\%$
test_unbind_td 0.6386ms 35.5544μs 28.1259 KOps/s 26.5844 KOps/s $\textbf{\color{#35bf28}+5.80\%}$
test_split_pytree 0.4420ms 30.0063μs 33.3264 KOps/s 33.3365 KOps/s $\color{#d91a1a}-0.03\%$
test_split_td 0.7421ms 37.8426μs 26.4253 KOps/s 25.5673 KOps/s $\color{#35bf28}+3.36\%$
test_add_pytree 0.4388ms 34.7528μs 28.7747 KOps/s 27.5847 KOps/s $\color{#35bf28}+4.31\%$
test_add_td 84.1710μs 45.4959μs 21.9800 KOps/s 19.0663 KOps/s $\textbf{\color{#35bf28}+15.28\%}$
test_compile_add_one_nested[tensordict-compile] 0.1723ms 0.1206ms 8.2938 KOps/s 7.9156 KOps/s $\color{#35bf28}+4.78\%$
test_compile_add_one_nested[tensordict-eager] 0.2258ms 0.1309ms 7.6419 KOps/s 7.5984 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_nested[pytree-compile] 0.1386ms 97.2854μs 10.2790 KOps/s 9.8688 KOps/s $\color{#35bf28}+4.16\%$
test_compile_add_one_nested[pytree-eager] 1.1223ms 0.1537ms 6.5052 KOps/s 6.3234 KOps/s $\color{#35bf28}+2.88\%$
test_compile_copy_nested[tensordict-compile] 57.6210μs 23.0207μs 43.4392 KOps/s 48.0920 KOps/s $\textbf{\color{#d91a1a}-9.67\%}$
test_compile_copy_nested[tensordict-eager] 51.2700μs 30.1830μs 33.1312 KOps/s 33.9257 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_copy_nested[pytree-compile] 0.1024ms 65.3298μs 15.3069 KOps/s 15.0453 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_nested[pytree-eager] 0.1042ms 49.4037μs 20.2414 KOps/s 20.1536 KOps/s $\color{#35bf28}+0.44\%$
test_compile_add_one_flat[tensordict-compile] 0.1819ms 0.1426ms 7.0126 KOps/s 7.0200 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_one_flat[tensordict-eager] 0.3037ms 0.2152ms 4.6468 KOps/s 4.6835 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_add_one_flat[tensorclass-compile] 0.1356ms 97.9850μs 10.2056 KOps/s 10.2050 KOps/s $+0.01\%$
test_compile_add_one_flat[tensorclass-eager] 0.1239ms 54.1739μs 18.4591 KOps/s 18.5488 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_add_one_flat[pytree-compile] 0.1777ms 0.1365ms 7.3245 KOps/s 7.3294 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_add_one_flat[pytree-eager] 0.5395ms 0.4995ms 2.0019 KOps/s 1.9241 KOps/s $\color{#35bf28}+4.04\%$
test_compile_add_self_flat[tensordict-eager] 0.3619ms 0.2594ms 3.8554 KOps/s 3.9016 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_add_self_flat[tensordict-compile] 0.2474ms 0.1440ms 6.9428 KOps/s 7.0489 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_add_self_flat[tensorclass-eager] 0.1436ms 65.6194μs 15.2394 KOps/s 14.9583 KOps/s $\color{#35bf28}+1.88\%$
test_compile_add_self_flat[tensorclass-compile] 0.1437ms 99.3156μs 10.0689 KOps/s 9.9036 KOps/s $\color{#35bf28}+1.67\%$
test_compile_add_self_flat[pytree-eager] 0.4685ms 0.4239ms 2.3591 KOps/s 2.3235 KOps/s $\color{#35bf28}+1.53\%$
test_compile_add_self_flat[pytree-compile] 0.1810ms 0.1360ms 7.3536 KOps/s 7.1820 KOps/s $\color{#35bf28}+2.39\%$
test_compile_copy_flat[tensordict-compile] 55.1510μs 18.9072μs 52.8899 KOps/s 57.9880 KOps/s $\textbf{\color{#d91a1a}-8.79\%}$
test_compile_copy_flat[tensordict-eager] 0.1326ms 31.5912μs 31.6544 KOps/s 32.2905 KOps/s $\color{#d91a1a}-1.97\%$
test_compile_copy_flat[pytree-compile] 0.1135ms 71.2771μs 14.0298 KOps/s 13.4762 KOps/s $\color{#35bf28}+4.11\%$
test_compile_copy_flat[pytree-eager] 83.5210μs 51.9374μs 19.2539 KOps/s 18.5292 KOps/s $\color{#35bf28}+3.91\%$
test_compile_assign_and_add[tensordict-compile] 1.6498ms 0.3964ms 2.5224 KOps/s 2.1547 KOps/s $\textbf{\color{#35bf28}+17.06\%}$
test_compile_assign_and_add[tensordict-eager] 3.1896ms 2.6615ms 375.7277 Ops/s 363.4511 Ops/s $\color{#35bf28}+3.38\%$
test_compile_assign_and_add[pytree-compile] 1.6046ms 0.3826ms 2.6135 KOps/s 2.2367 KOps/s $\textbf{\color{#35bf28}+16.84\%}$
test_compile_assign_and_add[pytree-eager] 3.0125ms 2.8286ms 353.5370 Ops/s 351.3375 Ops/s $\color{#35bf28}+0.63\%$
test_compile_indexing[tensor-tensordict-compile] 0.6055ms 0.1174ms 8.5153 KOps/s 8.3757 KOps/s $\color{#35bf28}+1.67\%$
test_compile_indexing[tensor-tensordict-eager] 0.5653ms 86.4384μs 11.5689 KOps/s 12.1567 KOps/s $\color{#d91a1a}-4.84\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5417ms 0.1114ms 8.9768 KOps/s 8.9203 KOps/s $\color{#35bf28}+0.63\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1202ms 74.1720μs 13.4822 KOps/s 13.7022 KOps/s $\color{#d91a1a}-1.61\%$
test_compile_indexing[tensor-pytree-compile] 0.2091ms 0.1109ms 9.0168 KOps/s 8.8771 KOps/s $\color{#35bf28}+1.57\%$
test_compile_indexing[tensor-pytree-eager] 0.1138ms 74.8437μs 13.3612 KOps/s 13.8049 KOps/s $\color{#d91a1a}-3.21\%$
test_compile_indexing[slice-tensordict-compile] 0.1529ms 0.1027ms 9.7328 KOps/s 9.5405 KOps/s $\color{#35bf28}+2.02\%$
test_compile_indexing[slice-tensordict-eager] 0.1417ms 17.3951μs 57.4873 KOps/s 55.4503 KOps/s $\color{#35bf28}+3.67\%$
test_compile_indexing[slice-tensorclass-compile] 0.1766ms 0.1011ms 9.8925 KOps/s 10.3635 KOps/s $\color{#d91a1a}-4.54\%$
test_compile_indexing[slice-tensorclass-eager] 59.4000μs 15.8555μs 63.0696 KOps/s 59.9364 KOps/s $\textbf{\color{#35bf28}+5.23\%}$
test_compile_indexing[slice-pytree-compile] 0.1460ms 0.1018ms 9.8258 KOps/s 10.0912 KOps/s $\color{#d91a1a}-2.63\%$
test_compile_indexing[slice-pytree-eager] 51.7110μs 15.7281μs 63.5805 KOps/s 60.0246 KOps/s $\textbf{\color{#35bf28}+5.92\%}$
test_compile_indexing[int-tensordict-compile] 0.2051ms 0.1045ms 9.5712 KOps/s 9.4284 KOps/s $\color{#35bf28}+1.52\%$
test_compile_indexing[int-tensordict-eager] 0.5633ms 18.0110μs 55.5217 KOps/s 55.3820 KOps/s $\color{#35bf28}+0.25\%$
test_compile_indexing[int-tensorclass-compile] 0.1581ms 0.1028ms 9.7281 KOps/s 9.9338 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_indexing[int-tensorclass-eager] 0.1896ms 16.3276μs 61.2460 KOps/s 59.9578 KOps/s $\color{#35bf28}+2.15\%$
test_compile_indexing[int-pytree-compile] 0.1918ms 0.1005ms 9.9466 KOps/s 9.9787 KOps/s $\color{#d91a1a}-0.32\%$
test_compile_indexing[int-pytree-eager] 56.4110μs 16.1640μs 61.8660 KOps/s 59.9465 KOps/s $\color{#35bf28}+3.20\%$
test_mod_add[eager] 80.4010μs 37.2490μs 26.8464 KOps/s 24.1423 KOps/s $\textbf{\color{#35bf28}+11.20\%}$
test_mod_add[compile] 0.4622ms 83.5305μs 11.9717 KOps/s 12.2972 KOps/s $\color{#d91a1a}-2.65\%$
test_mod_add[compile-overhead] 0.3249ms 0.1672ms 5.9793 KOps/s 5.5945 KOps/s $\textbf{\color{#35bf28}+6.88\%}$
test_mod_wrap[eager] 0.3363ms 0.2519ms 3.9694 KOps/s 3.7490 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_mod_wrap[compile] 0.3752ms 0.2859ms 3.4977 KOps/s 3.3339 KOps/s $\color{#35bf28}+4.91\%$
test_mod_wrap[compile-overhead] 7.0833ms 3.7547ms 266.3333 Ops/s 275.3940 Ops/s $\color{#d91a1a}-3.29\%$
test_mod_wrap_and_backward[eager] 1.5621ms 1.4827ms 674.4364 Ops/s 669.9767 Ops/s $\color{#35bf28}+0.67\%$
test_mod_wrap_and_backward[compile] 1.4751ms 1.3771ms 726.1586 Ops/s 711.0268 Ops/s $\color{#35bf28}+2.13\%$
test_mod_wrap_and_backward[compile-overhead] 1.5444ms 1.0450ms 956.9312 Ops/s 929.7506 Ops/s $\color{#35bf28}+2.92\%$
test_seq_add[eager] 0.2040ms 0.1120ms 8.9272 KOps/s 8.3379 KOps/s $\textbf{\color{#35bf28}+7.07\%}$
test_seq_add[compile] 0.1312ms 87.9928μs 11.3646 KOps/s 11.1677 KOps/s $\color{#35bf28}+1.76\%$
test_seq_add[compile-overhead] 0.1735ms 0.1299ms 7.6965 KOps/s 7.4135 KOps/s $\color{#35bf28}+3.82\%$
test_seq_wrap[eager] 0.4929ms 0.4271ms 2.3413 KOps/s 2.1906 KOps/s $\textbf{\color{#35bf28}+6.88\%}$
test_seq_wrap[compile] 0.3544ms 0.2973ms 3.3635 KOps/s 3.1730 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_seq_wrap[compile-overhead] 0.2839ms 0.2258ms 4.4296 KOps/s 4.3585 KOps/s $\color{#35bf28}+1.63\%$
test_func_call_runtime[False-eager] 0.8656ms 0.7942ms 1.2592 KOps/s 1.2673 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[False-compile] 0.8166ms 0.7428ms 1.3463 KOps/s 1.2708 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_func_call_runtime[False-compile-overhead] 0.4178ms 0.3670ms 2.7246 KOps/s 2.6363 KOps/s $\color{#35bf28}+3.35\%$
test_func_call_runtime[True-eager] 0.9977ms 0.9190ms 1.0882 KOps/s 1.0404 KOps/s $\color{#35bf28}+4.59\%$
test_func_call_runtime[True-compile] 0.8270ms 0.7671ms 1.3037 KOps/s 1.2424 KOps/s $\color{#35bf28}+4.93\%$
test_func_call_runtime[True-compile-overhead] 0.4478ms 0.3871ms 2.5833 KOps/s 2.5614 KOps/s $\color{#35bf28}+0.85\%$
test_func_call_cm_runtime[False-eager] 0.8275ms 0.7434ms 1.3452 KOps/s 1.2283 KOps/s $\textbf{\color{#35bf28}+9.51\%}$
test_func_call_cm_runtime[False-compile] 0.8321ms 0.7572ms 1.3206 KOps/s 1.2900 KOps/s $\color{#35bf28}+2.37\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4375ms 0.3709ms 2.6958 KOps/s 2.6795 KOps/s $\color{#35bf28}+0.61\%$
test_func_call_cm_runtime[True-eager] 1.1275ms 1.0275ms 973.2252 Ops/s 958.6806 Ops/s $\color{#35bf28}+1.52\%$
test_func_call_cm_runtime[True-compile] 0.9649ms 0.7932ms 1.2607 KOps/s 1.2152 KOps/s $\color{#35bf28}+3.75\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4628ms 0.4151ms 2.4092 KOps/s 2.3772 KOps/s $\color{#35bf28}+1.35\%$
test_vmap_func_call_cm_runtime[eager] 2.6723ms 2.1192ms 471.8802 Ops/s 468.5931 Ops/s $\color{#35bf28}+0.70\%$
test_vmap_func_call_cm_runtime[compile] 0.8911ms 0.8090ms 1.2361 KOps/s 1.1819 KOps/s $\color{#35bf28}+4.58\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4757ms 0.4170ms 2.3981 KOps/s 2.3623 KOps/s $\color{#35bf28}+1.51\%$
test_distributed 2.0049ms 0.2785ms 3.5908 KOps/s 8.5094 KOps/s $\textbf{\color{#d91a1a}-57.80\%}$
test_tdmodule 52.8800μs 19.2489μs 51.9510 KOps/s 44.8333 KOps/s $\textbf{\color{#35bf28}+15.88\%}$
test_tdmodule_dispatch 62.8510μs 33.4948μs 29.8554 KOps/s 26.0740 KOps/s $\textbf{\color{#35bf28}+14.50\%}$
test_tdseq 45.2500μs 19.5065μs 51.2649 KOps/s 45.1645 KOps/s $\textbf{\color{#35bf28}+13.51\%}$
test_tdseq_dispatch 64.8310μs 35.7324μs 27.9858 KOps/s 24.0186 KOps/s $\textbf{\color{#35bf28}+16.52\%}$
test_instantiation_functorch 1.7137ms 1.5887ms 629.4357 Ops/s 627.3630 Ops/s $\color{#35bf28}+0.33\%$
test_exec_functorch 0.2171ms 0.1471ms 6.7983 KOps/s 6.5941 KOps/s $\color{#35bf28}+3.10\%$
test_exec_functional_call 0.2045ms 0.1438ms 6.9535 KOps/s 6.9005 KOps/s $\color{#35bf28}+0.77\%$
test_exec_td_decorator 0.3799ms 0.1932ms 5.1759 KOps/s 5.1091 KOps/s $\color{#35bf28}+1.31\%$
test_vmap_mlp_speed_decorator[True-True] 0.8591ms 0.7015ms 1.4255 KOps/s 1.4188 KOps/s $\color{#35bf28}+0.47\%$
test_vmap_mlp_speed_decorator[True-False] 0.8381ms 0.6964ms 1.4359 KOps/s 1.4196 KOps/s $\color{#35bf28}+1.14\%$
test_vmap_mlp_speed_decorator[False-True] 0.7284ms 0.6066ms 1.6486 KOps/s 1.5888 KOps/s $\color{#35bf28}+3.76\%$
test_vmap_mlp_speed_decorator[False-False] 0.7471ms 0.6165ms 1.6219 KOps/s 1.5672 KOps/s $\color{#35bf28}+3.49\%$
test_vmap_transformer_speed_decorator[True-True] 20.6705ms 19.6115ms 50.9905 Ops/s 49.4686 Ops/s $\color{#35bf28}+3.08\%$
test_vmap_transformer_speed_decorator[True-False] 20.3658ms 20.0226ms 49.9434 Ops/s 50.8279 Ops/s $\color{#d91a1a}-1.74\%$
test_vmap_transformer_speed_decorator[False-True] 19.4800ms 19.3785ms 51.6036 Ops/s 50.7653 Ops/s $\color{#35bf28}+1.65\%$
test_vmap_transformer_speed_decorator[False-False] 20.0117ms 19.3992ms 51.5486 Ops/s 51.0217 Ops/s $\color{#35bf28}+1.03\%$
test_to_module_speed[True] 1.1536ms 0.9804ms 1.0200 KOps/s 1.0217 KOps/s $\color{#d91a1a}-0.16\%$
test_to_module_speed[False] 1.5215ms 0.9492ms 1.0536 KOps/s 1.0595 KOps/s $\color{#d91a1a}-0.56\%$
test_tc_init 0.1303ms 33.8136μs 29.5739 KOps/s 26.2003 KOps/s $\textbf{\color{#35bf28}+12.88\%}$
test_tc_init_nested 0.1132ms 68.6564μs 14.5653 KOps/s 13.1768 KOps/s $\textbf{\color{#35bf28}+10.54\%}$
test_tc_first_layer_tensor 22.7500μs 0.8167μs 1.2244 MOps/s 1.2185 MOps/s $\color{#35bf28}+0.48\%$
test_tc_first_layer_nontensor 25.5700μs 2.2716μs 440.2111 KOps/s 436.9428 KOps/s $\color{#35bf28}+0.75\%$
test_tc_second_layer_tensor 14.8500μs 1.4459μs 691.6051 KOps/s 696.3324 KOps/s $\color{#d91a1a}-0.68\%$
test_tc_second_layer_nontensor 34.2300μs 3.0144μs 331.7395 KOps/s 327.7644 KOps/s $\color{#35bf28}+1.21\%$
test_unbind 0.2367s 10.2088ms 97.9544 Ops/s 140.6716 Ops/s $\textbf{\color{#d91a1a}-30.37\%}$
test_full_like 10.1784ms 9.2097ms 108.5807 Ops/s 107.4314 Ops/s $\color{#35bf28}+1.07\%$
test_zeros_like 9.3825ms 7.2479ms 137.9706 Ops/s 229.5144 Ops/s $\textbf{\color{#d91a1a}-39.89\%}$
test_ones_like 4.9440ms 4.3224ms 231.3535 Ops/s 139.0868 Ops/s $\textbf{\color{#35bf28}+66.34\%}$
test_clone 7.2756ms 6.4374ms 155.3412 Ops/s 108.5870 Ops/s $\textbf{\color{#35bf28}+43.06\%}$
test_squeeze 95.1110μs 9.8681μs 101.3365 KOps/s 106.5290 KOps/s $\color{#d91a1a}-4.87\%$
test_unsqueeze 0.1592ms 70.8538μs 14.1136 KOps/s 13.8872 KOps/s $\color{#35bf28}+1.63\%$
test_split 0.3715ms 0.1601ms 6.2451 KOps/s 6.1799 KOps/s $\color{#35bf28}+1.05\%$
test_permute 0.3206ms 0.1764ms 5.6700 KOps/s 5.5727 KOps/s $\color{#35bf28}+1.75\%$
test_stack 51.0211ms 50.5659ms 19.7762 Ops/s 19.7059 Ops/s $\color{#35bf28}+0.36\%$
test_cat 51.0839ms 50.4134ms 19.8360 Ops/s 19.7104 Ops/s $\color{#35bf28}+0.64\%$
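
The Change column appears to be the relative difference between the "Ops" column (this PR) and the "Ops on Repo HEAD" column. The sketch below assumes that formula; it is inferred from the numbers in the table, not taken from the benchmark bot's source. The helper name `relative_change` is hypothetical.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD.

    Assumed formula (inferred from the table, not from the bot's code):
    (ops_pr - ops_head) / ops_head * 100
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# (name, Ops for this PR, Ops on repo HEAD); both values share the row's unit
rows = [
    ("test_setitem", 69.2676, 58.8040),
    ("test_update", 63.2459, 48.8949),
    ("test_distributed", 3.5908, 8.5094),
]

for name, ops_pr, ops_head in rows:
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

For these three rows the computed values agree with the reported +17.79%, +29.35% and -57.80%. The bold entries in the table seem to mark changes whose magnitude exceeds roughly 5%, though that threshold is also an inference from the data.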

@vmoens vmoens added the enhancement New feature or request label Jan 7, 2025
@vmoens vmoens merged commit 95205d9 into gh/vmoens/45/base Jan 7, 2025
49 of 55 checks passed
@vmoens vmoens deleted the gh/vmoens/45/head branch January 7, 2025 17:54