[Test] Ignore annoying jit_utils warning with device context manager #1151

Merged: 1 commit, Dec 19, 2024

@vmoens vmoens commented Dec 19, 2024

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 19, 2024
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 23fffb80e79bb839b34178cf5e20faea7a8115c5
Pull Request resolved: #1151

vmoens commented Dec 19, 2024

Related to pytorch/pytorch#140702.
This change should be reverted once that issue is resolved.
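
For readers unfamiliar with how a single noisy warning can be silenced temporarily, here is a minimal sketch using only the standard-library `warnings` module. It is not the code from this PR: the message pattern, `noisy_call`, and `call_quietly` are hypothetical stand-ins, since the exact warning text is not quoted in this thread. Scoping the filter inside `catch_warnings()` keeps the workaround local, which makes it easy to revert later.

```python
import warnings

# Hypothetical message pattern; the real warning's wording is not given here.
NOISY_PATTERN = r".*hypothetical jit_utils device-context warning.*"


def noisy_call() -> int:
    """Stand-in for library code that emits the unwanted warning."""
    warnings.warn(
        "hypothetical jit_utils device-context warning: details", UserWarning
    )
    return 42


def call_quietly() -> int:
    """Run noisy_call() with only the matching warning suppressed."""
    # catch_warnings() restores the previous filter list on exit,
    # so the "ignore" rule does not leak into other tests.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message=NOISY_PATTERN, category=UserWarning
        )
        return noisy_call()


if __name__ == "__main__":
    print(call_quietly())
```

Matching on both `message` and `category` keeps the filter narrow, so unrelated warnings still surface.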

@vmoens vmoens added the Test label Dec 19, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}29$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.5970μs 21.7470μs 45.9833 KOps/s 49.3358 KOps/s $\textbf{\color{#d91a1a}-6.80\%}$
test_plain_set_stack_nested 4.6409ms 21.7447μs 45.9883 KOps/s 47.9692 KOps/s $\color{#d91a1a}-4.13\%$
test_plain_set_nested_inplace 68.2130μs 22.9212μs 43.6277 KOps/s 44.4097 KOps/s $\color{#d91a1a}-1.76\%$
test_plain_set_stack_nested_inplace 57.3070μs 22.9437μs 43.5849 KOps/s 44.8790 KOps/s $\color{#d91a1a}-2.88\%$
test_items 22.9430μs 4.0687μs 245.7769 KOps/s 240.7262 KOps/s $\color{#35bf28}+2.10\%$
test_items_nested 0.5260ms 0.4045ms 2.4719 KOps/s 2.4333 KOps/s $\color{#35bf28}+1.59\%$
test_items_nested_locked 0.7373ms 0.4056ms 2.4656 KOps/s 2.4330 KOps/s $\color{#35bf28}+1.34\%$
test_items_nested_leaf 0.1457ms 75.6783μs 13.2138 KOps/s 12.9296 KOps/s $\color{#35bf28}+2.20\%$
test_items_stack_nested 0.5232ms 0.4049ms 2.4695 KOps/s 2.4348 KOps/s $\color{#35bf28}+1.43\%$
test_items_stack_nested_leaf 0.1570ms 77.4890μs 12.9051 KOps/s 12.3234 KOps/s $\color{#35bf28}+4.72\%$
test_items_stack_nested_locked 0.7439ms 0.4112ms 2.4320 KOps/s 2.4219 KOps/s $\color{#35bf28}+0.42\%$
test_keys 31.9890μs 3.4901μs 286.5268 KOps/s 283.0070 KOps/s $\color{#35bf28}+1.24\%$
test_keys_nested 0.2965ms 0.1668ms 5.9969 KOps/s 5.8582 KOps/s $\color{#35bf28}+2.37\%$
test_keys_nested_locked 1.8242ms 0.1726ms 5.7931 KOps/s 5.6630 KOps/s $\color{#35bf28}+2.30\%$
test_keys_nested_leaf 0.2332ms 0.1459ms 6.8517 KOps/s 6.8014 KOps/s $\color{#35bf28}+0.74\%$
test_keys_stack_nested 0.2918ms 0.1657ms 6.0335 KOps/s 6.0886 KOps/s $\color{#d91a1a}-0.90\%$
test_keys_stack_nested_leaf 0.2371ms 0.1444ms 6.9245 KOps/s 7.0605 KOps/s $\color{#d91a1a}-1.93\%$
test_keys_stack_nested_locked 0.2980ms 0.1707ms 5.8587 KOps/s 5.8275 KOps/s $\color{#35bf28}+0.54\%$
test_values 6.2736μs 1.0540μs 948.7240 KOps/s 955.9533 KOps/s $\color{#d91a1a}-0.76\%$
test_values_nested 0.1178ms 61.8713μs 16.1626 KOps/s 15.9795 KOps/s $\color{#35bf28}+1.15\%$
test_values_nested_locked 0.1326ms 61.8378μs 16.1713 KOps/s 15.9153 KOps/s $\color{#35bf28}+1.61\%$
test_values_nested_leaf 0.1267ms 71.8962μs 13.9089 KOps/s 13.7147 KOps/s $\color{#35bf28}+1.42\%$
test_values_stack_nested 0.1193ms 63.1705μs 15.8302 KOps/s 15.5174 KOps/s $\color{#35bf28}+2.02\%$
test_values_stack_nested_leaf 0.1265ms 72.2279μs 13.8451 KOps/s 12.7473 KOps/s $\textbf{\color{#35bf28}+8.61\%}$
test_values_stack_nested_locked 0.1226ms 63.1870μs 15.8260 KOps/s 15.4352 KOps/s $\color{#35bf28}+2.53\%$
test_membership 27.9420μs 0.8701μs 1.1493 MOps/s 1.1005 MOps/s $\color{#35bf28}+4.44\%$
test_membership_nested 31.1780μs 2.9241μs 341.9804 KOps/s 344.7587 KOps/s $\color{#d91a1a}-0.81\%$
test_membership_nested_leaf 0.1286ms 2.9652μs 337.2462 KOps/s 340.4586 KOps/s $\color{#d91a1a}-0.94\%$
test_membership_stacked_nested 28.3630μs 2.9066μs 344.0441 KOps/s 345.3784 KOps/s $\color{#d91a1a}-0.39\%$
test_membership_stacked_nested_leaf 29.1150μs 2.9449μs 339.5727 KOps/s 345.9265 KOps/s $\color{#d91a1a}-1.84\%$
test_membership_nested_last 38.0100μs 4.3053μs 232.2732 KOps/s 225.0423 KOps/s $\color{#35bf28}+3.21\%$
test_membership_nested_leaf_last 34.3040μs 4.3566μs 229.5383 KOps/s 228.1381 KOps/s $\color{#35bf28}+0.61\%$
test_membership_stacked_nested_last 26.9700μs 4.3448μs 230.1616 KOps/s 73.0938 KOps/s $\textbf{\color{#35bf28}+214.89\%}$
test_membership_stacked_nested_leaf_last 32.1500μs 4.3833μs 228.1404 KOps/s 70.1488 KOps/s $\textbf{\color{#35bf28}+225.22\%}$
test_nested_getleaf 38.6620μs 10.6488μs 93.9069 KOps/s 90.0397 KOps/s $\color{#35bf28}+4.30\%$
test_nested_get 0.1272ms 10.1302μs 98.7147 KOps/s 94.5070 KOps/s $\color{#35bf28}+4.45\%$
test_stacked_getleaf 49.6080μs 10.7184μs 93.2971 KOps/s 90.6124 KOps/s $\color{#35bf28}+2.96\%$
test_stacked_get 30.2570μs 10.0757μs 99.2490 KOps/s 94.6418 KOps/s $\color{#35bf28}+4.87\%$
test_nested_getitemleaf 39.8340μs 11.0660μs 90.3668 KOps/s 86.4077 KOps/s $\color{#35bf28}+4.58\%$
test_nested_getitem 56.6790μs 10.4125μs 96.0384 KOps/s 91.7080 KOps/s $\color{#35bf28}+4.72\%$
test_stacked_getitemleaf 44.9740μs 10.9967μs 90.9367 KOps/s 86.2375 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_stacked_getitem 54.6780μs 10.2122μs 97.9222 KOps/s 92.8844 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_lock_nested 4.2838ms 0.4643ms 2.1536 KOps/s 2.1801 KOps/s $\color{#d91a1a}-1.22\%$
test_lock_stack_nested 0.6841ms 0.4304ms 2.3237 KOps/s 2.3772 KOps/s $\color{#d91a1a}-2.25\%$
test_unlock_nested 0.7386ms 0.3797ms 2.6336 KOps/s 2.6086 KOps/s $\color{#35bf28}+0.96\%$
test_unlock_stack_nested 0.7074ms 0.3510ms 2.8491 KOps/s 2.9607 KOps/s $\color{#d91a1a}-3.77\%$
test_flatten_speed 0.1850ms 97.7135μs 10.2340 KOps/s 9.8871 KOps/s $\color{#35bf28}+3.51\%$
test_unflatten_speed 0.6484ms 0.5303ms 1.8856 KOps/s 1.8384 KOps/s $\color{#35bf28}+2.57\%$
test_common_ops 4.6736ms 0.8447ms 1.1839 KOps/s 1.2968 KOps/s $\textbf{\color{#d91a1a}-8.71\%}$
test_creation 75.8810μs 2.4803μs 403.1712 KOps/s 393.3119 KOps/s $\color{#35bf28}+2.51\%$
test_creation_empty 40.6460μs 12.9861μs 77.0052 KOps/s 98.2671 KOps/s $\textbf{\color{#d91a1a}-21.64\%}$
test_creation_nested_1 77.4670μs 15.9727μs 62.6069 KOps/s 75.7998 KOps/s $\textbf{\color{#d91a1a}-17.40\%}$
test_creation_nested_2 46.5570μs 20.7408μs 48.2142 KOps/s 56.6368 KOps/s $\textbf{\color{#d91a1a}-14.87\%}$
test_clone 70.7410μs 13.8230μs 72.3432 KOps/s 74.3109 KOps/s $\color{#d91a1a}-2.65\%$
test_getitem[int] 1.1541ms 12.8449μs 77.8519 KOps/s 79.0046 KOps/s $\color{#d91a1a}-1.46\%$
test_getitem[slice_int] 0.1774ms 24.7135μs 40.4638 KOps/s 41.1742 KOps/s $\color{#d91a1a}-1.73\%$
test_getitem[range] 0.1804ms 49.2586μs 20.3010 KOps/s 20.7719 KOps/s $\color{#d91a1a}-2.27\%$
test_getitem[tuple] 0.1294ms 20.7886μs 48.1032 KOps/s 49.5017 KOps/s $\color{#d91a1a}-2.83\%$
test_getitem[list] 0.2284ms 45.5714μs 21.9436 KOps/s 23.0003 KOps/s $\color{#d91a1a}-4.59\%$
test_setitem_dim[int] 62.5670μs 26.8192μs 37.2867 KOps/s 39.5242 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_setitem_dim[slice_int] 0.1066ms 53.9560μs 18.5336 KOps/s 18.8680 KOps/s $\color{#d91a1a}-1.77\%$
test_setitem_dim[range] 0.1442ms 76.1011μs 13.1404 KOps/s 13.7689 KOps/s $\color{#d91a1a}-4.56\%$
test_setitem_dim[tuple] 93.8750μs 42.8396μs 23.3429 KOps/s 24.4279 KOps/s $\color{#d91a1a}-4.44\%$
test_setitem 89.3670μs 22.4255μs 44.5922 KOps/s 50.0809 KOps/s $\textbf{\color{#d91a1a}-10.96\%}$
test_set 75.7010μs 21.2917μs 46.9666 KOps/s 51.7914 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_set_shared 8.4156ms 0.1730ms 5.7809 KOps/s 5.9376 KOps/s $\color{#d91a1a}-2.64\%$
test_update 0.1823ms 24.9694μs 40.0491 KOps/s 46.4628 KOps/s $\textbf{\color{#d91a1a}-13.80\%}$
test_update_nested 0.1007ms 35.6550μs 28.0465 KOps/s 31.3728 KOps/s $\textbf{\color{#d91a1a}-10.60\%}$
test_update__nested 0.7472ms 34.4311μs 29.0435 KOps/s 29.5224 KOps/s $\color{#d91a1a}-1.62\%$
test_set_nested 92.8250μs 23.4340μs 42.6731 KOps/s 47.0798 KOps/s $\textbf{\color{#d91a1a}-9.36\%}$
test_set_nested_new 75.9710μs 28.7675μs 34.7615 KOps/s 37.6763 KOps/s $\textbf{\color{#d91a1a}-7.74\%}$
test_select 93.4140μs 45.6083μs 21.9258 KOps/s 23.1113 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_select_nested 0.1305ms 63.1994μs 15.8229 KOps/s 15.5906 KOps/s $\color{#35bf28}+1.49\%$
test_exclude_nested 0.2208ms 82.8188μs 12.0746 KOps/s 12.0778 KOps/s $\color{#d91a1a}-0.03\%$
test_empty[True] 0.6000ms 0.4151ms 2.4091 KOps/s 2.3964 KOps/s $\color{#35bf28}+0.53\%$
test_empty[False] 8.0575μs 1.4420μs 693.4592 KOps/s 697.4482 KOps/s $\color{#d91a1a}-0.57\%$
test_unbind_speed 0.4656ms 0.2734ms 3.6571 KOps/s 3.6673 KOps/s $\color{#d91a1a}-0.28\%$
test_unbind_speed_stack0 0.4217ms 0.2676ms 3.7369 KOps/s 3.8112 KOps/s $\color{#d91a1a}-1.95\%$
test_unbind_speed_stack1 97.1384ms 0.8096ms 1.2351 KOps/s 1.4157 KOps/s $\textbf{\color{#d91a1a}-12.75\%}$
test_split 1.7288ms 1.5853ms 630.7899 Ops/s 572.6109 Ops/s $\textbf{\color{#35bf28}+10.16\%}$
test_chunk 98.5135ms 1.8888ms 529.4426 Ops/s 575.7366 Ops/s $\textbf{\color{#d91a1a}-8.04\%}$
test_consolidate_njt[False-None] 8.9673ms 8.3095ms 120.3438 Ops/s 124.2010 Ops/s $\color{#d91a1a}-3.11\%$
test_creation[device0] 0.1915ms 91.7002μs 10.9051 KOps/s 10.8691 KOps/s $\color{#35bf28}+0.33\%$
test_creation_from_tensor 4.1956ms 94.9001μs 10.5374 KOps/s 10.7589 KOps/s $\color{#d91a1a}-2.06\%$
test_add_one[memmap_tensor0] 0.2351ms 4.7440μs 210.7923 KOps/s 210.6993 KOps/s $\color{#35bf28}+0.04\%$
test_contiguous[memmap_tensor0] 22.1910μs 0.5071μs 1.9720 MOps/s 1.9780 MOps/s $\color{#d91a1a}-0.30\%$
test_stack[memmap_tensor0] 33.6630μs 3.3498μs 298.5278 KOps/s 301.6800 KOps/s $\color{#d91a1a}-1.04\%$
test_memmaptd_index 1.0622ms 0.2382ms 4.1979 KOps/s 4.2410 KOps/s $\color{#d91a1a}-1.02\%$
test_memmaptd_index_astensor 0.5672ms 0.3238ms 3.0883 KOps/s 3.1052 KOps/s $\color{#d91a1a}-0.54\%$
test_memmaptd_index_op 0.9721ms 0.6105ms 1.6381 KOps/s 1.7984 KOps/s $\textbf{\color{#d91a1a}-8.92\%}$
test_serialize_model 0.1286s 0.1164s 8.5901 Ops/s 8.7640 Ops/s $\color{#d91a1a}-1.98\%$
test_serialize_model_pickle 0.4744s 0.4005s 2.4969 Ops/s 2.5612 Ops/s $\color{#d91a1a}-2.51\%$
test_serialize_weights 0.1200s 0.1139s 8.7822 Ops/s 8.6907 Ops/s $\color{#35bf28}+1.05\%$
test_serialize_weights_returnearly 0.1573s 0.1521s 6.5757 Ops/s 6.3197 Ops/s $\color{#35bf28}+4.05\%$
test_serialize_weights_pickle 0.5625s 0.4386s 2.2800 Ops/s 2.5144 Ops/s $\textbf{\color{#d91a1a}-9.32\%}$
test_serialize_weights_filesystem 0.1489s 0.1406s 7.1135 Ops/s 7.1045 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_model_filesystem 0.1557s 0.1461s 6.8444 Ops/s 6.0879 Ops/s $\textbf{\color{#35bf28}+12.43\%}$
test_reshape_pytree 69.9900μs 26.8356μs 37.2640 KOps/s 36.8696 KOps/s $\color{#35bf28}+1.07\%$
test_reshape_td 82.7520μs 33.1571μs 30.1594 KOps/s 29.4193 KOps/s $\color{#35bf28}+2.52\%$
test_view_pytree 73.5070μs 27.2003μs 36.7643 KOps/s 36.9201 KOps/s $\color{#d91a1a}-0.42\%$
test_view_td 76.7430μs 38.2417μs 26.1494 KOps/s 25.7155 KOps/s $\color{#35bf28}+1.69\%$
test_unbind_pytree 79.7380μs 29.9792μs 33.3564 KOps/s 33.2002 KOps/s $\color{#35bf28}+0.47\%$
test_unbind_td 0.3470ms 40.3923μs 24.7572 KOps/s 25.0284 KOps/s $\color{#d91a1a}-1.08\%$
test_split_pytree 84.7780μs 29.3540μs 34.0669 KOps/s 33.6493 KOps/s $\color{#35bf28}+1.24\%$
test_split_td 0.5560ms 45.4157μs 22.0188 KOps/s 22.0926 KOps/s $\color{#d91a1a}-0.33\%$
test_add_pytree 80.4600μs 35.3320μs 28.3030 KOps/s 27.7032 KOps/s $\color{#35bf28}+2.16\%$
test_add_td 0.1241ms 58.3415μs 17.1404 KOps/s 18.4896 KOps/s $\textbf{\color{#d91a1a}-7.30\%}$
test_compile_add_one_nested[tensordict-compile] 0.1403ms 62.0120μs 16.1259 KOps/s 16.0938 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_nested[tensordict-eager] 0.3702ms 0.1718ms 5.8194 KOps/s 5.7939 KOps/s $\color{#35bf28}+0.44\%$
test_compile_add_one_nested[pytree-compile] 0.1342ms 45.9010μs 21.7860 KOps/s 21.8290 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_one_nested[pytree-eager] 0.2315ms 0.1178ms 8.4909 KOps/s 8.3880 KOps/s $\color{#35bf28}+1.23\%$
test_compile_copy_nested[tensordict-compile] 79.1280μs 25.9635μs 38.5155 KOps/s 38.8322 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_copy_nested[tensordict-eager] 0.1123ms 58.4864μs 17.0980 KOps/s 17.3255 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_copy_nested[pytree-compile] 0.1670ms 79.4012μs 12.5943 KOps/s 12.5547 KOps/s $\color{#35bf28}+0.32\%$
test_compile_copy_nested[pytree-eager] 0.1277ms 67.9290μs 14.7213 KOps/s 14.6078 KOps/s $\color{#35bf28}+0.78\%$
test_compile_add_one_flat[tensordict-compile] 0.1832ms 0.1056ms 9.4667 KOps/s 9.5171 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_add_one_flat[tensordict-eager] 0.4222ms 0.2160ms 4.6289 KOps/s 4.5952 KOps/s $\color{#35bf28}+0.73\%$
test_compile_add_one_flat[tensorclass-compile] 0.1227ms 45.6804μs 21.8912 KOps/s 22.9252 KOps/s $\color{#d91a1a}-4.51\%$
test_compile_add_one_flat[tensorclass-eager] 0.4897ms 64.4507μs 15.5157 KOps/s 15.5719 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_add_one_flat[pytree-compile] 0.2036ms 0.1039ms 9.6272 KOps/s 9.7005 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_add_one_flat[pytree-eager] 0.3039ms 0.2008ms 4.9799 KOps/s 4.9818 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_add_self_flat[tensordict-eager] 0.3974ms 0.2334ms 4.2849 KOps/s 4.2578 KOps/s $\color{#35bf28}+0.64\%$
test_compile_add_self_flat[tensordict-compile] 0.1993ms 0.1067ms 9.3724 KOps/s 9.5863 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_self_flat[tensorclass-eager] 0.1503ms 58.9313μs 16.9689 KOps/s 17.2560 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_add_self_flat[tensorclass-compile] 0.1105ms 45.4935μs 21.9812 KOps/s 22.7276 KOps/s $\color{#d91a1a}-3.28\%$
test_compile_add_self_flat[pytree-eager] 0.6336ms 0.1606ms 6.2252 KOps/s 6.2908 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_add_self_flat[pytree-compile] 0.1956ms 0.1042ms 9.5993 KOps/s 9.6937 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_copy_flat[tensordict-compile] 77.5250μs 21.4927μs 46.5274 KOps/s 48.6225 KOps/s $\color{#d91a1a}-4.31\%$
test_compile_copy_flat[tensordict-eager] 0.1314ms 66.1498μs 15.1172 KOps/s 15.2418 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_copy_flat[pytree-compile] 0.1730ms 79.9260μs 12.5116 KOps/s 12.0794 KOps/s $\color{#35bf28}+3.58\%$
test_compile_copy_flat[pytree-eager] 0.1325ms 69.0176μs 14.4891 KOps/s 14.1104 KOps/s $\color{#35bf28}+2.68\%$
test_compile_assign_and_add[tensordict-compile] 0.3562ms 0.2050ms 4.8781 KOps/s 4.9179 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_assign_and_add[tensordict-eager] 2.1032ms 1.3208ms 757.1250 Ops/s 759.7762 Ops/s $\color{#d91a1a}-0.35\%$
test_compile_assign_and_add[pytree-compile] 0.3658ms 0.2082ms 4.8022 KOps/s 4.9783 KOps/s $\color{#d91a1a}-3.54\%$
test_compile_assign_and_add[pytree-eager] 0.9620ms 0.7700ms 1.2987 KOps/s 1.3103 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_assign_and_add_stack[compile] 0.5588ms 0.4622ms 2.1637 KOps/s 2.2265 KOps/s $\color{#d91a1a}-2.82\%$
test_compile_assign_and_add_stack[eager] 3.6345ms 2.8065ms 356.3140 Ops/s 384.7812 Ops/s $\textbf{\color{#d91a1a}-7.40\%}$
test_compile_indexing[tensor-tensordict-compile] 97.5820μs 36.8498μs 27.1372 KOps/s 28.3399 KOps/s $\color{#d91a1a}-4.24\%$
test_compile_indexing[tensor-tensordict-eager] 0.5285ms 33.7374μs 29.6407 KOps/s 30.6294 KOps/s $\color{#d91a1a}-3.23\%$
test_compile_indexing[tensor-tensorclass-compile] 97.2110μs 29.8850μs 33.4617 KOps/s 34.3566 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_indexing[tensor-tensorclass-eager] 80.2290μs 23.7314μs 42.1383 KOps/s 42.6760 KOps/s $\color{#d91a1a}-1.26\%$
test_compile_indexing[tensor-pytree-compile] 99.9660μs 30.2855μs 33.0191 KOps/s 33.2093 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_indexing[tensor-pytree-eager] 93.8140μs 23.5817μs 42.4058 KOps/s 42.1886 KOps/s $\color{#35bf28}+0.51\%$
test_compile_indexing[slice-tensordict-compile] 0.1274ms 51.8186μs 19.2981 KOps/s 19.2589 KOps/s $\color{#35bf28}+0.20\%$
test_compile_indexing[slice-tensordict-eager] 0.6016ms 20.5431μs 48.6781 KOps/s 49.4641 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_indexing[slice-tensorclass-compile] 0.1242ms 44.0264μs 22.7137 KOps/s 22.2046 KOps/s $\color{#35bf28}+2.29\%$
test_compile_indexing[slice-tensorclass-eager] 71.3020μs 19.1403μs 52.2456 KOps/s 53.4525 KOps/s $\color{#d91a1a}-2.26\%$
test_compile_indexing[slice-pytree-compile] 92.3830μs 44.8594μs 22.2919 KOps/s 21.8273 KOps/s $\color{#35bf28}+2.13\%$
test_compile_indexing[slice-pytree-eager] 68.9180μs 18.8595μs 53.0236 KOps/s 53.0241 KOps/s $-0.00\%$
test_compile_indexing[int-tensordict-compile] 0.1252ms 53.1037μs 18.8311 KOps/s 18.5604 KOps/s $\color{#35bf28}+1.46\%$
test_compile_indexing[int-tensordict-eager] 1.0213ms 20.3396μs 49.1651 KOps/s 49.4364 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_indexing[int-tensorclass-compile] 0.1045ms 44.8318μs 22.3056 KOps/s 21.8859 KOps/s $\color{#35bf28}+1.92\%$
test_compile_indexing[int-tensorclass-eager] 61.7650μs 18.8324μs 53.1001 KOps/s 52.7042 KOps/s $\color{#35bf28}+0.75\%$
test_compile_indexing[int-pytree-compile] 92.8830μs 44.9608μs 22.2416 KOps/s 21.8296 KOps/s $\color{#35bf28}+1.89\%$
test_compile_indexing[int-pytree-eager] 76.8920μs 18.9201μs 52.8539 KOps/s 52.0063 KOps/s $\color{#35bf28}+1.63\%$
test_mod_add[eager] 92.3520μs 35.2154μs 28.3967 KOps/s 30.4614 KOps/s $\textbf{\color{#d91a1a}-6.78\%}$
test_mod_add[compile] 0.1214ms 48.1313μs 20.7765 KOps/s 21.7176 KOps/s $\color{#d91a1a}-4.33\%$
test_mod_add[compile-overhead] 0.1104ms 47.9792μs 20.8424 KOps/s 21.2013 KOps/s $\color{#d91a1a}-1.69\%$
test_mod_wrap[eager] 0.4101ms 0.2302ms 4.3442 KOps/s 4.5024 KOps/s $\color{#d91a1a}-3.51\%$
test_mod_wrap[compile] 0.3410ms 0.2080ms 4.8074 KOps/s 4.8635 KOps/s $\color{#d91a1a}-1.15\%$
test_mod_wrap[compile-overhead] 0.3921ms 0.2080ms 4.8084 KOps/s 4.8933 KOps/s $\color{#d91a1a}-1.74\%$
test_mod_wrap_and_backward[eager] 16.9335ms 12.6888ms 78.8099 Ops/s 80.0070 Ops/s $\color{#d91a1a}-1.50\%$
test_mod_wrap_and_backward[compile] 14.6203ms 11.5658ms 86.4620 Ops/s 70.6521 Ops/s $\textbf{\color{#35bf28}+22.38\%}$
test_mod_wrap_and_backward[compile-overhead] 17.1836ms 12.3023ms 81.2856 Ops/s 74.2647 Ops/s $\textbf{\color{#35bf28}+9.45\%}$
test_seq_add[eager] 0.2700ms 0.1188ms 8.4145 KOps/s 8.7954 KOps/s $\color{#d91a1a}-4.33\%$
test_seq_add[compile] 0.1488ms 63.3486μs 15.7857 KOps/s 16.5674 KOps/s $\color{#d91a1a}-4.72\%$
test_seq_add[compile-overhead] 0.1367ms 61.8719μs 16.1624 KOps/s 17.0067 KOps/s $\color{#d91a1a}-4.96\%$
test_seq_wrap[eager] 0.7504ms 0.4555ms 2.1955 KOps/s 2.2819 KOps/s $\color{#d91a1a}-3.79\%$
test_seq_wrap[compile] 0.4367ms 0.2322ms 4.3065 KOps/s 4.3533 KOps/s $\color{#d91a1a}-1.07\%$
test_seq_wrap[compile-overhead] 0.7300ms 0.2297ms 4.3538 KOps/s 4.4517 KOps/s $\color{#d91a1a}-2.20\%$
test_func_call_runtime[False-eager] 0.8456ms 0.5646ms 1.7712 KOps/s 1.8203 KOps/s $\color{#d91a1a}-2.70\%$
test_func_call_runtime[False-compile] 0.8926ms 0.4329ms 2.3098 KOps/s 2.3010 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_runtime[False-compile-overhead] 0.5886ms 0.4242ms 2.3572 KOps/s 2.3342 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_runtime[True-eager] 1.2949ms 0.7738ms 1.2924 KOps/s 1.3179 KOps/s $\color{#d91a1a}-1.94\%$
test_func_call_runtime[True-compile] 0.5840ms 0.4671ms 2.1408 KOps/s 2.1450 KOps/s $\color{#d91a1a}-0.19\%$
test_func_call_runtime[True-compile-overhead] 0.6661ms 0.4679ms 2.1374 KOps/s 2.1658 KOps/s $\color{#d91a1a}-1.31\%$
test_func_call_cm_runtime[False-eager] 0.7993ms 0.5591ms 1.7885 KOps/s 1.8336 KOps/s $\color{#d91a1a}-2.46\%$
test_func_call_cm_runtime[False-compile] 0.5541ms 0.4251ms 2.3527 KOps/s 2.3481 KOps/s $\color{#35bf28}+0.20\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9055ms 0.4323ms 2.3133 KOps/s 2.3416 KOps/s $\color{#d91a1a}-1.21\%$
test_func_call_cm_runtime[True-eager] 1.1258ms 0.9289ms 1.0766 KOps/s 1.1054 KOps/s $\color{#d91a1a}-2.61\%$
test_func_call_cm_runtime[True-compile] 0.8004ms 0.4935ms 2.0263 KOps/s 2.0453 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_cm_runtime[True-compile-overhead] 0.7489ms 0.4934ms 2.0269 KOps/s 2.0576 KOps/s $\color{#d91a1a}-1.49\%$
test_vmap_func_call_cm_runtime[eager] 2.3317ms 1.8990ms 526.5994 Ops/s 530.2796 Ops/s $\color{#d91a1a}-0.69\%$
test_vmap_func_call_cm_runtime[compile] 1.0445ms 0.5220ms 1.9157 KOps/s 1.9630 KOps/s $\color{#d91a1a}-2.41\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8187ms 0.5247ms 1.9057 KOps/s 1.9495 KOps/s $\color{#d91a1a}-2.25\%$
test_distributed 0.2570ms 0.1255ms 7.9694 KOps/s 7.8577 KOps/s $\color{#35bf28}+1.42\%$
test_tdmodule 0.1049ms 27.5203μs 36.3368 KOps/s 39.2379 KOps/s $\textbf{\color{#d91a1a}-7.39\%}$
test_tdmodule_dispatch 83.0640μs 50.3920μs 19.8444 KOps/s 21.6353 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_tdseq 50.9740μs 29.5527μs 33.8379 KOps/s 35.1768 KOps/s $\color{#d91a1a}-3.81\%$
test_tdseq_dispatch 84.3070μs 55.2848μs 18.0882 KOps/s 19.3772 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_instantiation_functorch 2.5498ms 1.5428ms 648.1789 Ops/s 655.1086 Ops/s $\color{#d91a1a}-1.06\%$
test_exec_functorch 0.3106ms 0.1815ms 5.5098 KOps/s 5.5467 KOps/s $\color{#d91a1a}-0.66\%$
test_exec_functional_call 0.2638ms 0.1730ms 5.7787 KOps/s 5.7460 KOps/s $\color{#35bf28}+0.57\%$
test_exec_td_decorator 0.4726ms 0.2373ms 4.2140 KOps/s 4.2442 KOps/s $\color{#d91a1a}-0.71\%$
test_vmap_mlp_speed_decorator[True-True] 0.9408ms 0.6574ms 1.5212 KOps/s 1.5459 KOps/s $\color{#d91a1a}-1.60\%$
test_vmap_mlp_speed_decorator[True-False] 1.1359ms 0.6694ms 1.4939 KOps/s 1.5437 KOps/s $\color{#d91a1a}-3.23\%$
test_vmap_mlp_speed_decorator[False-True] 0.6781ms 0.5270ms 1.8974 KOps/s 1.9019 KOps/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed_decorator[False-False] 0.8447ms 0.5321ms 1.8793 KOps/s 1.9060 KOps/s $\color{#d91a1a}-1.40\%$
test_to_module_speed[True] 1.4495ms 1.3285ms 752.7021 Ops/s 732.1448 Ops/s $\color{#35bf28}+2.81\%$
test_to_module_speed[False] 1.7243ms 1.3083ms 764.3460 Ops/s 752.3280 Ops/s $\color{#35bf28}+1.60\%$
test_tc_init 93.4030μs 48.9561μs 20.4265 KOps/s 22.4113 KOps/s $\textbf{\color{#d91a1a}-8.86\%}$
test_tc_init_nested 0.1747ms 99.7984μs 10.0202 KOps/s 11.2732 KOps/s $\textbf{\color{#d91a1a}-11.11\%}$
test_tc_first_layer_tensor 33.4330μs 1.5259μs 655.3666 KOps/s 645.8762 KOps/s $\color{#35bf28}+1.47\%$
test_tc_first_layer_nontensor 35.2250μs 4.6462μs 215.2306 KOps/s 213.7070 KOps/s $\color{#35bf28}+0.71\%$
test_tc_second_layer_tensor 42.0980μs 2.7795μs 359.7715 KOps/s 349.7013 KOps/s $\color{#35bf28}+2.88\%$
test_tc_second_layer_nontensor 40.3680μs 5.8621μs 170.5875 KOps/s 167.2850 KOps/s $\color{#35bf28}+1.97\%$
test_unbind 0.2096s 13.4333ms 74.4419 Ops/s 71.3544 Ops/s $\color{#35bf28}+4.33\%$
test_full_like 16.3033ms 12.1090ms 82.5832 Ops/s 144.6145 Ops/s $\textbf{\color{#d91a1a}-42.89\%}$
test_zeros_like 10.7893ms 7.2519ms 137.8952 Ops/s 372.0335 Ops/s $\textbf{\color{#d91a1a}-62.93\%}$
test_ones_like 11.6935ms 7.3256ms 136.5072 Ops/s 324.1153 Ops/s $\textbf{\color{#d91a1a}-57.88\%}$
test_clone 12.6363ms 8.9418ms 111.8343 Ops/s 205.9001 Ops/s $\textbf{\color{#d91a1a}-45.69\%}$
test_squeeze 65.8530μs 12.2193μs 81.8375 KOps/s 82.5681 KOps/s $\color{#d91a1a}-0.88\%$
test_unsqueeze 0.2938ms 95.4037μs 10.4818 KOps/s 11.0218 KOps/s $\color{#d91a1a}-4.90\%$
test_split 0.3425ms 0.1967ms 5.0841 KOps/s 5.1535 KOps/s $\color{#d91a1a}-1.35\%$
test_permute 0.3910ms 0.2086ms 4.7937 KOps/s 4.9098 KOps/s $\color{#d91a1a}-2.37\%$
test_stack 26.1683ms 23.8844ms 41.8684 Ops/s 40.7609 Ops/s $\color{#35bf28}+2.72\%$
test_cat 26.2470ms 23.7567ms 42.0934 Ops/s 41.3717 Ops/s $\color{#35bf28}+1.74\%$
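
The Change column above appears consistent with the relative difference between the Ops and Ops on Repo HEAD columns, and the Improved and Worsened totals match the number of bolded rows, all of which exceed 5% in magnitude. The sketch below reproduces that arithmetic on three rows copied from the table. The 5% threshold and the function names are assumptions inferred from the figures, not taken from the benchmark bot's source.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD measurement."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; threshold_pct is an assumed cutoff inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "within threshold"


# (name, Ops, Ops on Repo HEAD), both in KOps/s, copied from the CPU table.
rows = [
    ("test_plain_set_nested", 45.9833, 49.3358),
    ("test_values_stack_nested_leaf", 13.8451, 12.7473),
    ("test_items", 245.7769, 240.7262),
]

for name, ops, ops_head in rows:
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both operands must be in the same unit before dividing; a few rows report MOps/s or Ops/s rather than KOps/s.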


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}34$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 30.3720μs 11.4976μs 86.9748 KOps/s 77.5619 KOps/s $\textbf{\color{#35bf28}+12.14\%}$
test_plain_set_stack_nested 50.7830μs 11.7497μs 85.1087 KOps/s 76.6853 KOps/s $\textbf{\color{#35bf28}+10.98\%}$
test_plain_set_nested_inplace 42.7220μs 12.5412μs 79.7373 KOps/s 72.1645 KOps/s $\textbf{\color{#35bf28}+10.49\%}$
test_plain_set_stack_nested_inplace 38.5830μs 12.6677μs 78.9411 KOps/s 71.2061 KOps/s $\textbf{\color{#35bf28}+10.86\%}$
test_items 38.7220μs 2.8966μs 345.2355 KOps/s 335.7303 KOps/s $\color{#35bf28}+2.83\%$
test_items_nested 0.4283ms 0.3607ms 2.7722 KOps/s 2.7880 KOps/s $\color{#d91a1a}-0.57\%$
test_items_nested_locked 0.5062ms 0.3631ms 2.7543 KOps/s 2.7879 KOps/s $\color{#d91a1a}-1.21\%$
test_items_nested_leaf 88.6640μs 58.5261μs 17.0864 KOps/s 17.2649 KOps/s $\color{#d91a1a}-1.03\%$
test_items_stack_nested 0.4148ms 0.3626ms 2.7580 KOps/s 2.7784 KOps/s $\color{#d91a1a}-0.74\%$
test_items_stack_nested_leaf 97.2360μs 59.4423μs 16.8230 KOps/s 16.8773 KOps/s $\color{#d91a1a}-0.32\%$
test_items_stack_nested_locked 0.5611ms 0.3687ms 2.7121 KOps/s 2.7664 KOps/s $\color{#d91a1a}-1.96\%$
test_keys 26.4210μs 3.4605μs 288.9749 KOps/s 289.2604 KOps/s $\color{#d91a1a}-0.10\%$
test_keys_nested 0.1447ms 81.1828μs 12.3179 KOps/s 12.2629 KOps/s $\color{#35bf28}+0.45\%$
test_keys_nested_locked 0.7910ms 87.7544μs 11.3954 KOps/s 11.4181 KOps/s $\color{#d91a1a}-0.20\%$
test_keys_nested_leaf 99.2550μs 72.4472μs 13.8031 KOps/s 13.8941 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_stack_nested 0.1157ms 82.5313μs 12.1166 KOps/s 12.2953 KOps/s $\color{#d91a1a}-1.45\%$
test_keys_stack_nested_leaf 0.1085ms 73.9010μs 13.5316 KOps/s 13.9522 KOps/s $\color{#d91a1a}-3.01\%$
test_keys_stack_nested_locked 0.1420ms 89.0919μs 11.2244 KOps/s 11.4613 KOps/s $\color{#d91a1a}-2.07\%$
test_values 6.8437μs 0.8543μs 1.1705 MOps/s 1.1768 MOps/s $\color{#d91a1a}-0.53\%$
test_values_nested 0.4896ms 34.9933μs 28.5769 KOps/s 29.2759 KOps/s $\color{#d91a1a}-2.39\%$
test_values_nested_locked 0.2047ms 36.3347μs 27.5219 KOps/s 27.4364 KOps/s $\color{#35bf28}+0.31\%$
test_values_nested_leaf 72.1840μs 38.9918μs 25.6464 KOps/s 25.6731 KOps/s $\color{#d91a1a}-0.10\%$
test_values_stack_nested 66.5440μs 35.4898μs 28.1771 KOps/s 29.0374 KOps/s $\color{#d91a1a}-2.96\%$
test_values_stack_nested_leaf 0.1243ms 39.8866μs 25.0711 KOps/s 25.4218 KOps/s $\color{#d91a1a}-1.38\%$
test_values_stack_nested_locked 74.2040μs 37.2404μs 26.8526 KOps/s 27.4213 KOps/s $\color{#d91a1a}-2.07\%$
test_membership 1.8986μs 0.5110μs 1.9571 MOps/s 1.9304 MOps/s $\color{#35bf28}+1.38\%$
test_membership_nested 27.9710μs 2.0790μs 480.9894 KOps/s 472.4621 KOps/s $\color{#35bf28}+1.80\%$
test_membership_nested_leaf 19.4960μs 1.9925μs 501.8872 KOps/s 495.2714 KOps/s $\color{#35bf28}+1.34\%$
test_membership_stacked_nested 24.1810μs 2.1075μs 474.5059 KOps/s 478.2775 KOps/s $\color{#d91a1a}-0.79\%$
test_membership_stacked_nested_leaf 26.5210μs 2.0767μs 481.5394 KOps/s 460.3358 KOps/s $\color{#35bf28}+4.61\%$
test_membership_nested_last 31.2120μs 3.0537μs 327.4708 KOps/s 319.5361 KOps/s $\color{#35bf28}+2.48\%$
test_membership_nested_leaf_last 50.9120μs 3.1351μs 318.9720 KOps/s 315.6923 KOps/s $\color{#35bf28}+1.04\%$
test_membership_stacked_nested_last 36.0520μs 3.1012μs 322.4515 KOps/s 322.1777 KOps/s $\color{#35bf28}+0.08\%$
test_membership_stacked_nested_leaf_last 38.5930μs 3.0891μs 323.7161 KOps/s 319.1999 KOps/s $\color{#35bf28}+1.41\%$
test_nested_getleaf 42.6220μs 6.1253μs 163.2572 KOps/s 160.9186 KOps/s $\color{#35bf28}+1.45\%$
test_nested_get 53.0930μs 5.9219μs 168.8656 KOps/s 170.8144 KOps/s $\color{#d91a1a}-1.14\%$
test_stacked_getleaf 44.4820μs 6.1841μs 161.7047 KOps/s 164.0714 KOps/s $\color{#d91a1a}-1.44\%$
test_stacked_get 41.7730μs 5.9551μs 167.9223 KOps/s 171.2347 KOps/s $\color{#d91a1a}-1.93\%$
test_nested_getitemleaf 37.9920μs 6.3513μs 157.4478 KOps/s 158.6240 KOps/s $\color{#d91a1a}-0.74\%$
test_nested_getitem 39.3230μs 5.9487μs 168.1045 KOps/s 169.3525 KOps/s $\color{#d91a1a}-0.74\%$
test_stacked_getitemleaf 29.5920μs 6.3834μs 156.6557 KOps/s 160.7097 KOps/s $\color{#d91a1a}-2.52\%$
test_stacked_getitem 41.9830μs 5.9854μs 167.0744 KOps/s 169.7776 KOps/s $\color{#d91a1a}-1.59\%$
test_lock_nested 9.2316ms 0.3845ms 2.6006 KOps/s 2.5975 KOps/s $\color{#35bf28}+0.12\%$
test_lock_stack_nested 0.4037ms 0.3485ms 2.8694 KOps/s 2.8981 KOps/s $\color{#d91a1a}-0.99\%$
test_unlock_nested 0.7281ms 0.3172ms 3.1528 KOps/s 3.1885 KOps/s $\color{#d91a1a}-1.12\%$
test_unlock_stack_nested 0.4277ms 0.2858ms 3.4986 KOps/s 3.5061 KOps/s $\color{#d91a1a}-0.21\%$
test_flatten_speed 0.1156ms 75.6532μs 13.2182 KOps/s 13.4239 KOps/s $\color{#d91a1a}-1.53\%$
test_unflatten_speed 0.3629ms 0.3192ms 3.1324 KOps/s 3.1074 KOps/s $\color{#35bf28}+0.80\%$
test_common_ops 1.6786ms 0.5962ms 1.6773 KOps/s 1.5953 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_creation 0.1048ms 1.7783μs 562.3286 KOps/s 560.5398 KOps/s $\color{#35bf28}+0.32\%$
test_creation_empty 31.2820μs 7.0484μs 141.8770 KOps/s 103.5154 KOps/s $\textbf{\color{#35bf28}+37.06\%}$
test_creation_nested_1 40.1230μs 8.7146μs 114.7498 KOps/s 88.6864 KOps/s $\textbf{\color{#35bf28}+29.39\%}$
test_creation_nested_2 51.6830μs 11.6158μs 86.0894 KOps/s 71.5724 KOps/s $\textbf{\color{#35bf28}+20.28\%}$
test_clone 82.3840μs 10.3299μs 96.8066 KOps/s 97.6301 KOps/s $\color{#d91a1a}-0.84\%$
test_getitem[int] 1.5929ms 10.9537μs 91.2938 KOps/s 90.4198 KOps/s $\color{#35bf28}+0.97\%$
test_getitem[slice_int] 0.1116ms 20.9732μs 47.6798 KOps/s 47.6820 KOps/s $-0.00\%$
test_getitem[range] 0.1405ms 37.0085μs 27.0208 KOps/s 26.4250 KOps/s $\color{#35bf28}+2.25\%$
test_getitem[tuple] 0.1168ms 18.9152μs 52.8676 KOps/s 53.4613 KOps/s $\color{#d91a1a}-1.11\%$
test_getitem[list] 0.2411ms 34.5371μs 28.9544 KOps/s 29.7814 KOps/s $\color{#d91a1a}-2.78\%$
test_setitem_dim[int] 44.2820μs 19.7154μs 50.7218 KOps/s 54.3465 KOps/s $\textbf{\color{#d91a1a}-6.67\%}$
test_setitem_dim[slice_int] 64.7640μs 39.1773μs 25.5250 KOps/s 26.9014 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_setitem_dim[range] 86.5250μs 54.9208μs 18.2080 KOps/s 19.0945 KOps/s $\color{#d91a1a}-4.64\%$
test_setitem_dim[tuple] 63.0640μs 33.7139μs 29.6613 KOps/s 31.5849 KOps/s $\textbf{\color{#d91a1a}-6.09\%}$
test_setitem 90.3050μs 14.1528μs 70.6572 KOps/s 64.4543 KOps/s $\textbf{\color{#35bf28}+9.62\%}$
test_set 85.5150μs 13.7364μs 72.7995 KOps/s 67.9054 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_set_shared 1.6669ms 0.1479ms 6.7598 KOps/s 6.7177 KOps/s $\color{#35bf28}+0.63\%$
test_update 0.3372ms 16.2100μs 61.6901 KOps/s 53.9087 KOps/s $\textbf{\color{#35bf28}+14.43\%}$
test_update_nested 0.5156ms 21.8179μs 45.8339 KOps/s 41.7768 KOps/s $\textbf{\color{#35bf28}+9.71\%}$
test_update__nested 0.1387ms 24.2786μs 41.1886 KOps/s 40.4868 KOps/s $\color{#35bf28}+1.73\%$
test_set_nested 79.6350μs 14.8065μs 67.5378 KOps/s 62.3160 KOps/s $\textbf{\color{#35bf28}+8.38\%}$
test_set_nested_new 87.5950μs 17.3907μs 57.5020 KOps/s 54.4798 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_select 96.6750μs 29.2054μs 34.2403 KOps/s 32.2457 KOps/s $\textbf{\color{#35bf28}+6.19\%}$
test_select_nested 0.1201ms 44.4515μs 22.4964 KOps/s 22.7069 KOps/s $\color{#d91a1a}-0.93\%$
test_exclude_nested 0.1261ms 64.2092μs 15.5741 KOps/s 15.8072 KOps/s $\color{#d91a1a}-1.47\%$
test_empty[True] 0.3703ms 0.2897ms 3.4524 KOps/s 3.4342 KOps/s $\color{#35bf28}+0.53\%$
test_empty[False] 2.9481μs 0.8300μs 1.2049 MOps/s 1.2150 MOps/s $\color{#d91a1a}-0.84\%$
test_to 86.6150μs 58.6404μs 17.0531 KOps/s 17.8354 KOps/s $\color{#d91a1a}-4.39\%$
test_to_nonblocking 0.1878ms 47.9384μs 20.8601 KOps/s 19.9142 KOps/s $\color{#35bf28}+4.75\%$
test_unbind_speed 1.3251ms 0.2376ms 4.2093 KOps/s 4.2286 KOps/s $\color{#d91a1a}-0.46\%$
test_unbind_speed_stack0 0.3821ms 0.2405ms 4.1584 KOps/s 4.1993 KOps/s $\color{#d91a1a}-0.97\%$
test_unbind_speed_stack1 92.4619ms 0.6690ms 1.4949 KOps/s 1.4985 KOps/s $\color{#d91a1a}-0.24\%$
test_split 94.0568ms 1.7321ms 577.3203 Ops/s 573.6044 Ops/s $\color{#35bf28}+0.65\%$
test_chunk 1.5919ms 1.4704ms 680.0810 Ops/s 615.0972 Ops/s $\textbf{\color{#35bf28}+10.56\%}$
test_consolidate[False-None] 97.1968ms 2.9933ms 334.0743 Ops/s 367.6528 Ops/s $\textbf{\color{#d91a1a}-9.13\%}$
test_consolidate[default-None] 2.0990ms 1.6988ms 588.6606 Ops/s 586.8640 Ops/s $\color{#35bf28}+0.31\%$
test_consolidate[reduce-overhead-None] 2.1338ms 1.7310ms 577.7018 Ops/s 570.7262 Ops/s $\color{#35bf28}+1.22\%$
test_consolidate_njt[False-None] 6.9630ms 6.5893ms 151.7618 Ops/s 149.2551 Ops/s $\color{#35bf28}+1.68\%$
test_to[False-False-None] 2.0828ms 1.7061ms 586.1427 Ops/s 575.0462 Ops/s $\color{#35bf28}+1.93\%$
test_to[True-False-None] 1.7590ms 1.3488ms 741.4142 Ops/s 732.9924 Ops/s $\color{#35bf28}+1.15\%$
test_to[within-False-None] 4.5876ms 4.2072ms 237.6854 Ops/s 239.3380 Ops/s $\color{#d91a1a}-0.69\%$
test_to[True-default-None] 5.4247ms 5.2973ms 188.7757 Ops/s 184.8078 Ops/s $\color{#35bf28}+2.15\%$
test_to_njt[False-False-None] 7.1789ms 6.8947ms 145.0397 Ops/s 143.4367 Ops/s $\color{#35bf28}+1.12\%$
test_to_njt[True-False-None] 5.9397ms 5.4939ms 182.0212 Ops/s 179.1176 Ops/s $\color{#35bf28}+1.62\%$
test_to_njt[within-False-None] 12.6049ms 12.1346ms 82.4090 Ops/s 80.5680 Ops/s $\color{#35bf28}+2.28\%$
test_creation[device0] 0.4709ms 78.3530μs 12.7628 KOps/s 12.7258 KOps/s $\color{#35bf28}+0.29\%$
test_creation_from_tensor 0.6157ms 82.4220μs 12.1327 KOps/s 11.9502 KOps/s $\color{#35bf28}+1.53\%$
test_add_one[memmap_tensor0] 0.3785ms 6.3610μs 157.2084 KOps/s 157.0516 KOps/s $\color{#35bf28}+0.10\%$
test_contiguous[memmap_tensor0] 19.5596μs 0.4348μs 2.2997 MOps/s 2.3365 MOps/s $\color{#d91a1a}-1.58\%$
test_stack[memmap_tensor0] 0.1491ms 4.6312μs 215.9247 KOps/s 215.9108 KOps/s $+0.01\%$
test_memmaptd_index 1.7790ms 0.2540ms 3.9369 KOps/s 3.8846 KOps/s $\color{#35bf28}+1.35\%$
test_memmaptd_index_astensor 0.6881ms 0.3123ms 3.2020 KOps/s 3.1131 KOps/s $\color{#35bf28}+2.86\%$
test_memmaptd_index_op 0.9414ms 0.5556ms 1.8000 KOps/s 1.6418 KOps/s $\textbf{\color{#35bf28}+9.64\%}$
test_serialize_model 0.1311s 0.1305s 7.6642 Ops/s 7.6747 Ops/s $\color{#d91a1a}-0.14\%$
test_serialize_model_pickle 1.3439s 1.1911s 0.8396 Ops/s 0.8224 Ops/s $\color{#35bf28}+2.09\%$
test_serialize_weights 0.1310s 0.1301s 7.6842 Ops/s 7.7283 Ops/s $\color{#d91a1a}-0.57\%$
test_serialize_weights_returnearly 0.3096s 54.3826ms 18.3882 Ops/s 14.2735 Ops/s $\textbf{\color{#35bf28}+28.83\%}$
test_serialize_weights_pickle 1.3459s 1.2221s 0.8183 Ops/s 0.8221 Ops/s $\color{#d91a1a}-0.46\%$
test_reshape_pytree 49.7620μs 22.4301μs 44.5830 KOps/s 44.2303 KOps/s $\color{#35bf28}+0.80\%$
test_reshape_td 61.8930μs 30.1400μs 33.1784 KOps/s 34.0617 KOps/s $\color{#d91a1a}-2.59\%$
test_view_pytree 0.1406ms 22.9771μs 43.5216 KOps/s 44.4829 KOps/s $\color{#d91a1a}-2.16\%$
test_view_td 0.1374ms 33.3534μs 29.9820 KOps/s 30.6402 KOps/s $\color{#d91a1a}-2.15\%$
test_unbind_pytree 0.1263ms 28.3056μs 35.3287 KOps/s 35.2404 KOps/s $\color{#35bf28}+0.25\%$
test_unbind_td 0.6106ms 36.9885μs 27.0354 KOps/s 26.9644 KOps/s $\color{#35bf28}+0.26\%$
test_split_pytree 0.1237ms 30.4706μs 32.8186 KOps/s 32.4490 KOps/s $\color{#35bf28}+1.14\%$
test_split_td 0.7526ms 38.9561μs 25.6699 KOps/s 24.5135 KOps/s $\color{#35bf28}+4.72\%$
test_add_pytree 0.1435ms 32.6537μs 30.6244 KOps/s 29.0428 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_add_td 82.4350μs 45.5065μs 21.9749 KOps/s 17.9731 KOps/s $\textbf{\color{#35bf28}+22.27\%}$
test_compile_add_one_nested[tensordict-compile] 0.2693ms 0.1231ms 8.1225 KOps/s 7.8394 KOps/s $\color{#35bf28}+3.61\%$
test_compile_add_one_nested[tensordict-eager] 0.2613ms 0.1348ms 7.4208 KOps/s 7.4645 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_nested[pytree-compile] 0.1910ms 96.9087μs 10.3190 KOps/s 10.1050 KOps/s $\color{#35bf28}+2.12\%$
test_compile_add_one_nested[pytree-eager] 1.2963ms 0.1473ms 6.7911 KOps/s 6.5671 KOps/s $\color{#35bf28}+3.41\%$
test_compile_copy_nested[tensordict-compile] 0.1669ms 23.3652μs 42.7986 KOps/s 43.7309 KOps/s $\color{#d91a1a}-2.13\%$
test_compile_copy_nested[tensordict-eager] 0.2108ms 29.6140μs 33.7678 KOps/s 33.5932 KOps/s $\color{#35bf28}+0.52\%$
test_compile_copy_nested[pytree-compile] 0.3695ms 65.3227μs 15.3086 KOps/s 15.0804 KOps/s $\color{#35bf28}+1.51\%$
test_compile_copy_nested[pytree-eager] 0.1360ms 49.4153μs 20.2367 KOps/s 20.0518 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[tensordict-compile] 0.1998ms 0.1415ms 7.0654 KOps/s 6.9203 KOps/s $\color{#35bf28}+2.10\%$
test_compile_add_one_flat[tensordict-eager] 0.3543ms 0.2147ms 4.6567 KOps/s 4.6210 KOps/s $\color{#35bf28}+0.77\%$
test_compile_add_one_flat[tensorclass-compile] 0.2734ms 99.1286μs 10.0879 KOps/s 10.1871 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_add_one_flat[tensorclass-eager] 0.1165ms 54.8205μs 18.2413 KOps/s 18.3460 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_add_one_flat[pytree-compile] 0.1741ms 0.1361ms 7.3460 KOps/s 7.1847 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_flat[pytree-eager] 0.6340ms 0.4802ms 2.0825 KOps/s 2.0271 KOps/s $\color{#35bf28}+2.73\%$
test_compile_add_self_flat[tensordict-eager] 0.3842ms 0.2585ms 3.8692 KOps/s 3.8140 KOps/s $\color{#35bf28}+1.45\%$
test_compile_add_self_flat[tensordict-compile] 0.1843ms 0.1429ms 6.9981 KOps/s 6.7805 KOps/s $\color{#35bf28}+3.21\%$
test_compile_add_self_flat[tensorclass-eager] 0.1805ms 65.8646μs 15.1827 KOps/s 15.0895 KOps/s $\color{#35bf28}+0.62\%$
test_compile_add_self_flat[tensorclass-compile] 0.1357ms 98.6994μs 10.1318 KOps/s 10.1790 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_self_flat[pytree-eager] 0.5746ms 0.4102ms 2.4378 KOps/s 2.3936 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_self_flat[pytree-compile] 0.2933ms 0.1360ms 7.3510 KOps/s 7.3030 KOps/s $\color{#35bf28}+0.66\%$
test_compile_copy_flat[tensordict-compile] 76.3840μs 19.3736μs 51.6166 KOps/s 53.1262 KOps/s $\color{#d91a1a}-2.84\%$
test_compile_copy_flat[tensordict-eager] 55.7630μs 31.1707μs 32.0814 KOps/s 32.2652 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_flat[pytree-compile] 98.3650μs 70.4364μs 14.1972 KOps/s 14.1207 KOps/s $\color{#35bf28}+0.54\%$
test_compile_copy_flat[pytree-eager] 0.1546ms 51.2125μs 19.5265 KOps/s 19.4179 KOps/s $\color{#35bf28}+0.56\%$
test_compile_assign_and_add[tensordict-compile] 1.6987ms 0.4071ms 2.4563 KOps/s 2.1868 KOps/s $\textbf{\color{#35bf28}+12.33\%}$
test_compile_assign_and_add[tensordict-eager] 2.9940ms 2.6460ms 377.9298 Ops/s 382.7390 Ops/s $\color{#d91a1a}-1.26\%$
test_compile_assign_and_add[pytree-compile] 1.5921ms 0.4308ms 2.3215 KOps/s 2.2438 KOps/s $\color{#35bf28}+3.46\%$
test_compile_assign_and_add[pytree-eager] 2.9953ms 2.6171ms 382.1008 Ops/s 378.6867 Ops/s $\color{#35bf28}+0.90\%$
test_compile_indexing[tensor-tensordict-compile] 0.5385ms 0.1146ms 8.7289 KOps/s 8.7879 KOps/s $\color{#d91a1a}-0.67\%$
test_compile_indexing[tensor-tensordict-eager] 0.5613ms 83.2913μs 12.0061 KOps/s 12.5659 KOps/s $\color{#d91a1a}-4.45\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5207ms 0.1065ms 9.3890 KOps/s 9.4768 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_indexing[tensor-tensorclass-eager] 0.4679ms 70.3109μs 14.2226 KOps/s 14.6691 KOps/s $\color{#d91a1a}-3.04\%$
test_compile_indexing[tensor-pytree-compile] 0.5238ms 0.1133ms 8.8232 KOps/s 9.3419 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_compile_indexing[tensor-pytree-eager] 0.1346ms 71.9276μs 13.9029 KOps/s 14.7294 KOps/s $\textbf{\color{#d91a1a}-5.61\%}$
test_compile_indexing[slice-tensordict-compile] 0.2467ms 0.1027ms 9.7371 KOps/s 9.8139 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_indexing[slice-tensordict-eager] 0.4174ms 17.3296μs 57.7047 KOps/s 55.6798 KOps/s $\color{#35bf28}+3.64\%$
test_compile_indexing[slice-tensorclass-compile] 0.5195ms 0.1012ms 9.8840 KOps/s 10.2393 KOps/s $\color{#d91a1a}-3.47\%$
test_compile_indexing[slice-tensorclass-eager] 0.4110ms 16.7546μs 59.6850 KOps/s 61.5907 KOps/s $\color{#d91a1a}-3.09\%$
test_compile_indexing[slice-pytree-compile] 0.5080ms 0.1030ms 9.7089 KOps/s 10.1552 KOps/s $\color{#d91a1a}-4.39\%$
test_compile_indexing[slice-pytree-eager] 41.1120μs 16.1754μs 61.8221 KOps/s 61.9993 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[int-tensordict-compile] 0.5200ms 0.1059ms 9.4443 KOps/s 9.7670 KOps/s $\color{#d91a1a}-3.30\%$
test_compile_indexing[int-tensordict-eager] 0.6192ms 17.2346μs 58.0228 KOps/s 56.0624 KOps/s $\color{#35bf28}+3.50\%$
test_compile_indexing[int-tensorclass-compile] 0.5089ms 0.1001ms 9.9879 KOps/s 10.1932 KOps/s $\color{#d91a1a}-2.01\%$
test_compile_indexing[int-tensorclass-eager] 0.4172ms 16.3005μs 61.3478 KOps/s 62.1560 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_indexing[int-pytree-compile] 0.5050ms 99.4922μs 10.0510 KOps/s 10.1833 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_indexing[int-pytree-eager] 54.6430μs 16.3072μs 61.3228 KOps/s 62.1429 KOps/s $\color{#d91a1a}-1.32\%$
test_mod_add[eager] 0.1644ms 39.9222μs 25.0487 KOps/s 25.6548 KOps/s $\color{#d91a1a}-2.36\%$
test_mod_add[compile] 0.1234ms 79.8465μs 12.5240 KOps/s 12.3436 KOps/s $\color{#35bf28}+1.46\%$
test_mod_add[compile-overhead] 0.3231ms 0.1665ms 6.0061 KOps/s 5.6751 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_mod_wrap[eager] 0.3315ms 0.2454ms 4.0743 KOps/s 3.9537 KOps/s $\color{#35bf28}+3.05\%$
test_mod_wrap[compile] 0.3354ms 0.2829ms 3.5354 KOps/s 3.4593 KOps/s $\color{#35bf28}+2.20\%$
test_mod_wrap[compile-overhead] 7.0650ms 3.7288ms 268.1838 Ops/s 275.7490 Ops/s $\color{#d91a1a}-2.74\%$
test_mod_wrap_and_backward[eager] 1.4987ms 1.3253ms 754.5216 Ops/s 698.4028 Ops/s $\textbf{\color{#35bf28}+8.04\%}$
test_mod_wrap_and_backward[compile] 1.4045ms 1.2609ms 793.0876 Ops/s 727.9055 Ops/s $\textbf{\color{#35bf28}+8.95\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4128ms 0.9243ms 1.0819 KOps/s 905.4966 Ops/s $\textbf{\color{#35bf28}+19.48\%}$
test_seq_add[eager] 0.1720ms 0.1166ms 8.5772 KOps/s 8.2922 KOps/s $\color{#35bf28}+3.44\%$
test_seq_add[compile] 0.1843ms 88.5960μs 11.2872 KOps/s 11.2641 KOps/s $\color{#35bf28}+0.20\%$
test_seq_add[compile-overhead] 0.2068ms 0.1293ms 7.7331 KOps/s 7.6391 KOps/s $\color{#35bf28}+1.23\%$
test_seq_wrap[eager] 0.5723ms 0.4221ms 2.3692 KOps/s 2.3147 KOps/s $\color{#35bf28}+2.36\%$
test_seq_wrap[compile] 0.3921ms 0.3019ms 3.3126 KOps/s 3.2905 KOps/s $\color{#35bf28}+0.67\%$
test_seq_wrap[compile-overhead] 0.2724ms 0.2232ms 4.4803 KOps/s 4.3975 KOps/s $\color{#35bf28}+1.88\%$
test_func_call_runtime[False-eager] 0.8504ms 0.7571ms 1.3209 KOps/s 1.3515 KOps/s $\color{#d91a1a}-2.27\%$
test_func_call_runtime[False-compile] 1.0377ms 0.7475ms 1.3377 KOps/s 1.3234 KOps/s $\color{#35bf28}+1.08\%$
test_func_call_runtime[False-compile-overhead] 0.4468ms 0.3665ms 2.7287 KOps/s 2.7109 KOps/s $\color{#35bf28}+0.66\%$
test_func_call_runtime[True-eager] 0.9971ms 0.9129ms 1.0955 KOps/s 1.0882 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_runtime[True-compile] 0.9129ms 0.7698ms 1.2991 KOps/s 1.2892 KOps/s $\color{#35bf28}+0.77\%$
test_func_call_runtime[True-compile-overhead] 0.4779ms 0.3847ms 2.5997 KOps/s 2.5393 KOps/s $\color{#35bf28}+2.38\%$
test_func_call_cm_runtime[False-eager] 0.8316ms 0.7388ms 1.3536 KOps/s 1.3518 KOps/s $\color{#35bf28}+0.13\%$
test_func_call_cm_runtime[False-compile] 0.8165ms 0.7504ms 1.3326 KOps/s 1.3179 KOps/s $\color{#35bf28}+1.11\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4112ms 0.3679ms 2.7180 KOps/s 2.7003 KOps/s $\color{#35bf28}+0.65\%$
test_func_call_cm_runtime[True-eager] 1.0729ms 0.9856ms 1.0146 KOps/s 998.5466 Ops/s $\color{#35bf28}+1.61\%$
test_func_call_cm_runtime[True-compile] 0.8794ms 0.8126ms 1.2307 KOps/s 1.2476 KOps/s $\color{#d91a1a}-1.36\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4633ms 0.4111ms 2.4323 KOps/s 2.3936 KOps/s $\color{#35bf28}+1.62\%$
test_vmap_func_call_cm_runtime[eager] 2.5215ms 2.0224ms 494.4653 Ops/s 480.7310 Ops/s $\color{#35bf28}+2.86\%$
test_vmap_func_call_cm_runtime[compile] 0.8642ms 0.8115ms 1.2323 KOps/s 1.2111 KOps/s $\color{#35bf28}+1.75\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4762ms 0.4132ms 2.4204 KOps/s 2.3847 KOps/s $\color{#35bf28}+1.50\%$
test_distributed 3.8330ms 0.1786ms 5.5976 KOps/s 8.8246 KOps/s $\textbf{\color{#d91a1a}-36.57\%}$
test_tdmodule 30.8320μs 19.1530μs 52.2112 KOps/s 48.2859 KOps/s $\textbf{\color{#35bf28}+8.13\%}$
test_tdmodule_dispatch 73.8640μs 34.2620μs 29.1869 KOps/s 26.5618 KOps/s $\textbf{\color{#35bf28}+9.88\%}$
test_tdseq 39.8420μs 20.0354μs 49.9117 KOps/s 45.2029 KOps/s $\textbf{\color{#35bf28}+10.42\%}$
test_tdseq_dispatch 59.4730μs 37.3754μs 26.7556 KOps/s 24.3091 KOps/s $\textbf{\color{#35bf28}+10.06\%}$
test_instantiation_functorch 1.6375ms 1.5636ms 639.5352 Ops/s 639.5175 Ops/s $+0.00\%$
test_exec_functorch 0.1960ms 0.1431ms 6.9876 KOps/s 7.0494 KOps/s $\color{#d91a1a}-0.88\%$
test_exec_functional_call 0.1714ms 0.1332ms 7.5103 KOps/s 7.5459 KOps/s $\color{#d91a1a}-0.47\%$
test_exec_td_decorator 0.3695ms 0.1830ms 5.4649 KOps/s 5.4696 KOps/s $\color{#d91a1a}-0.09\%$
test_vmap_mlp_speed_decorator[True-True] 0.7450ms 0.6695ms 1.4936 KOps/s 1.4661 KOps/s $\color{#35bf28}+1.87\%$
test_vmap_mlp_speed_decorator[True-False] 0.8200ms 0.6709ms 1.4904 KOps/s 1.4665 KOps/s $\color{#35bf28}+1.63\%$
test_vmap_mlp_speed_decorator[False-True] 0.7004ms 0.5816ms 1.7193 KOps/s 1.6310 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_vmap_mlp_speed_decorator[False-False] 0.6997ms 0.5817ms 1.7191 KOps/s 1.6184 KOps/s $\textbf{\color{#35bf28}+6.22\%}$
test_vmap_transformer_speed_decorator[True-True] 18.8503ms 18.7492ms 53.3357 Ops/s 52.9530 Ops/s $\color{#35bf28}+0.72\%$
test_vmap_transformer_speed_decorator[True-False] 18.8891ms 18.7896ms 53.2208 Ops/s 52.8809 Ops/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed_decorator[False-True] 18.8210ms 18.6784ms 53.5379 Ops/s 53.4969 Ops/s $\color{#35bf28}+0.08\%$
test_vmap_transformer_speed_decorator[False-False] 18.7181ms 18.6502ms 53.6189 Ops/s 53.2769 Ops/s $\color{#35bf28}+0.64\%$
test_to_module_speed[True] 1.4564ms 0.9754ms 1.0252 KOps/s 1.0212 KOps/s $\color{#35bf28}+0.39\%$
test_to_module_speed[False] 1.3428ms 0.9664ms 1.0348 KOps/s 1.0430 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_init 57.9230μs 35.3218μs 28.3111 KOps/s 25.4340 KOps/s $\textbf{\color{#35bf28}+11.31\%}$
test_tc_init_nested 0.1617ms 73.5688μs 13.5927 KOps/s 12.6452 KOps/s $\textbf{\color{#35bf28}+7.49\%}$
test_tc_first_layer_tensor 17.4053μs 0.7088μs 1.4108 MOps/s 1.4181 MOps/s $\color{#d91a1a}-0.52\%$
test_tc_first_layer_nontensor 33.6120μs 2.3307μs 429.0465 KOps/s 430.5308 KOps/s $\color{#d91a1a}-0.34\%$
test_tc_second_layer_tensor 9.9540μs 1.4257μs 701.4015 KOps/s 658.9653 KOps/s $\textbf{\color{#35bf28}+6.44\%}$
test_tc_second_layer_nontensor 32.7710μs 3.0819μs 324.4740 KOps/s 322.5599 KOps/s $\color{#35bf28}+0.59\%$
test_unbind 0.2312s 10.3338ms 96.7700 Ops/s 142.1190 Ops/s $\textbf{\color{#d91a1a}-31.91\%}$
test_full_like 10.7849ms 9.2811ms 107.7464 Ops/s 106.5092 Ops/s $\color{#35bf28}+1.16\%$
test_zeros_like 5.3230ms 4.3210ms 231.4281 Ops/s 230.8909 Ops/s $\color{#35bf28}+0.23\%$
test_ones_like 4.9377ms 4.3247ms 231.2276 Ops/s 231.0146 Ops/s $\color{#35bf28}+0.09\%$
test_clone 11.5469ms 9.1604ms 109.1654 Ops/s 155.0516 Ops/s $\textbf{\color{#d91a1a}-29.59\%}$
test_squeeze 60.4140μs 9.6854μs 103.2485 KOps/s 104.4934 KOps/s $\color{#d91a1a}-1.19\%$
test_unsqueeze 0.1361ms 78.1480μs 12.7962 KOps/s 13.5362 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_split 0.3890ms 0.1624ms 6.1575 KOps/s 6.1188 KOps/s $\color{#35bf28}+0.63\%$
test_permute 0.2273ms 0.1834ms 5.4532 KOps/s 5.4515 KOps/s $\color{#35bf28}+0.03\%$
test_stack 51.2543ms 50.8594ms 19.6621 Ops/s 19.6851 Ops/s $\color{#d91a1a}-0.12\%$
test_cat 51.1061ms 50.6460ms 19.7449 Ops/s 20.0458 Ops/s $\color{#d91a1a}-1.50\%$

@vmoens vmoens merged commit 6b2f290 into gh/vmoens/40/base Dec 19, 2024
49 of 55 checks passed
vmoens added a commit that referenced this pull request Dec 19, 2024
ghstack-source-id: 23fffb80e79bb839b34178cf5e20faea7a8115c5
Pull Request resolved: #1151
@vmoens vmoens deleted the gh/vmoens/40/head branch December 19, 2024 13:34