[BugFix] Use correct default cuda device #1161

Merged (1 commit) on Jan 7, 2025

@vmoens (Contributor) commented Jan 7, 2025

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 9afb5b03ddf75afec357e9e54caadfc92ebf4ded
Pull Request resolved: #1161
@facebook-github-bot added the CLA Signed label Jan 7, 2025
@vmoens linked an issue that may be closed by this pull request Jan 7, 2025 (3 tasks)
@vmoens added the bug label Jan 7, 2025
@vmoens merged commit 967fb25 into gh/vmoens/42/base Jan 7, 2025
36 of 37 checks passed
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 9afb5b03ddf75afec357e9e54caadfc92ebf4ded
Pull Request resolved: #1161
@vmoens deleted the gh/vmoens/42/head branch January 7, 2025 11:28

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}38$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 51.1560μs 22.5351μs 44.3752 KOps/s 47.5697 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_plain_set_stack_nested 66.8650μs 22.4793μs 44.4854 KOps/s 46.7984 KOps/s $\color{#d91a1a}-4.94\%$
test_plain_set_nested_inplace 56.7560μs 24.0769μs 41.5336 KOps/s 41.0582 KOps/s $\color{#35bf28}+1.16\%$
test_plain_set_stack_nested_inplace 79.6300μs 24.1452μs 41.4161 KOps/s 42.8944 KOps/s $\color{#d91a1a}-3.45\%$
test_items 25.1470μs 4.1751μs 239.5155 KOps/s 237.6470 KOps/s $\color{#35bf28}+0.79\%$
test_items_nested 0.6204ms 0.4024ms 2.4849 KOps/s 2.4560 KOps/s $\color{#35bf28}+1.18\%$
test_items_nested_locked 0.8290ms 0.4041ms 2.4748 KOps/s 2.4364 KOps/s $\color{#35bf28}+1.58\%$
test_items_nested_leaf 0.1474ms 77.3365μs 12.9305 KOps/s 12.8379 KOps/s $\color{#35bf28}+0.72\%$
test_items_stack_nested 0.8339ms 0.4048ms 2.4702 KOps/s 2.4538 KOps/s $\color{#35bf28}+0.67\%$
test_items_stack_nested_leaf 0.1500ms 78.8436μs 12.6833 KOps/s 12.5081 KOps/s $\color{#35bf28}+1.40\%$
test_items_stack_nested_locked 0.5260ms 0.4059ms 2.4636 KOps/s 2.4313 KOps/s $\color{#35bf28}+1.33\%$
test_keys 23.7340μs 3.4975μs 285.9178 KOps/s 279.5141 KOps/s $\color{#35bf28}+2.29\%$
test_keys_nested 0.2266ms 0.1636ms 6.1133 KOps/s 6.0532 KOps/s $\color{#35bf28}+0.99\%$
test_keys_nested_locked 0.8332ms 0.1710ms 5.8479 KOps/s 5.7794 KOps/s $\color{#35bf28}+1.18\%$
test_keys_nested_leaf 0.2238ms 0.1431ms 6.9858 KOps/s 6.8072 KOps/s $\color{#35bf28}+2.62\%$
test_keys_stack_nested 0.3428ms 0.1650ms 6.0609 KOps/s 5.9797 KOps/s $\color{#35bf28}+1.36\%$
test_keys_stack_nested_leaf 0.2068ms 0.1425ms 7.0188 KOps/s 6.9653 KOps/s $\color{#35bf28}+0.77\%$
test_keys_stack_nested_locked 0.3122ms 0.1708ms 5.8545 KOps/s 5.7856 KOps/s $\color{#35bf28}+1.19\%$
test_values 9.8466μs 1.0625μs 941.1791 KOps/s 914.7289 KOps/s $\color{#35bf28}+2.89\%$
test_values_nested 0.1354ms 62.3268μs 16.0445 KOps/s 15.8808 KOps/s $\color{#35bf28}+1.03\%$
test_values_nested_locked 0.1198ms 62.1124μs 16.0999 KOps/s 15.8111 KOps/s $\color{#35bf28}+1.83\%$
test_values_nested_leaf 0.1583ms 71.7873μs 13.9300 KOps/s 13.3383 KOps/s $\color{#35bf28}+4.44\%$
test_values_stack_nested 0.1215ms 62.5333μs 15.9915 KOps/s 15.7583 KOps/s $\color{#35bf28}+1.48\%$
test_values_stack_nested_leaf 0.1268ms 71.3636μs 14.0128 KOps/s 13.5710 KOps/s $\color{#35bf28}+3.26\%$
test_values_stack_nested_locked 0.1249ms 63.4397μs 15.7630 KOps/s 15.6838 KOps/s $\color{#35bf28}+0.51\%$
test_membership 2.5688μs 0.6980μs 1.4327 MOps/s 1.0864 MOps/s $\textbf{\color{#35bf28}+31.88\%}$
test_membership_nested 41.9290μs 2.9709μs 336.6016 KOps/s 335.6417 KOps/s $\color{#35bf28}+0.29\%$
test_membership_nested_leaf 28.4840μs 2.9429μs 339.8024 KOps/s 337.0828 KOps/s $\color{#35bf28}+0.81\%$
test_membership_stacked_nested 44.7940μs 2.9201μs 342.4536 KOps/s 331.6355 KOps/s $\color{#35bf28}+3.26\%$
test_membership_stacked_nested_leaf 27.2510μs 2.9032μs 344.4430 KOps/s 334.0766 KOps/s $\color{#35bf28}+3.10\%$
test_membership_nested_last 35.9270μs 4.3820μs 228.2047 KOps/s 224.2816 KOps/s $\color{#35bf28}+1.75\%$
test_membership_nested_leaf_last 29.4050μs 4.4280μs 225.8373 KOps/s 222.7880 KOps/s $\color{#35bf28}+1.37\%$
test_membership_stacked_nested_last 33.1320μs 4.4470μs 224.8683 KOps/s 224.7291 KOps/s $\color{#35bf28}+0.06\%$
test_membership_stacked_nested_leaf_last 47.1780μs 4.4900μs 222.7186 KOps/s 226.3257 KOps/s $\color{#d91a1a}-1.59\%$
test_nested_getleaf 53.2200μs 10.8259μs 92.3707 KOps/s 92.6302 KOps/s $\color{#d91a1a}-0.28\%$
test_nested_get 36.9690μs 10.2930μs 97.1532 KOps/s 99.4984 KOps/s $\color{#d91a1a}-2.36\%$
test_stacked_getleaf 35.4460μs 10.6700μs 93.7207 KOps/s 93.4567 KOps/s $\color{#35bf28}+0.28\%$
test_stacked_get 55.4740μs 10.2958μs 97.1273 KOps/s 97.8267 KOps/s $\color{#d91a1a}-0.71\%$
test_nested_getitemleaf 61.7690μs 11.1832μs 89.4201 KOps/s 89.7020 KOps/s $\color{#d91a1a}-0.31\%$
test_nested_getitem 31.8500μs 10.4766μs 95.4510 KOps/s 96.2504 KOps/s $\color{#d91a1a}-0.83\%$
test_stacked_getitemleaf 61.7850μs 10.7860μs 92.7124 KOps/s 88.8494 KOps/s $\color{#35bf28}+4.35\%$
test_stacked_getitem 35.3260μs 10.4255μs 95.9189 KOps/s 96.1860 KOps/s $\color{#d91a1a}-0.28\%$
test_lock_nested 6.8719ms 0.4735ms 2.1121 KOps/s 2.1847 KOps/s $\color{#d91a1a}-3.32\%$
test_lock_stack_nested 0.7278ms 0.4364ms 2.2916 KOps/s 2.3239 KOps/s $\color{#d91a1a}-1.39\%$
test_unlock_nested 0.9233ms 0.3875ms 2.5806 KOps/s 2.6492 KOps/s $\color{#d91a1a}-2.59\%$
test_unlock_stack_nested 0.6580ms 0.3554ms 2.8135 KOps/s 2.8729 KOps/s $\color{#d91a1a}-2.07\%$
test_flatten_speed 0.1757ms 0.1005ms 9.9536 KOps/s 9.9474 KOps/s $\color{#35bf28}+0.06\%$
test_unflatten_speed 0.6815ms 0.5323ms 1.8787 KOps/s 1.8383 KOps/s $\color{#35bf28}+2.20\%$
test_common_ops 1.7740ms 0.8290ms 1.2063 KOps/s 1.2919 KOps/s $\textbf{\color{#d91a1a}-6.63\%}$
test_creation 19.7370μs 2.4873μs 402.0424 KOps/s 394.1277 KOps/s $\color{#35bf28}+2.01\%$
test_creation_empty 35.8970μs 13.2192μs 75.6478 KOps/s 91.8668 KOps/s $\textbf{\color{#d91a1a}-17.65\%}$
test_creation_nested_1 44.7940μs 16.1109μs 62.0697 KOps/s 72.3997 KOps/s $\textbf{\color{#d91a1a}-14.27\%}$
test_creation_nested_2 74.1560μs 20.5064μs 48.7654 KOps/s 54.8130 KOps/s $\textbf{\color{#d91a1a}-11.03\%}$
test_clone 57.2670μs 13.4182μs 74.5254 KOps/s 72.6537 KOps/s $\color{#35bf28}+2.58\%$
test_getitem[int] 1.2903ms 13.2139μs 75.6780 KOps/s 77.5645 KOps/s $\color{#d91a1a}-2.43\%$
test_getitem[slice_int] 0.1468ms 25.1435μs 39.7718 KOps/s 41.4749 KOps/s $\color{#d91a1a}-4.11\%$
test_getitem[range] 0.1781ms 48.9485μs 20.4296 KOps/s 20.6459 KOps/s $\color{#d91a1a}-1.05\%$
test_getitem[tuple] 0.1391ms 20.8199μs 48.0310 KOps/s 49.1941 KOps/s $\color{#d91a1a}-2.36\%$
test_getitem[list] 0.3046ms 44.4635μs 22.4904 KOps/s 23.5438 KOps/s $\color{#d91a1a}-4.47\%$
test_setitem_dim[int] 64.4510μs 25.0852μs 39.8641 KOps/s 39.9649 KOps/s $\color{#d91a1a}-0.25\%$
test_setitem_dim[slice_int] 93.0350μs 51.6956μs 19.3440 KOps/s 19.9817 KOps/s $\color{#d91a1a}-3.19\%$
test_setitem_dim[range] 0.1264ms 74.5407μs 13.4155 KOps/s 13.7392 KOps/s $\color{#d91a1a}-2.36\%$
test_setitem_dim[tuple] 88.9870μs 40.6694μs 24.5885 KOps/s 25.4883 KOps/s $\color{#d91a1a}-3.53\%$
test_setitem 0.1426ms 21.7150μs 46.0511 KOps/s 49.3700 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_set 0.1177ms 21.3153μs 46.9146 KOps/s 50.3841 KOps/s $\textbf{\color{#d91a1a}-6.89\%}$
test_set_shared 3.7082ms 0.1734ms 5.7678 KOps/s 5.7862 KOps/s $\color{#d91a1a}-0.32\%$
test_update 0.1763ms 25.5489μs 39.1406 KOps/s 45.7166 KOps/s $\textbf{\color{#d91a1a}-14.38\%}$
test_update_nested 0.1518ms 35.6299μs 28.0663 KOps/s 30.9256 KOps/s $\textbf{\color{#d91a1a}-9.25\%}$
test_update__nested 0.5394ms 34.6944μs 28.8231 KOps/s 29.3627 KOps/s $\color{#d91a1a}-1.84\%$
test_set_nested 81.9420μs 23.6226μs 42.3323 KOps/s 45.6352 KOps/s $\textbf{\color{#d91a1a}-7.24\%}$
test_set_nested_new 0.1195ms 28.4904μs 35.0995 KOps/s 37.1322 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_select 0.1443ms 45.0277μs 22.2086 KOps/s 22.8965 KOps/s $\color{#d91a1a}-3.00\%$
test_select_nested 0.1253ms 62.6160μs 15.9704 KOps/s 15.3775 KOps/s $\color{#35bf28}+3.86\%$
test_exclude_nested 0.1485ms 81.2071μs 12.3142 KOps/s 12.1456 KOps/s $\color{#35bf28}+1.39\%$
test_empty[True] 0.6916ms 0.4083ms 2.4491 KOps/s 2.3996 KOps/s $\color{#35bf28}+2.06\%$
test_empty[False] 13.7358μs 1.4261μs 701.1886 KOps/s 713.3399 KOps/s $\color{#d91a1a}-1.70\%$
test_unbind_speed 0.4804ms 0.2787ms 3.5878 KOps/s 3.6946 KOps/s $\color{#d91a1a}-2.89\%$
test_unbind_speed_stack0 0.4461ms 0.2775ms 3.6039 KOps/s 3.7021 KOps/s $\color{#d91a1a}-2.65\%$
test_unbind_speed_stack1 0.1129s 0.8364ms 1.1957 KOps/s 1.4921 KOps/s $\textbf{\color{#d91a1a}-19.87\%}$
test_split 2.5217ms 1.6137ms 619.6866 Ops/s 560.4651 Ops/s $\textbf{\color{#35bf28}+10.57\%}$
test_chunk 0.1126s 1.9692ms 507.8129 Ops/s 561.5805 Ops/s $\textbf{\color{#d91a1a}-9.57\%}$
test_consolidate_njt[False-None] 9.5909ms 8.3621ms 119.5873 Ops/s 120.8374 Ops/s $\color{#d91a1a}-1.03\%$
test_creation[device0] 0.2866ms 91.4746μs 10.9320 KOps/s 10.7942 KOps/s $\color{#35bf28}+1.28\%$
test_creation_from_tensor 4.7156ms 95.8346μs 10.4346 KOps/s 10.0323 KOps/s $\color{#35bf28}+4.01\%$
test_add_one[memmap_tensor0] 0.1503ms 5.2123μs 191.8542 KOps/s 202.2188 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_contiguous[memmap_tensor0] 12.4330μs 0.5309μs 1.8836 MOps/s 1.8836 MOps/s $-0.00\%$
test_stack[memmap_tensor0] 60.0620μs 3.5750μs 279.7236 KOps/s 281.2384 KOps/s $\color{#d91a1a}-0.54\%$
test_memmaptd_index 1.1016ms 0.2450ms 4.0824 KOps/s 4.1706 KOps/s $\color{#d91a1a}-2.12\%$
test_memmaptd_index_astensor 0.6109ms 0.3337ms 2.9964 KOps/s 3.0596 KOps/s $\color{#d91a1a}-2.07\%$
test_memmaptd_index_op 1.0577ms 0.6416ms 1.5587 KOps/s 1.6816 KOps/s $\textbf{\color{#d91a1a}-7.31\%}$
test_serialize_model 0.1267s 0.1190s 8.4027 Ops/s 7.3436 Ops/s $\textbf{\color{#35bf28}+14.42\%}$
test_serialize_model_pickle 0.4438s 0.3859s 2.5915 Ops/s 2.5160 Ops/s $\color{#35bf28}+3.00\%$
test_serialize_weights 0.1276s 0.1173s 8.5260 Ops/s 8.5367 Ops/s $\color{#d91a1a}-0.12\%$
test_serialize_weights_returnearly 0.1821s 0.1630s 6.1334 Ops/s 6.3767 Ops/s $\color{#d91a1a}-3.82\%$
test_serialize_weights_pickle 0.5383s 0.4430s 2.2575 Ops/s 2.4767 Ops/s $\textbf{\color{#d91a1a}-8.85\%}$
test_serialize_weights_filesystem 0.1524s 0.1460s 6.8482 Ops/s 6.9318 Ops/s $\color{#d91a1a}-1.21\%$
test_serialize_model_filesystem 0.1627s 0.1521s 6.5742 Ops/s 6.3939 Ops/s $\color{#35bf28}+2.82\%$
test_reshape_pytree 57.9790μs 26.7320μs 37.4083 KOps/s 37.3594 KOps/s $\color{#35bf28}+0.13\%$
test_reshape_td 83.9470μs 33.7528μs 29.6271 KOps/s 30.3985 KOps/s $\color{#d91a1a}-2.54\%$
test_view_pytree 70.3820μs 26.9967μs 37.0415 KOps/s 37.3066 KOps/s $\color{#d91a1a}-0.71\%$
test_view_td 95.9700μs 38.0381μs 26.2895 KOps/s 25.6340 KOps/s $\color{#35bf28}+2.56\%$
test_unbind_pytree 92.3330μs 29.7503μs 33.6132 KOps/s 33.3326 KOps/s $\color{#35bf28}+0.84\%$
test_unbind_td 0.3678ms 41.2075μs 24.2674 KOps/s 25.5747 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_split_pytree 61.7150μs 29.7283μs 33.6379 KOps/s 34.1518 KOps/s $\color{#d91a1a}-1.50\%$
test_split_td 0.6003ms 45.4562μs 21.9992 KOps/s 22.3536 KOps/s $\color{#d91a1a}-1.59\%$
test_add_pytree 88.6460μs 35.6102μs 28.0819 KOps/s 28.0746 KOps/s $\color{#35bf28}+0.03\%$
test_add_td 0.1467ms 60.9259μs 16.4134 KOps/s 18.0976 KOps/s $\textbf{\color{#d91a1a}-9.31\%}$
test_compile_add_one_nested[tensordict-compile] 0.1265ms 63.2435μs 15.8119 KOps/s 15.7418 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_nested[tensordict-eager] 0.4521ms 0.1751ms 5.7105 KOps/s 5.9110 KOps/s $\color{#d91a1a}-3.39\%$
test_compile_add_one_nested[pytree-compile] 0.1598ms 46.1125μs 21.6861 KOps/s 22.0663 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_one_nested[pytree-eager] 0.2556ms 0.1200ms 8.3317 KOps/s 8.4309 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_nested[tensordict-compile] 86.8830μs 25.7203μs 38.8799 KOps/s 37.9865 KOps/s $\color{#35bf28}+2.35\%$
test_compile_copy_nested[tensordict-eager] 0.1292ms 59.1181μs 16.9153 KOps/s 16.9534 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_copy_nested[pytree-compile] 0.1658ms 77.7741μs 12.8577 KOps/s 12.5977 KOps/s $\color{#35bf28}+2.06\%$
test_compile_copy_nested[pytree-eager] 0.1422ms 67.4701μs 14.8214 KOps/s 14.6598 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_one_flat[tensordict-compile] 0.1900ms 0.1064ms 9.3986 KOps/s 9.4933 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_one_flat[tensordict-eager] 0.4609ms 0.2220ms 4.5047 KOps/s 4.7070 KOps/s $\color{#d91a1a}-4.30\%$
test_compile_add_one_flat[tensorclass-compile] 99.0150μs 45.3961μs 22.0283 KOps/s 22.1877 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_add_one_flat[tensorclass-eager] 0.5749ms 68.2118μs 14.6602 KOps/s 15.7102 KOps/s $\textbf{\color{#d91a1a}-6.68\%}$
test_compile_add_one_flat[pytree-compile] 0.1949ms 0.1053ms 9.4965 KOps/s 9.8233 KOps/s $\color{#d91a1a}-3.33\%$
test_compile_add_one_flat[pytree-eager] 0.4097ms 0.2046ms 4.8880 KOps/s 4.9453 KOps/s $\color{#d91a1a}-1.16\%$
test_compile_add_self_flat[tensordict-eager] 0.4853ms 0.2332ms 4.2880 KOps/s 4.3105 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_add_self_flat[tensordict-compile] 0.2004ms 0.1058ms 9.4494 KOps/s 9.4713 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_add_self_flat[tensorclass-eager] 0.1555ms 60.7217μs 16.4686 KOps/s 17.1479 KOps/s $\color{#d91a1a}-3.96\%$
test_compile_add_self_flat[tensorclass-compile] 0.1033ms 49.0355μs 20.3934 KOps/s 21.0376 KOps/s $\color{#d91a1a}-3.06\%$
test_compile_add_self_flat[pytree-eager] 0.2514ms 0.1611ms 6.2068 KOps/s 6.3457 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_add_self_flat[pytree-compile] 0.1943ms 0.1076ms 9.2958 KOps/s 9.5102 KOps/s $\color{#d91a1a}-2.25\%$
test_compile_copy_flat[tensordict-compile] 66.7760μs 22.6185μs 44.2116 KOps/s 46.4033 KOps/s $\color{#d91a1a}-4.72\%$
test_compile_copy_flat[tensordict-eager] 0.1462ms 66.1071μs 15.1270 KOps/s 15.3560 KOps/s $\color{#d91a1a}-1.49\%$
test_compile_copy_flat[pytree-compile] 0.1748ms 80.7990μs 12.3764 KOps/s 12.3707 KOps/s $\color{#35bf28}+0.05\%$
test_compile_copy_flat[pytree-eager] 0.1566ms 68.8963μs 14.5146 KOps/s 14.5927 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_assign_and_add[tensordict-compile] 0.4221ms 0.2158ms 4.6341 KOps/s 4.7858 KOps/s $\color{#d91a1a}-3.17\%$
test_compile_assign_and_add[tensordict-eager] 1.4794ms 1.3418ms 745.2697 Ops/s 762.9671 Ops/s $\color{#d91a1a}-2.32\%$
test_compile_assign_and_add[pytree-compile] 0.3074ms 0.2070ms 4.8311 KOps/s 4.9290 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_assign_and_add[pytree-eager] 1.0370ms 0.7967ms 1.2552 KOps/s 1.2854 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_assign_and_add_stack[compile] 1.0400ms 0.4693ms 2.1307 KOps/s 2.1852 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_assign_and_add_stack[eager] 3.9153ms 2.9981ms 333.5419 Ops/s 379.7694 Ops/s $\textbf{\color{#d91a1a}-12.17\%}$
test_compile_indexing[tensor-tensordict-compile] 90.1890μs 37.3387μs 26.7819 KOps/s 27.9863 KOps/s $\color{#d91a1a}-4.30\%$
test_compile_indexing[tensor-tensordict-eager] 0.5922ms 35.0066μs 28.5661 KOps/s 30.5827 KOps/s $\textbf{\color{#d91a1a}-6.59\%}$
test_compile_indexing[tensor-tensorclass-compile] 98.7750μs 30.3378μs 32.9621 KOps/s 34.7493 KOps/s $\textbf{\color{#d91a1a}-5.14\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1106ms 23.4013μs 42.7326 KOps/s 40.2818 KOps/s $\textbf{\color{#35bf28}+6.08\%}$
test_compile_indexing[tensor-pytree-compile] 81.7230μs 31.3010μs 31.9479 KOps/s 34.1422 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_compile_indexing[tensor-pytree-eager] 62.9980μs 23.4776μs 42.5938 KOps/s 43.3736 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_indexing[slice-tensordict-compile] 0.1472ms 53.8731μs 18.5621 KOps/s 19.9375 KOps/s $\textbf{\color{#d91a1a}-6.90\%}$
test_compile_indexing[slice-tensordict-eager] 0.5607ms 20.0367μs 49.9083 KOps/s 50.2563 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_indexing[slice-tensorclass-compile] 0.1355ms 45.5639μs 21.9472 KOps/s 23.2529 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_compile_indexing[slice-tensorclass-eager] 57.5480μs 19.0411μs 52.5178 KOps/s 53.0936 KOps/s $\color{#d91a1a}-1.08\%$
test_compile_indexing[slice-pytree-compile] 0.1036ms 47.1113μs 21.2263 KOps/s 22.9501 KOps/s $\textbf{\color{#d91a1a}-7.51\%}$
test_compile_indexing[slice-pytree-eager] 67.9180μs 18.8082μs 53.1684 KOps/s 53.1754 KOps/s $\color{#d91a1a}-0.01\%$
test_compile_indexing[int-tensordict-compile] 0.1354ms 55.0499μs 18.1653 KOps/s 19.6731 KOps/s $\textbf{\color{#d91a1a}-7.66\%}$
test_compile_indexing[int-tensordict-eager] 1.0639ms 20.1245μs 49.6907 KOps/s 50.3605 KOps/s $\color{#d91a1a}-1.33\%$
test_compile_indexing[int-tensorclass-compile] 0.1570ms 46.4096μs 21.5473 KOps/s 23.0145 KOps/s $\textbf{\color{#d91a1a}-6.38\%}$
test_compile_indexing[int-tensorclass-eager] 65.9240μs 18.6845μs 53.5202 KOps/s 53.2914 KOps/s $\color{#35bf28}+0.43\%$
test_compile_indexing[int-pytree-compile] 0.1522ms 46.2260μs 21.6328 KOps/s 22.9859 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_compile_indexing[int-pytree-eager] 0.5755ms 18.8678μs 53.0003 KOps/s 53.0434 KOps/s $\color{#d91a1a}-0.08\%$
test_mod_add[eager] 94.4570μs 36.8455μs 27.1404 KOps/s 29.8416 KOps/s $\textbf{\color{#d91a1a}-9.05\%}$
test_mod_add[compile] 99.8670μs 49.8559μs 20.0578 KOps/s 20.8412 KOps/s $\color{#d91a1a}-3.76\%$
test_mod_add[compile-overhead] 0.1357ms 50.2008μs 19.9200 KOps/s 20.7435 KOps/s $\color{#d91a1a}-3.97\%$
test_mod_wrap[eager] 0.4473ms 0.2334ms 4.2845 KOps/s 4.4548 KOps/s $\color{#d91a1a}-3.82\%$
test_mod_wrap[compile] 0.3487ms 0.2114ms 4.7304 KOps/s 4.7903 KOps/s $\color{#d91a1a}-1.25\%$
test_mod_wrap[compile-overhead] 0.3884ms 0.2098ms 4.7665 KOps/s 4.8886 KOps/s $\color{#d91a1a}-2.50\%$
test_mod_wrap_and_backward[eager] 22.4251ms 13.5377ms 73.8677 Ops/s 83.0659 Ops/s $\textbf{\color{#d91a1a}-11.07\%}$
test_mod_wrap_and_backward[compile] 22.5372ms 14.2634ms 70.1097 Ops/s 82.3573 Ops/s $\textbf{\color{#d91a1a}-14.87\%}$
test_mod_wrap_and_backward[compile-overhead] 15.3021ms 12.6371ms 79.1319 Ops/s 72.1689 Ops/s $\textbf{\color{#35bf28}+9.65\%}$
test_seq_add[eager] 0.2914ms 0.1201ms 8.3268 KOps/s 8.7889 KOps/s $\textbf{\color{#d91a1a}-5.26\%}$
test_seq_add[compile] 0.1310ms 64.9783μs 15.3898 KOps/s 16.0996 KOps/s $\color{#d91a1a}-4.41\%$
test_seq_add[compile-overhead] 0.1404ms 62.8732μs 15.9050 KOps/s 16.6964 KOps/s $\color{#d91a1a}-4.74\%$
test_seq_wrap[eager] 0.6339ms 0.4651ms 2.1502 KOps/s 2.2655 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_seq_wrap[compile] 0.4797ms 0.2369ms 4.2215 KOps/s 4.3921 KOps/s $\color{#d91a1a}-3.88\%$
test_seq_wrap[compile-overhead] 0.4058ms 0.2362ms 4.2345 KOps/s 4.4192 KOps/s $\color{#d91a1a}-4.18\%$
test_func_call_runtime[False-eager] 0.8034ms 0.5599ms 1.7859 KOps/s 1.8637 KOps/s $\color{#d91a1a}-4.18\%$
test_func_call_runtime[False-compile] 0.8874ms 0.4474ms 2.2350 KOps/s 2.3437 KOps/s $\color{#d91a1a}-4.64\%$
test_func_call_runtime[False-compile-overhead] 0.5714ms 0.4363ms 2.2920 KOps/s 2.3395 KOps/s $\color{#d91a1a}-2.03\%$
test_func_call_runtime[True-eager] 1.5906ms 0.8049ms 1.2424 KOps/s 1.3138 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_func_call_runtime[True-compile] 0.6362ms 0.4814ms 2.0775 KOps/s 2.1385 KOps/s $\color{#d91a1a}-2.85\%$
test_func_call_runtime[True-compile-overhead] 0.6165ms 0.4825ms 2.0725 KOps/s 2.1404 KOps/s $\color{#d91a1a}-3.17\%$
test_func_call_cm_runtime[False-eager] 0.9684ms 0.5686ms 1.7587 KOps/s 1.8389 KOps/s $\color{#d91a1a}-4.36\%$
test_func_call_cm_runtime[False-compile] 0.5519ms 0.4386ms 2.2799 KOps/s 2.3421 KOps/s $\color{#d91a1a}-2.65\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6015ms 0.4352ms 2.2977 KOps/s 2.3479 KOps/s $\color{#d91a1a}-2.14\%$
test_func_call_cm_runtime[True-eager] 1.1695ms 0.9301ms 1.0752 KOps/s 1.0866 KOps/s $\color{#d91a1a}-1.05\%$
test_func_call_cm_runtime[True-compile] 0.8407ms 0.5093ms 1.9635 KOps/s 2.0281 KOps/s $\color{#d91a1a}-3.18\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6378ms 0.5050ms 1.9802 KOps/s 2.0373 KOps/s $\color{#d91a1a}-2.80\%$
test_vmap_func_call_cm_runtime[eager] 3.3414ms 2.0166ms 495.8930 Ops/s 516.2709 Ops/s $\color{#d91a1a}-3.95\%$
test_vmap_func_call_cm_runtime[compile] 0.9404ms 0.5368ms 1.8630 KOps/s 1.8899 KOps/s $\color{#d91a1a}-1.43\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6429ms 0.5307ms 1.8844 KOps/s 1.9149 KOps/s $\color{#d91a1a}-1.59\%$
test_distributed 0.3771ms 0.1260ms 7.9393 KOps/s 7.8254 KOps/s $\color{#35bf28}+1.46\%$
test_tdmodule 44.9340μs 26.9087μs 37.1628 KOps/s 38.3178 KOps/s $\color{#d91a1a}-3.01\%$
test_tdmodule_dispatch 82.9250μs 50.0307μs 19.9877 KOps/s 19.2552 KOps/s $\color{#35bf28}+3.80\%$
test_tdseq 51.7570μs 30.3095μs 32.9930 KOps/s 35.3183 KOps/s $\textbf{\color{#d91a1a}-6.58\%}$
test_tdseq_dispatch 81.3530μs 56.2087μs 17.7909 KOps/s 19.0982 KOps/s $\textbf{\color{#d91a1a}-6.85\%}$
test_instantiation_functorch 3.2451ms 1.5565ms 642.4553 Ops/s 657.5907 Ops/s $\color{#d91a1a}-2.30\%$
test_exec_functorch 0.3565ms 0.1835ms 5.4509 KOps/s 5.5725 KOps/s $\color{#d91a1a}-2.18\%$
test_exec_functional_call 0.2902ms 0.1746ms 5.7277 KOps/s 5.9685 KOps/s $\color{#d91a1a}-4.03\%$
test_exec_td_decorator 0.5495ms 0.2381ms 4.2002 KOps/s 4.3441 KOps/s $\color{#d91a1a}-3.31\%$
test_vmap_mlp_speed_decorator[True-True] 0.8649ms 0.6620ms 1.5106 KOps/s 1.5120 KOps/s $\color{#d91a1a}-0.09\%$
test_vmap_mlp_speed_decorator[True-False] 5.8787ms 0.6685ms 1.4959 KOps/s 1.4905 KOps/s $\color{#35bf28}+0.36\%$
test_vmap_mlp_speed_decorator[False-True] 1.0137ms 0.5410ms 1.8483 KOps/s 1.8684 KOps/s $\color{#d91a1a}-1.07\%$
test_vmap_mlp_speed_decorator[False-False] 0.8861ms 0.5354ms 1.8678 KOps/s 1.8738 KOps/s $\color{#d91a1a}-0.32\%$
test_to_module_speed[True] 1.9430ms 1.3529ms 739.1685 Ops/s 738.1826 Ops/s $\color{#35bf28}+0.13\%$
test_to_module_speed[False] 1.7953ms 1.3340ms 749.6432 Ops/s 762.2430 Ops/s $\color{#d91a1a}-1.65\%$
test_tc_init 87.7650μs 47.8508μs 20.8983 KOps/s 20.2492 KOps/s $\color{#35bf28}+3.21\%$
test_tc_init_nested 0.1683ms 94.2051μs 10.6151 KOps/s 10.4695 KOps/s $\color{#35bf28}+1.39\%$
test_tc_first_layer_tensor 13.1650μs 1.5503μs 645.0163 KOps/s 620.3138 KOps/s $\color{#35bf28}+3.98\%$
test_tc_first_layer_nontensor 34.0840μs 4.7412μs 210.9162 KOps/s 209.9560 KOps/s $\color{#35bf28}+0.46\%$
test_tc_second_layer_tensor 32.2310μs 2.8773μs 347.5485 KOps/s 336.3533 KOps/s $\color{#35bf28}+3.33\%$
test_tc_second_layer_nontensor 37.9710μs 6.0931μs 164.1213 KOps/s 163.2234 KOps/s $\color{#35bf28}+0.55\%$
test_unbind 0.2322s 14.7087ms 67.9871 Ops/s 74.5443 Ops/s $\textbf{\color{#d91a1a}-8.80\%}$
test_full_like 11.1449ms 9.3367ms 107.1043 Ops/s 82.8222 Ops/s $\textbf{\color{#35bf28}+29.32\%}$
test_zeros_like 4.0458ms 3.4781ms 287.5105 Ops/s 134.4528 Ops/s $\textbf{\color{#35bf28}+113.84\%}$
test_ones_like 4.3997ms 3.8858ms 257.3488 Ops/s 119.6149 Ops/s $\textbf{\color{#35bf28}+115.15\%}$
test_clone 9.1585ms 5.7859ms 172.8333 Ops/s 97.7328 Ops/s $\textbf{\color{#35bf28}+76.84\%}$
test_squeeze 70.8930μs 12.3729μs 80.8219 KOps/s 80.6264 KOps/s $\color{#35bf28}+0.24\%$
test_unsqueeze 0.1967ms 94.9608μs 10.5307 KOps/s 10.6672 KOps/s $\color{#d91a1a}-1.28\%$
test_split 0.5240ms 0.1979ms 5.0521 KOps/s 5.0933 KOps/s $\color{#d91a1a}-0.81\%$
test_permute 0.4082ms 0.2131ms 4.6916 KOps/s 4.7196 KOps/s $\color{#d91a1a}-0.59\%$
test_stack 28.3458ms 24.0717ms 41.5426 Ops/s 36.9314 Ops/s $\textbf{\color{#35bf28}+12.49\%}$
test_cat 29.1371ms 23.6444ms 42.2934 Ops/s 38.0193 Ops/s $\textbf{\color{#35bf28}+11.24\%}$
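
The Change column in the table above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column (the baseline). The bot's script is not shown here, so the following is only a minimal sketch under that assumption. The 5% threshold is likewise inferred from the rows rather than confirmed: -4.94% is not bold while -5.09% is, and the bold green and bold red rows in the CPU table add up to the reported 11 improved and 38 worsened. The function names `relative_change` and `classify` are hypothetical.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to the Repo HEAD baseline.

    Both arguments must use the same unit (Ops/s, KOps/s or MOps/s).
    """
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a benchmark row.

    The 5% default is an assumption inferred from which rows are bold
    in the table; it is not taken from the benchmark bot's source.
    """
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (44.3752, 47.5697),  # KOps/s
    "test_membership": (1.4327, 1.0864),          # MOps/s
    "test_items": (239.5155, 237.6470),           # KOps/s
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table reports rounded throughput values, a recomputed percentage may differ from the printed Change value in the last digit.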


github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}57$. Worsened: $\large\color{#d91a1a}15$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 36.4500μs 11.3558μs 88.0608 KOps/s 76.9000 KOps/s $\textbf{\color{#35bf28}+14.51\%}$
test_plain_set_stack_nested 34.7110μs 11.4686μs 87.1944 KOps/s 74.7964 KOps/s $\textbf{\color{#35bf28}+16.58\%}$
test_plain_set_nested_inplace 44.2710μs 12.5857μs 79.4551 KOps/s 69.4498 KOps/s $\textbf{\color{#35bf28}+14.41\%}$
test_plain_set_stack_nested_inplace 43.8610μs 12.5094μs 79.9396 KOps/s 69.2494 KOps/s $\textbf{\color{#35bf28}+15.44\%}$
test_items 37.1300μs 2.8507μs 350.7877 KOps/s 346.2051 KOps/s $\color{#35bf28}+1.32\%$
test_items_nested 0.4137ms 0.3539ms 2.8253 KOps/s 2.7584 KOps/s $\color{#35bf28}+2.43\%$
test_items_nested_locked 0.3947ms 0.3575ms 2.7972 KOps/s 2.7601 KOps/s $\color{#35bf28}+1.34\%$
test_items_nested_leaf 79.8820μs 58.0874μs 17.2155 KOps/s 16.9634 KOps/s $\color{#35bf28}+1.49\%$
test_items_stack_nested 0.4154ms 0.3580ms 2.7929 KOps/s 2.7742 KOps/s $\color{#35bf28}+0.67\%$
test_items_stack_nested_leaf 86.4620μs 59.7525μs 16.7357 KOps/s 16.7515 KOps/s $\color{#d91a1a}-0.09\%$
test_items_stack_nested_locked 0.4082ms 0.3593ms 2.7828 KOps/s 2.7632 KOps/s $\color{#35bf28}+0.71\%$
test_keys 36.9210μs 3.4440μs 290.3596 KOps/s 289.5189 KOps/s $\color{#35bf28}+0.29\%$
test_keys_nested 0.1148ms 82.5243μs 12.1176 KOps/s 11.9565 KOps/s $\color{#35bf28}+1.35\%$
test_keys_nested_locked 0.7744ms 88.0926μs 11.3517 KOps/s 11.1141 KOps/s $\color{#35bf28}+2.14\%$
test_keys_nested_leaf 97.7920μs 72.8485μs 13.7271 KOps/s 13.5217 KOps/s $\color{#35bf28}+1.52\%$
test_keys_stack_nested 0.1165ms 84.0121μs 11.9031 KOps/s 11.7067 KOps/s $\color{#35bf28}+1.68\%$
test_keys_stack_nested_leaf 0.1112ms 75.1910μs 13.2995 KOps/s 13.2321 KOps/s $\color{#35bf28}+0.51\%$
test_keys_stack_nested_locked 0.1390ms 90.3628μs 11.0665 KOps/s 10.9628 KOps/s $\color{#35bf28}+0.95\%$
test_values 5.5868μs 0.8479μs 1.1794 MOps/s 1.1742 MOps/s $\color{#35bf28}+0.44\%$
test_values_nested 57.1220μs 34.1095μs 29.3174 KOps/s 28.8130 KOps/s $\color{#35bf28}+1.75\%$
test_values_nested_locked 58.0110μs 35.5759μs 28.1089 KOps/s 27.5937 KOps/s $\color{#35bf28}+1.87\%$
test_values_nested_leaf 64.2410μs 38.5986μs 25.9076 KOps/s 25.6502 KOps/s $\color{#35bf28}+1.00\%$
test_values_stack_nested 65.0810μs 34.7936μs 28.7409 KOps/s 28.2639 KOps/s $\color{#35bf28}+1.69\%$
test_values_stack_nested_leaf 71.8910μs 39.3061μs 25.4414 KOps/s 25.1332 KOps/s $\color{#35bf28}+1.23\%$
test_values_stack_nested_locked 66.9110μs 36.2481μs 27.5876 KOps/s 27.0235 KOps/s $\color{#35bf28}+2.09\%$
test_membership 1.8090μs 0.5066μs 1.9738 MOps/s 1.9865 MOps/s $\color{#d91a1a}-0.64\%$
test_membership_nested 22.0755μs 2.0019μs 499.5249 KOps/s 496.2401 KOps/s $\color{#35bf28}+0.66\%$
test_membership_nested_leaf 17.2205μs 2.0111μs 497.2368 KOps/s 501.4104 KOps/s $\color{#d91a1a}-0.83\%$
test_membership_stacked_nested 29.7710μs 2.1176μs 472.2414 KOps/s 473.1745 KOps/s $\color{#d91a1a}-0.20\%$
test_membership_stacked_nested_leaf 33.7810μs 2.0983μs 476.5803 KOps/s 470.5908 KOps/s $\color{#35bf28}+1.27\%$
test_membership_nested_last 27.7410μs 3.1257μs 319.9289 KOps/s 319.1881 KOps/s $\color{#35bf28}+0.23\%$
test_membership_nested_leaf_last 44.3710μs 3.0674μs 326.0037 KOps/s 315.3605 KOps/s $\color{#35bf28}+3.37\%$
test_membership_stacked_nested_last 42.9410μs 4.2987μs 232.6282 KOps/s 264.2891 KOps/s $\textbf{\color{#d91a1a}-11.98\%}$
test_membership_stacked_nested_leaf_last 38.1700μs 4.2660μs 234.4135 KOps/s 269.6416 KOps/s $\textbf{\color{#d91a1a}-13.06\%}$
test_nested_getleaf 25.0010μs 6.2616μs 159.7034 KOps/s 160.0593 KOps/s $\color{#d91a1a}-0.22\%$
test_nested_get 32.7510μs 5.8615μs 170.6048 KOps/s 170.8117 KOps/s $\color{#d91a1a}-0.12\%$
test_stacked_getleaf 28.1610μs 6.1612μs 162.3054 KOps/s 161.4050 KOps/s $\color{#35bf28}+0.56\%$
test_stacked_get 39.1310μs 5.8676μs 170.4286 KOps/s 171.2666 KOps/s $\color{#d91a1a}-0.49\%$
test_nested_getitemleaf 26.8400μs 6.2606μs 159.7280 KOps/s 157.6464 KOps/s $\color{#35bf28}+1.32\%$
test_nested_getitem 29.4810μs 5.9110μs 169.1756 KOps/s 165.8801 KOps/s $\color{#35bf28}+1.99\%$
test_stacked_getitemleaf 31.3200μs 6.2431μs 160.1773 KOps/s 159.8281 KOps/s $\color{#35bf28}+0.22\%$
test_stacked_getitem 28.6000μs 5.9203μs 168.9106 KOps/s 167.2234 KOps/s $\color{#35bf28}+1.01\%$
test_lock_nested 9.1786ms 0.3887ms 2.5724 KOps/s 2.6062 KOps/s $\color{#d91a1a}-1.29\%$
test_lock_stack_nested 0.4024ms 0.3483ms 2.8712 KOps/s 2.7934 KOps/s $\color{#35bf28}+2.79\%$
test_unlock_nested 0.7284ms 0.3209ms 3.1159 KOps/s 3.0679 KOps/s $\color{#35bf28}+1.56\%$
test_unlock_stack_nested 0.3453ms 0.2864ms 3.4912 KOps/s 3.4089 KOps/s $\color{#35bf28}+2.41\%$
test_flatten_speed 0.1321ms 75.8709μs 13.1803 KOps/s 13.1941 KOps/s $\color{#d91a1a}-0.10\%$
test_unflatten_speed 0.3707ms 0.3224ms 3.1021 KOps/s 3.0476 KOps/s $\color{#35bf28}+1.79\%$
test_common_ops 92.5933ms 0.6343ms 1.5765 KOps/s 1.5312 KOps/s $\color{#35bf28}+2.96\%$
test_creation 21.5210μs 1.7240μs 580.0603 KOps/s 574.1353 KOps/s $\color{#35bf28}+1.03\%$
test_creation_empty 35.5700μs 6.4745μs 154.4520 KOps/s 102.5737 KOps/s $\textbf{\color{#35bf28}+50.58\%}$
test_creation_nested_1 19.8300μs 8.3774μs 119.3695 KOps/s 87.1599 KOps/s $\textbf{\color{#35bf28}+36.95\%}$
test_creation_nested_2 0.1175ms 10.8855μs 91.8649 KOps/s 70.1057 KOps/s $\textbf{\color{#35bf28}+31.04\%}$
test_clone 88.6610μs 10.3506μs 96.6132 KOps/s 91.2443 KOps/s $\textbf{\color{#35bf28}+5.88\%}$
test_getitem[int] 0.9843ms 10.7888μs 92.6885 KOps/s 85.7663 KOps/s $\textbf{\color{#35bf28}+8.07\%}$
test_getitem[slice_int] 0.1028ms 20.9990μs 47.6214 KOps/s 45.0288 KOps/s $\textbf{\color{#35bf28}+5.76\%}$
test_getitem[range] 0.1229ms 36.4124μs 27.4632 KOps/s 26.7345 KOps/s $\color{#35bf28}+2.73\%$
test_getitem[tuple] 0.1029ms 18.3883μs 54.3824 KOps/s 50.5092 KOps/s $\textbf{\color{#35bf28}+7.67\%}$
test_getitem[list] 0.2052ms 32.2017μs 31.0543 KOps/s 30.1793 KOps/s $\color{#35bf28}+2.90\%$
test_setitem_dim[int] 39.7110μs 18.3593μs 54.4683 KOps/s 53.2331 KOps/s $\color{#35bf28}+2.32\%$
test_setitem_dim[slice_int] 70.7710μs 37.8523μs 26.4185 KOps/s 26.3013 KOps/s $\color{#35bf28}+0.45\%$
test_setitem_dim[range] 83.5020μs 50.4730μs 19.8126 KOps/s 19.1274 KOps/s $\color{#35bf28}+3.58\%$
test_setitem_dim[tuple] 63.3310μs 30.5977μs 32.6822 KOps/s 30.7832 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_setitem 70.1720μs 13.4603μs 74.2928 KOps/s 62.1920 KOps/s $\textbf{\color{#35bf28}+19.46\%}$
test_set 96.9020μs 13.0624μs 76.5555 KOps/s 63.7627 KOps/s $\textbf{\color{#35bf28}+20.06\%}$
test_set_shared 1.5588ms 0.1510ms 6.6208 KOps/s 6.4339 KOps/s $\color{#35bf28}+2.91\%$
test_update 0.5644ms 14.7856μs 67.6334 KOps/s 50.9841 KOps/s $\textbf{\color{#35bf28}+32.66\%}$
test_update_nested 1.1266ms 20.2293μs 49.4333 KOps/s 39.4320 KOps/s $\textbf{\color{#35bf28}+25.36\%}$
test_update__nested 72.7120μs 24.5746μs 40.6924 KOps/s 37.9223 KOps/s $\textbf{\color{#35bf28}+7.30\%}$
test_set_nested 73.7410μs 14.4379μs 69.2623 KOps/s 59.7168 KOps/s $\textbf{\color{#35bf28}+15.98\%}$
test_set_nested_new 84.0610μs 16.9422μs 59.0242 KOps/s 52.2162 KOps/s $\textbf{\color{#35bf28}+13.04\%}$
test_select 84.4020μs 28.7850μs 34.7403 KOps/s 32.0038 KOps/s $\textbf{\color{#35bf28}+8.55\%}$
test_select_nested 68.8820μs 45.0094μs 22.2176 KOps/s 22.1125 KOps/s $\color{#35bf28}+0.48\%$
test_exclude_nested 90.5120μs 62.5920μs 15.9765 KOps/s 15.7258 KOps/s $\color{#35bf28}+1.59\%$
test_empty[True] 0.3242ms 0.2915ms 3.4309 KOps/s 3.4327 KOps/s $\color{#d91a1a}-0.05\%$
test_empty[False] 8.4431μs 0.8286μs 1.2069 MOps/s 1.1827 MOps/s $\color{#35bf28}+2.04\%$
test_to 86.0020μs 56.6783μs 17.6434 KOps/s 17.4669 KOps/s $\color{#35bf28}+1.01\%$
test_to_nonblocking 80.1520μs 47.3804μs 21.1058 KOps/s 19.3275 KOps/s $\textbf{\color{#35bf28}+9.20\%}$
test_unbind_speed 1.3581ms 0.2410ms 4.1494 KOps/s 4.0301 KOps/s $\color{#35bf28}+2.96\%$
test_unbind_speed_stack0 0.2739ms 0.2428ms 4.1178 KOps/s 4.0081 KOps/s $\color{#35bf28}+2.74\%$
test_unbind_speed_stack1 92.3831ms 0.6651ms 1.5036 KOps/s 1.4641 KOps/s $\color{#35bf28}+2.70\%$
test_split 93.2546ms 1.5927ms 627.8688 Ops/s 593.2662 Ops/s $\textbf{\color{#35bf28}+5.83\%}$
test_chunk 95.5949ms 1.6186ms 617.8090 Ops/s 596.3680 Ops/s $\color{#35bf28}+3.60\%$
test_consolidate[False-None] 95.8422ms 2.9972ms 333.6476 Ops/s 318.8718 Ops/s $\color{#35bf28}+4.63\%$
test_consolidate[default-None] 1.9223ms 1.6891ms 592.0276 Ops/s 565.0876 Ops/s $\color{#35bf28}+4.77\%$
test_consolidate[reduce-overhead-None] 1.8497ms 1.7271ms 579.0140 Ops/s 548.8652 Ops/s $\textbf{\color{#35bf28}+5.49\%}$
test_consolidate_njt[False-None] 7.5035ms 6.8714ms 145.5303 Ops/s 145.9922 Ops/s $\color{#d91a1a}-0.32\%$
test_to[False-False-None] 2.1966ms 1.7315ms 577.5344 Ops/s 554.3052 Ops/s $\color{#35bf28}+4.19\%$
test_to[True-False-None] 1.6149ms 1.3701ms 729.8822 Ops/s 715.0407 Ops/s $\color{#35bf28}+2.08\%$
test_to[within-False-None] 4.4118ms 4.1696ms 239.8299 Ops/s 227.2525 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_to[True-default-None] 5.9816ms 5.7196ms 174.8374 Ops/s 181.4840 Ops/s $\color{#d91a1a}-3.66\%$
test_to_njt[False-False-None] 7.4593ms 7.0543ms 141.7576 Ops/s 141.3078 Ops/s $\color{#35bf28}+0.32\%$
test_to_njt[True-False-None] 6.5128ms 5.8143ms 171.9911 Ops/s 174.7113 Ops/s $\color{#d91a1a}-1.56\%$
test_to_njt[within-False-None] 13.3855ms 13.1587ms 75.9951 Ops/s 79.0094 Ops/s $\color{#d91a1a}-3.82\%$
test_creation[device0] 0.5525ms 85.7165μs 11.6664 KOps/s 12.4050 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_creation_from_tensor 0.5386ms 87.3424μs 11.4492 KOps/s 11.9036 KOps/s $\color{#d91a1a}-3.82\%$
test_add_one[memmap_tensor0] 0.2387ms 6.3427μs 157.6623 KOps/s 152.7964 KOps/s $\color{#35bf28}+3.18\%$
test_contiguous[memmap_tensor0] 2.1505μs 0.3982μs 2.5111 MOps/s 2.4950 MOps/s $\color{#35bf28}+0.64\%$
test_stack[memmap_tensor0] 30.1710μs 4.5478μs 219.8881 KOps/s 200.7368 KOps/s $\textbf{\color{#35bf28}+9.54\%}$
test_memmaptd_index 1.3982ms 0.2506ms 3.9912 KOps/s 3.6932 KOps/s $\textbf{\color{#35bf28}+8.07\%}$
test_memmaptd_index_astensor 0.5691ms 0.3090ms 3.2368 KOps/s 2.9564 KOps/s $\textbf{\color{#35bf28}+9.48\%}$
test_memmaptd_index_op 0.9965ms 0.5464ms 1.8300 KOps/s 1.6002 KOps/s $\textbf{\color{#35bf28}+14.36\%}$
test_serialize_model 0.1318s 0.1311s 7.6279 Ops/s 7.5810 Ops/s $\color{#35bf28}+0.62\%$
test_serialize_model_pickle 1.3462s 1.2129s 0.8245 Ops/s 0.8260 Ops/s $\color{#d91a1a}-0.18\%$
test_serialize_weights 0.4228s 0.1725s 5.7965 Ops/s 7.6057 Ops/s $\textbf{\color{#d91a1a}-23.79\%}$
test_serialize_weights_returnearly 0.3270s 56.0382ms 17.8450 Ops/s 22.9116 Ops/s $\textbf{\color{#d91a1a}-22.11\%}$
test_serialize_weights_pickle 1.3789s 1.2266s 0.8152 Ops/s 0.8401 Ops/s $\color{#d91a1a}-2.95\%$
test_reshape_pytree 68.4510μs 23.3998μs 42.7353 KOps/s 43.4002 KOps/s $\color{#d91a1a}-1.53\%$
test_reshape_td 78.3120μs 31.4405μs 31.8061 KOps/s 35.4129 KOps/s $\textbf{\color{#d91a1a}-10.18\%}$
test_view_pytree 67.3320μs 23.8026μs 42.0122 KOps/s 44.2706 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_view_td 76.7310μs 34.8799μs 28.6698 KOps/s 31.3985 KOps/s $\textbf{\color{#d91a1a}-8.69\%}$
test_unbind_pytree 58.1610μs 29.7103μs 33.6584 KOps/s 34.4897 KOps/s $\color{#d91a1a}-2.41\%$
test_unbind_td 0.7799ms 38.9448μs 25.6774 KOps/s 26.4355 KOps/s $\color{#d91a1a}-2.87\%$
test_split_pytree 62.7410μs 32.8287μs 30.4612 KOps/s 31.5781 KOps/s $\color{#d91a1a}-3.54\%$
test_split_td 0.9033ms 42.0729μs 23.7683 KOps/s 23.6284 KOps/s $\color{#35bf28}+0.59\%$
test_add_pytree 73.6510μs 34.5845μs 28.9147 KOps/s 29.1753 KOps/s $\color{#d91a1a}-0.89\%$
test_add_td 85.5210μs 45.0435μs 22.2008 KOps/s 19.7494 KOps/s $\textbf{\color{#35bf28}+12.41\%}$
test_compile_add_one_nested[tensordict-compile] 0.1749ms 0.1215ms 8.2326 KOps/s 7.4296 KOps/s $\textbf{\color{#35bf28}+10.81\%}$
test_compile_add_one_nested[tensordict-eager] 0.2252ms 0.1312ms 7.6191 KOps/s 7.1241 KOps/s $\textbf{\color{#35bf28}+6.95\%}$
test_compile_add_one_nested[pytree-compile] 0.1423ms 97.4993μs 10.2565 KOps/s 9.6458 KOps/s $\textbf{\color{#35bf28}+6.33\%}$
test_compile_add_one_nested[pytree-eager] 1.4158ms 0.1493ms 6.7000 KOps/s 6.1575 KOps/s $\textbf{\color{#35bf28}+8.81\%}$
test_compile_copy_nested[tensordict-compile] 52.5010μs 23.6029μs 42.3677 KOps/s 44.5295 KOps/s $\color{#d91a1a}-4.85\%$
test_compile_copy_nested[tensordict-eager] 60.6610μs 29.2167μs 34.2270 KOps/s 34.0333 KOps/s $\color{#35bf28}+0.57\%$
test_compile_copy_nested[pytree-compile] 0.3153ms 64.5449μs 15.4931 KOps/s 15.3326 KOps/s $\color{#35bf28}+1.05\%$
test_compile_copy_nested[pytree-eager] 0.1192ms 48.5559μs 20.5948 KOps/s 20.0716 KOps/s $\color{#35bf28}+2.61\%$
test_compile_add_one_flat[tensordict-compile] 0.2150ms 0.1468ms 6.8142 KOps/s 6.8384 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_add_one_flat[tensordict-eager] 0.3096ms 0.2146ms 4.6600 KOps/s 4.6512 KOps/s $\color{#35bf28}+0.19\%$
test_compile_add_one_flat[tensorclass-compile] 0.1614ms 98.3378μs 10.1690 KOps/s 10.0243 KOps/s $\color{#35bf28}+1.44\%$
test_compile_add_one_flat[tensorclass-eager] 0.1095ms 52.1866μs 19.1620 KOps/s 18.8257 KOps/s $\color{#35bf28}+1.79\%$
test_compile_add_one_flat[pytree-compile] 0.2019ms 0.1365ms 7.3253 KOps/s 7.1256 KOps/s $\color{#35bf28}+2.80\%$
test_compile_add_one_flat[pytree-eager] 0.5292ms 0.4750ms 2.1052 KOps/s 2.0027 KOps/s $\textbf{\color{#35bf28}+5.12\%}$
test_compile_add_self_flat[tensordict-eager] 0.3649ms 0.2588ms 3.8635 KOps/s 3.8750 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_self_flat[tensordict-compile] 0.1843ms 0.1438ms 6.9520 KOps/s 6.8862 KOps/s $\color{#35bf28}+0.96\%$
test_compile_add_self_flat[tensorclass-eager] 0.1451ms 64.1698μs 15.5836 KOps/s 15.3764 KOps/s $\color{#35bf28}+1.35\%$
test_compile_add_self_flat[tensorclass-compile] 0.1376ms 99.2934μs 10.0712 KOps/s 10.0285 KOps/s $\color{#35bf28}+0.43\%$
test_compile_add_self_flat[pytree-eager] 0.4551ms 0.4072ms 2.4555 KOps/s 2.3480 KOps/s $\color{#35bf28}+4.58\%$
test_compile_add_self_flat[pytree-compile] 0.1861ms 0.1394ms 7.1743 KOps/s 7.2401 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_copy_flat[tensordict-compile] 57.0210μs 19.2837μs 51.8572 KOps/s 55.0735 KOps/s $\textbf{\color{#d91a1a}-5.84\%}$
test_compile_copy_flat[tensordict-eager] 63.3810μs 32.0126μs 31.2377 KOps/s 31.8030 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_copy_flat[pytree-compile] 0.1449ms 71.0364μs 14.0773 KOps/s 13.9710 KOps/s $\color{#35bf28}+0.76\%$
test_compile_copy_flat[pytree-eager] 0.1134ms 52.5160μs 19.0418 KOps/s 18.8360 KOps/s $\color{#35bf28}+1.09\%$
test_compile_assign_and_add[tensordict-compile] 1.6182ms 0.3894ms 2.5681 KOps/s 2.1823 KOps/s $\textbf{\color{#35bf28}+17.68\%}$
test_compile_assign_and_add[tensordict-eager] 2.6378ms 2.5687ms 389.3041 Ops/s 350.2480 Ops/s $\textbf{\color{#35bf28}+11.15\%}$
test_compile_assign_and_add[pytree-compile] 1.5921ms 0.3808ms 2.6262 KOps/s 2.2414 KOps/s $\textbf{\color{#35bf28}+17.17\%}$
test_compile_assign_and_add[pytree-eager] 2.8579ms 2.6273ms 380.6175 Ops/s 366.1222 Ops/s $\color{#35bf28}+3.96\%$
test_compile_indexing[tensor-tensordict-compile] 0.1652ms 0.1128ms 8.8655 KOps/s 8.3925 KOps/s $\textbf{\color{#35bf28}+5.64\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5588ms 82.6665μs 12.0968 KOps/s 12.3067 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1669ms 0.1143ms 8.7500 KOps/s 9.3946 KOps/s $\textbf{\color{#d91a1a}-6.86\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.1182ms 71.9232μs 13.9037 KOps/s 14.3003 KOps/s $\color{#d91a1a}-2.77\%$
test_compile_indexing[tensor-pytree-compile] 0.1675ms 0.1143ms 8.7515 KOps/s 8.9058 KOps/s $\color{#d91a1a}-1.73\%$
test_compile_indexing[tensor-pytree-eager] 0.1553ms 68.3241μs 14.6361 KOps/s 14.5792 KOps/s $\color{#35bf28}+0.39\%$
test_compile_indexing[slice-tensordict-compile] 0.1575ms 0.1016ms 9.8460 KOps/s 9.7309 KOps/s $\color{#35bf28}+1.18\%$
test_compile_indexing[slice-tensordict-eager] 0.1390ms 17.6077μs 56.7932 KOps/s 49.3673 KOps/s $\textbf{\color{#35bf28}+15.04\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1440ms 97.2085μs 10.2872 KOps/s 10.1311 KOps/s $\color{#35bf28}+1.54\%$
test_compile_indexing[slice-tensorclass-eager] 58.3810μs 15.8815μs 62.9665 KOps/s 60.6739 KOps/s $\color{#35bf28}+3.78\%$
test_compile_indexing[slice-pytree-compile] 0.1541ms 0.1008ms 9.9246 KOps/s 10.0258 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[slice-pytree-eager] 49.1410μs 16.0135μs 62.4474 KOps/s 59.3583 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_compile_indexing[int-tensordict-compile] 0.1650ms 0.1088ms 9.1880 KOps/s 9.6119 KOps/s $\color{#d91a1a}-4.41\%$
test_compile_indexing[int-tensordict-eager] 0.5992ms 18.5970μs 53.7720 KOps/s 53.9282 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[int-tensorclass-compile] 0.1546ms 0.1063ms 9.4116 KOps/s 9.9945 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_compile_indexing[int-tensorclass-eager] 38.9810μs 17.1412μs 58.3390 KOps/s 60.0756 KOps/s $\color{#d91a1a}-2.89\%$
test_compile_indexing[int-pytree-compile] 0.1512ms 0.1042ms 9.6005 KOps/s 9.9726 KOps/s $\color{#d91a1a}-3.73\%$
test_compile_indexing[int-pytree-eager] 57.3010μs 17.5552μs 56.9631 KOps/s 60.1543 KOps/s $\textbf{\color{#d91a1a}-5.31\%}$
test_mod_add[eager] 84.6320μs 36.5466μs 27.3624 KOps/s 25.8516 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_mod_add[compile] 0.1157ms 81.3666μs 12.2901 KOps/s 12.0922 KOps/s $\color{#35bf28}+1.64\%$
test_mod_add[compile-overhead] 0.3328ms 0.1690ms 5.9174 KOps/s 5.5289 KOps/s $\textbf{\color{#35bf28}+7.03\%}$
test_mod_wrap[eager] 0.3335ms 0.2422ms 4.1289 KOps/s 3.9947 KOps/s $\color{#35bf28}+3.36\%$
test_mod_wrap[compile] 0.3519ms 0.2832ms 3.5308 KOps/s 3.4635 KOps/s $\color{#35bf28}+1.94\%$
test_mod_wrap[compile-overhead] 7.0198ms 3.7440ms 267.0957 Ops/s 271.4366 Ops/s $\color{#d91a1a}-1.60\%$
test_mod_wrap_and_backward[eager] 1.4322ms 1.3278ms 753.1200 Ops/s 693.8119 Ops/s $\textbf{\color{#35bf28}+8.55\%}$
test_mod_wrap_and_backward[compile] 1.6717ms 1.2611ms 792.9655 Ops/s 714.8099 Ops/s $\textbf{\color{#35bf28}+10.93\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3588ms 0.9300ms 1.0753 KOps/s 960.8276 Ops/s $\textbf{\color{#35bf28}+11.91\%}$
test_seq_add[eager] 0.1804ms 0.1174ms 8.5182 KOps/s 8.3718 KOps/s $\color{#35bf28}+1.75\%$
test_seq_add[compile] 0.2229ms 92.6358μs 10.7950 KOps/s 11.0663 KOps/s $\color{#d91a1a}-2.45\%$
test_seq_add[compile-overhead] 0.1718ms 0.1306ms 7.6588 KOps/s 7.4920 KOps/s $\color{#35bf28}+2.23\%$
test_seq_wrap[eager] 0.5385ms 0.4310ms 2.3199 KOps/s 2.3250 KOps/s $\color{#d91a1a}-0.22\%$
test_seq_wrap[compile] 0.3975ms 0.2990ms 3.3448 KOps/s 3.1797 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_seq_wrap[compile-overhead] 0.2922ms 0.2255ms 4.4339 KOps/s 4.3925 KOps/s $\color{#35bf28}+0.94\%$
test_func_call_runtime[False-eager] 0.7593ms 0.7145ms 1.3996 KOps/s 1.3106 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_func_call_runtime[False-compile] 0.9726ms 0.7413ms 1.3489 KOps/s 1.3161 KOps/s $\color{#35bf28}+2.49\%$
test_func_call_runtime[False-compile-overhead] 0.4660ms 0.3674ms 2.7220 KOps/s 2.7340 KOps/s $\color{#d91a1a}-0.44\%$
test_func_call_runtime[True-eager] 0.9273ms 0.8734ms 1.1450 KOps/s 1.1084 KOps/s $\color{#35bf28}+3.30\%$
test_func_call_runtime[True-compile] 0.9921ms 0.7647ms 1.3077 KOps/s 1.2806 KOps/s $\color{#35bf28}+2.11\%$
test_func_call_runtime[True-compile-overhead] 0.5036ms 0.3872ms 2.5827 KOps/s 2.6006 KOps/s $\color{#d91a1a}-0.69\%$
test_func_call_cm_runtime[False-eager] 0.8479ms 0.7698ms 1.2990 KOps/s 1.3873 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_func_call_cm_runtime[False-compile] 0.8064ms 0.7449ms 1.3424 KOps/s 1.3113 KOps/s $\color{#35bf28}+2.37\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4137ms 0.3658ms 2.7339 KOps/s 2.7357 KOps/s $\color{#d91a1a}-0.07\%$
test_func_call_cm_runtime[True-eager] 1.2142ms 1.0326ms 968.4758 Ops/s 997.3958 Ops/s $\color{#d91a1a}-2.90\%$
test_func_call_cm_runtime[True-compile] 1.0435ms 0.7882ms 1.2688 KOps/s 1.2313 KOps/s $\color{#35bf28}+3.04\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4866ms 0.4097ms 2.4409 KOps/s 2.4213 KOps/s $\color{#35bf28}+0.81\%$
test_vmap_func_call_cm_runtime[eager] 2.5390ms 2.0431ms 489.4572 Ops/s 489.5119 Ops/s $\color{#d91a1a}-0.01\%$
test_vmap_func_call_cm_runtime[compile] 0.8766ms 0.7993ms 1.2511 KOps/s 1.2070 KOps/s $\color{#35bf28}+3.65\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4879ms 0.4134ms 2.4189 KOps/s 2.4067 KOps/s $\color{#35bf28}+0.51\%$
test_distributed 4.2159ms 0.1779ms 5.6218 KOps/s 8.3170 KOps/s $\textbf{\color{#d91a1a}-32.41\%}$
test_tdmodule 48.1110μs 18.6723μs 53.5552 KOps/s 47.5289 KOps/s $\textbf{\color{#35bf28}+12.68\%}$
test_tdmodule_dispatch 66.2310μs 33.6452μs 29.7219 KOps/s 26.0775 KOps/s $\textbf{\color{#35bf28}+13.98\%}$
test_tdseq 48.9110μs 19.8365μs 50.4122 KOps/s 44.3841 KOps/s $\textbf{\color{#35bf28}+13.58\%}$
test_tdseq_dispatch 66.7920μs 36.8634μs 27.1272 KOps/s 23.9029 KOps/s $\textbf{\color{#35bf28}+13.49\%}$
test_instantiation_functorch 1.6662ms 1.5607ms 640.7190 Ops/s 626.1298 Ops/s $\color{#35bf28}+2.33\%$
test_exec_functorch 0.2026ms 0.1433ms 6.9784 KOps/s 7.0126 KOps/s $\color{#d91a1a}-0.49\%$
test_exec_functional_call 0.1873ms 0.1350ms 7.4065 KOps/s 7.4140 KOps/s $\color{#d91a1a}-0.10\%$
test_exec_td_decorator 0.3791ms 0.1842ms 5.4282 KOps/s 5.3907 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_mlp_speed_decorator[True-True] 0.7819ms 0.6724ms 1.4872 KOps/s 1.4184 KOps/s $\color{#35bf28}+4.85\%$
test_vmap_mlp_speed_decorator[True-False] 0.8033ms 0.6761ms 1.4792 KOps/s 1.4087 KOps/s $\textbf{\color{#35bf28}+5.00\%}$
test_vmap_mlp_speed_decorator[False-True] 0.7103ms 0.5999ms 1.6669 KOps/s 1.6449 KOps/s $\color{#35bf28}+1.34\%$
test_vmap_mlp_speed_decorator[False-False] 0.7376ms 0.6014ms 1.6627 KOps/s 1.6289 KOps/s $\color{#35bf28}+2.07\%$
test_vmap_transformer_speed_decorator[True-True] 20.1489ms 19.2757ms 51.8788 Ops/s 52.8742 Ops/s $\color{#d91a1a}-1.88\%$
test_vmap_transformer_speed_decorator[True-False] 20.0146ms 19.2249ms 52.0158 Ops/s 52.8257 Ops/s $\color{#d91a1a}-1.53\%$
test_vmap_transformer_speed_decorator[False-True] 19.8753ms 18.8729ms 52.9860 Ops/s 53.5567 Ops/s $\color{#d91a1a}-1.07\%$
test_vmap_transformer_speed_decorator[False-False] 19.5490ms 18.7192ms 53.4212 Ops/s 53.2767 Ops/s $\color{#35bf28}+0.27\%$
test_to_module_speed[True] 1.0713ms 0.9781ms 1.0223 KOps/s 989.3886 Ops/s $\color{#35bf28}+3.33\%$
test_to_module_speed[False] 1.1040ms 0.9583ms 1.0435 KOps/s 1.0224 KOps/s $\color{#35bf28}+2.06\%$
test_tc_init 54.6310μs 34.1426μs 29.2889 KOps/s 24.8283 KOps/s $\textbf{\color{#35bf28}+17.97\%}$
test_tc_init_nested 0.1028ms 69.7310μs 14.3408 KOps/s 11.9598 KOps/s $\textbf{\color{#35bf28}+19.91\%}$
test_tc_first_layer_tensor 32.3100μs 0.8328μs 1.2008 MOps/s 1.1817 MOps/s $\color{#35bf28}+1.62\%$
test_tc_first_layer_nontensor 23.6700μs 2.2541μs 443.6354 KOps/s 437.5243 KOps/s $\color{#35bf28}+1.40\%$
test_tc_second_layer_tensor 22.5855μs 1.4045μs 711.9792 KOps/s 644.6392 KOps/s $\textbf{\color{#35bf28}+10.45\%}$
test_tc_second_layer_nontensor 38.8210μs 3.0115μs 332.0595 KOps/s 331.1266 KOps/s $\color{#35bf28}+0.28\%$
test_unbind 0.2195s 11.6751ms 85.6522 Ops/s 139.7307 Ops/s $\textbf{\color{#d91a1a}-38.70\%}$
test_full_like 9.4380ms 9.1173ms 109.6818 Ops/s 108.7795 Ops/s $\color{#35bf28}+0.83\%$
test_zeros_like 5.3904ms 4.3279ms 231.0610 Ops/s 234.0943 Ops/s $\color{#d91a1a}-1.30\%$
test_ones_like 4.9204ms 4.2151ms 237.2421 Ops/s 236.4597 Ops/s $\color{#35bf28}+0.33\%$
test_clone 6.5047ms 6.3529ms 157.4075 Ops/s 109.5662 Ops/s $\textbf{\color{#35bf28}+43.66\%}$
test_squeeze 59.2510μs 10.0455μs 99.5475 KOps/s 97.5498 KOps/s $\color{#35bf28}+2.05\%$
test_unsqueeze 0.1236ms 71.7382μs 13.9396 KOps/s 12.8491 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_split 0.3952ms 0.1776ms 5.6293 KOps/s 5.7615 KOps/s $\color{#d91a1a}-2.29\%$
test_permute 0.2421ms 0.1851ms 5.4023 KOps/s 5.3762 KOps/s $\color{#35bf28}+0.49\%$
test_stack 50.6066ms 50.3276ms 19.8698 Ops/s 19.7511 Ops/s $\color{#35bf28}+0.60\%$
test_cat 50.5404ms 50.3180ms 19.8736 Ops/s 19.7697 Ops/s $\color{#35bf28}+0.53\%$

Labels: bug, CLA Signed
Development

Successfully merging this pull request may close these issues.

[BUG] TensorDict.to moves all tensors to cuda:0 if index not specified
2 participants