[BugFix] Consistent behavior for pad_sequence with one and many non-tensors #1172

Merged: vmoens merged 1 commit into gh/vmoens/45/base from gh/vmoens/45/head on Jan 9, 2025

Conversation

Contributor

@vmoens vmoens commented Jan 9, 2025

[ghstack-poisoned]
@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 9, 2025
vmoens added a commit that referenced this pull request Jan 9, 2025
…ensors

ghstack-source-id: c74edd95ed9846c14ffe26cb176d93c6e5e0dfbf
Pull Request resolved: #1172
@vmoens vmoens added the bug label Jan 9, 2025
@vmoens vmoens merged commit 5ac5688 into gh/vmoens/45/base Jan 9, 2025
1 check passed
@vmoens vmoens deleted the gh/vmoens/45/head branch January 9, 2025 13:23

github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}10$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 35.8970μs 19.2152μs 52.0420 KOps/s 51.0316 KOps/s $\color{#35bf28}+1.98\%$
test_plain_set_stack_nested 64.8710μs 19.3515μs 51.6755 KOps/s 49.7522 KOps/s $\color{#35bf28}+3.87\%$
test_plain_set_nested_inplace 53.4190μs 20.8530μs 47.9547 KOps/s 45.8312 KOps/s $\color{#35bf28}+4.63\%$
test_plain_set_stack_nested_inplace 77.3740μs 20.8877μs 47.8751 KOps/s 45.5990 KOps/s $\color{#35bf28}+4.99\%$
test_items 43.8320μs 4.1800μs 239.2365 KOps/s 235.8297 KOps/s $\color{#35bf28}+1.44\%$
test_items_nested 0.7216ms 0.3950ms 2.5316 KOps/s 2.5225 KOps/s $\color{#35bf28}+0.36\%$
test_items_nested_locked 0.5549ms 0.3943ms 2.5361 KOps/s 2.5093 KOps/s $\color{#35bf28}+1.06\%$
test_items_nested_leaf 0.1519ms 77.8615μs 12.8433 KOps/s 12.8560 KOps/s $\color{#d91a1a}-0.10\%$
test_items_stack_nested 0.7341ms 0.3912ms 2.5565 KOps/s 2.4845 KOps/s $\color{#35bf28}+2.90\%$
test_items_stack_nested_leaf 0.1347ms 78.9730μs 12.6626 KOps/s 12.5620 KOps/s $\color{#35bf28}+0.80\%$
test_items_stack_nested_locked 0.8378ms 0.3975ms 2.5157 KOps/s 2.4865 KOps/s $\color{#35bf28}+1.18\%$
test_keys 27.9520μs 3.4945μs 286.1677 KOps/s 293.0273 KOps/s $\color{#d91a1a}-2.34\%$
test_keys_nested 0.2390ms 0.1627ms 6.1458 KOps/s 6.0476 KOps/s $\color{#35bf28}+1.62\%$
test_keys_nested_locked 0.7261ms 0.1702ms 5.8754 KOps/s 5.8229 KOps/s $\color{#35bf28}+0.90\%$
test_keys_nested_leaf 0.2139ms 0.1426ms 7.0127 KOps/s 6.8959 KOps/s $\color{#35bf28}+1.69\%$
test_keys_stack_nested 0.2944ms 0.1650ms 6.0599 KOps/s 6.1360 KOps/s $\color{#d91a1a}-1.24\%$
test_keys_stack_nested_leaf 0.2427ms 0.1418ms 7.0503 KOps/s 6.9804 KOps/s $\color{#35bf28}+1.00\%$
test_keys_stack_nested_locked 0.2788ms 0.1689ms 5.9215 KOps/s 5.8579 KOps/s $\color{#35bf28}+1.09\%$
test_values 9.8244μs 1.0428μs 958.9325 KOps/s 956.4103 KOps/s $\color{#35bf28}+0.26\%$
test_values_nested 0.1150ms 59.8626μs 16.7049 KOps/s 16.2464 KOps/s $\color{#35bf28}+2.82\%$
test_values_nested_locked 0.1247ms 60.3220μs 16.5777 KOps/s 16.4210 KOps/s $\color{#35bf28}+0.95\%$
test_values_nested_leaf 0.1289ms 69.7454μs 14.3379 KOps/s 14.1896 KOps/s $\color{#35bf28}+1.05\%$
test_values_stack_nested 0.1326ms 60.6031μs 16.5008 KOps/s 15.2633 KOps/s $\textbf{\color{#35bf28}+8.11\%}$
test_values_stack_nested_leaf 0.1353ms 71.1770μs 14.0495 KOps/s 14.1245 KOps/s $\color{#d91a1a}-0.53\%$
test_values_stack_nested_locked 0.1168ms 60.6224μs 16.4956 KOps/s 15.9437 KOps/s $\color{#35bf28}+3.46\%$
test_membership 28.0320μs 0.8647μs 1.1565 MOps/s 1.1703 MOps/s $\color{#d91a1a}-1.18\%$
test_membership_nested 36.4990μs 2.9085μs 343.8206 KOps/s 347.4602 KOps/s $\color{#d91a1a}-1.05\%$
test_membership_nested_leaf 32.9820μs 2.9536μs 338.5669 KOps/s 336.3960 KOps/s $\color{#35bf28}+0.65\%$
test_membership_stacked_nested 31.2480μs 2.8849μs 346.6379 KOps/s 345.1599 KOps/s $\color{#35bf28}+0.43\%$
test_membership_stacked_nested_leaf 15.8500μs 2.8896μs 346.0724 KOps/s 346.8577 KOps/s $\color{#d91a1a}-0.23\%$
test_membership_nested_last 45.8430μs 4.2980μs 232.6677 KOps/s 231.4263 KOps/s $\color{#35bf28}+0.54\%$
test_membership_nested_leaf_last 30.3170μs 4.3496μs 229.9054 KOps/s 225.9162 KOps/s $\color{#35bf28}+1.77\%$
test_membership_stacked_nested_last 53.9210μs 4.3074μs 232.1588 KOps/s 230.5638 KOps/s $\color{#35bf28}+0.69\%$
test_membership_stacked_nested_leaf_last 30.2070μs 4.3201μs 231.4773 KOps/s 226.7878 KOps/s $\color{#35bf28}+2.07\%$
test_nested_getleaf 57.2170μs 10.6578μs 93.8279 KOps/s 94.8301 KOps/s $\color{#d91a1a}-1.06\%$
test_nested_get 60.0020μs 9.9762μs 100.2387 KOps/s 99.2791 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_getleaf 43.2110μs 10.3342μs 96.7665 KOps/s 95.7179 KOps/s $\color{#35bf28}+1.10\%$
test_stacked_get 63.2280μs 9.9214μs 100.7922 KOps/s 99.5880 KOps/s $\color{#35bf28}+1.21\%$
test_nested_getitemleaf 41.8470μs 10.9617μs 91.2263 KOps/s 90.6333 KOps/s $\color{#35bf28}+0.65\%$
test_nested_getitem 69.3500μs 10.4185μs 95.9831 KOps/s 97.9867 KOps/s $\color{#d91a1a}-2.04\%$
test_stacked_getitemleaf 67.1150μs 10.7862μs 92.7112 KOps/s 90.7959 KOps/s $\color{#35bf28}+2.11\%$
test_stacked_getitem 40.4860μs 10.0585μs 99.4179 KOps/s 97.6923 KOps/s $\color{#35bf28}+1.77\%$
test_lock_nested 0.9609ms 0.4619ms 2.1650 KOps/s 2.1871 KOps/s $\color{#d91a1a}-1.01\%$
test_lock_stack_nested 0.8620ms 0.4309ms 2.3205 KOps/s 2.3661 KOps/s $\color{#d91a1a}-1.93\%$
test_unlock_nested 0.8442ms 0.3809ms 2.6254 KOps/s 2.6043 KOps/s $\color{#35bf28}+0.81\%$
test_unlock_stack_nested 0.5232ms 0.3485ms 2.8697 KOps/s 2.9081 KOps/s $\color{#d91a1a}-1.32\%$
test_flatten_speed 0.2173ms 0.1025ms 9.7532 KOps/s 10.0565 KOps/s $\color{#d91a1a}-3.02\%$
test_unflatten_speed 0.9023ms 0.5244ms 1.9068 KOps/s 1.9174 KOps/s $\color{#d91a1a}-0.55\%$
test_common_ops 1.9355ms 0.7648ms 1.3076 KOps/s 1.3202 KOps/s $\color{#d91a1a}-0.95\%$
test_creation 46.1960μs 2.4559μs 407.1884 KOps/s 394.7051 KOps/s $\color{#35bf28}+3.16\%$
test_creation_empty 48.1390μs 9.6108μs 104.0498 KOps/s 98.3702 KOps/s $\textbf{\color{#35bf28}+5.77\%}$
test_creation_nested_1 92.9210μs 12.6055μs 79.3302 KOps/s 77.7201 KOps/s $\color{#35bf28}+2.07\%$
test_creation_nested_2 88.1160μs 16.8355μs 59.3984 KOps/s 57.1656 KOps/s $\color{#35bf28}+3.91\%$
test_clone 0.1567ms 14.0855μs 70.9950 KOps/s 72.9831 KOps/s $\color{#d91a1a}-2.72\%$
test_getitem[int] 1.3004ms 12.8202μs 78.0017 KOps/s 76.5611 KOps/s $\color{#35bf28}+1.88\%$
test_getitem[slice_int] 0.1605ms 25.5150μs 39.1926 KOps/s 39.8180 KOps/s $\color{#d91a1a}-1.57\%$
test_getitem[range] 0.3550ms 50.0604μs 19.9759 KOps/s 20.7023 KOps/s $\color{#d91a1a}-3.51\%$
test_getitem[tuple] 0.1312ms 20.4108μs 48.9937 KOps/s 47.8576 KOps/s $\color{#35bf28}+2.37\%$
test_getitem[list] 0.3443ms 44.5461μs 22.4486 KOps/s 22.4045 KOps/s $\color{#35bf28}+0.20\%$
test_setitem_dim[int] 53.9310μs 24.4155μs 40.9575 KOps/s 39.7683 KOps/s $\color{#35bf28}+2.99\%$
test_setitem_dim[slice_int] 86.4610μs 51.6147μs 19.3743 KOps/s 19.3600 KOps/s $\color{#35bf28}+0.07\%$
test_setitem_dim[range] 0.1219ms 73.2414μs 13.6535 KOps/s 13.6013 KOps/s $\color{#35bf28}+0.38\%$
test_setitem_dim[tuple] 63.2680μs 39.9672μs 25.0205 KOps/s 25.1850 KOps/s $\color{#d91a1a}-0.65\%$
test_setitem 0.1643ms 20.0143μs 49.9642 KOps/s 52.0218 KOps/s $\color{#d91a1a}-3.96\%$
test_set 0.1628ms 19.0000μs 52.6315 KOps/s 52.6995 KOps/s $\color{#d91a1a}-0.13\%$
test_set_shared 3.6222ms 0.1759ms 5.6838 KOps/s 5.7795 KOps/s $\color{#d91a1a}-1.66\%$
test_update 0.1678ms 20.7715μs 48.1428 KOps/s 46.5827 KOps/s $\color{#35bf28}+3.35\%$
test_update_nested 0.2129ms 31.2013μs 32.0499 KOps/s 31.4595 KOps/s $\color{#35bf28}+1.88\%$
test_update__nested 0.8338ms 34.6063μs 28.8965 KOps/s 29.5682 KOps/s $\color{#d91a1a}-2.27\%$
test_set_nested 0.1965ms 21.3447μs 46.8500 KOps/s 46.9256 KOps/s $\color{#d91a1a}-0.16\%$
test_set_nested_new 0.1684ms 26.1387μs 38.2574 KOps/s 38.8022 KOps/s $\color{#d91a1a}-1.40\%$
test_select 0.2037ms 42.9722μs 23.2708 KOps/s 23.5449 KOps/s $\color{#d91a1a}-1.16\%$
test_select_nested 0.1209ms 62.6549μs 15.9604 KOps/s 15.8774 KOps/s $\color{#35bf28}+0.52\%$
test_exclude_nested 0.1592ms 82.1182μs 12.1776 KOps/s 12.2898 KOps/s $\color{#d91a1a}-0.91\%$
test_empty[True] 0.7473ms 0.4081ms 2.4501 KOps/s 2.4565 KOps/s $\color{#d91a1a}-0.26\%$
test_empty[False] 14.2140μs 1.3798μs 724.7304 KOps/s 724.5518 KOps/s $\color{#35bf28}+0.02\%$
test_unbind_speed 0.4113ms 0.2702ms 3.7009 KOps/s 3.6583 KOps/s $\color{#35bf28}+1.16\%$
test_unbind_speed_stack0 0.5283ms 0.2685ms 3.7243 KOps/s 3.7357 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_speed_stack1 0.1130s 0.8068ms 1.2394 KOps/s 1.3607 KOps/s $\textbf{\color{#d91a1a}-8.91\%}$
test_split 0.1180s 1.8169ms 550.3866 Ops/s 549.9266 Ops/s $\color{#35bf28}+0.08\%$
test_chunk 1.8613ms 1.6450ms 607.9085 Ops/s 550.8887 Ops/s $\textbf{\color{#35bf28}+10.35\%}$
test_consolidate_njt[False-None] 0.1208s 9.1832ms 108.8940 Ops/s 118.8871 Ops/s $\textbf{\color{#d91a1a}-8.41\%}$
test_creation[device0] 0.3006ms 92.3630μs 10.8268 KOps/s 10.7478 KOps/s $\color{#35bf28}+0.74\%$
test_creation_from_tensor 4.9899ms 95.5941μs 10.4609 KOps/s 10.3381 KOps/s $\color{#35bf28}+1.19\%$
test_add_one[memmap_tensor0] 0.4806ms 5.1232μs 195.1910 KOps/s 202.7895 KOps/s $\color{#d91a1a}-3.75\%$
test_contiguous[memmap_tensor0] 28.1120μs 0.5138μs 1.9462 MOps/s 1.9694 MOps/s $\color{#d91a1a}-1.18\%$
test_stack[memmap_tensor0] 50.6340μs 3.5635μs 280.6247 KOps/s 284.3704 KOps/s $\color{#d91a1a}-1.32\%$
test_memmaptd_index 1.1003ms 0.2383ms 4.1959 KOps/s 4.0600 KOps/s $\color{#35bf28}+3.35\%$
test_memmaptd_index_astensor 0.6010ms 0.3256ms 3.0716 KOps/s 3.0171 KOps/s $\color{#35bf28}+1.81\%$
test_memmaptd_index_op 1.0524ms 0.5686ms 1.7586 KOps/s 1.7470 KOps/s $\color{#35bf28}+0.67\%$
test_serialize_model 0.1276s 0.1187s 8.4220 Ops/s 8.4017 Ops/s $\color{#35bf28}+0.24\%$
test_serialize_model_pickle 0.4506s 0.3940s 2.5383 Ops/s 2.4769 Ops/s $\color{#35bf28}+2.48\%$
test_serialize_weights 0.1228s 0.1178s 8.4871 Ops/s 7.5336 Ops/s $\textbf{\color{#35bf28}+12.66\%}$
test_serialize_weights_returnearly 0.1737s 0.1606s 6.2278 Ops/s 6.3766 Ops/s $\color{#d91a1a}-2.33\%$
test_serialize_weights_pickle 0.4658s 0.4101s 2.4385 Ops/s 2.5475 Ops/s $\color{#d91a1a}-4.28\%$
test_serialize_weights_filesystem 0.1567s 0.1459s 6.8528 Ops/s 6.8791 Ops/s $\color{#d91a1a}-0.38\%$
test_serialize_model_filesystem 0.1610s 0.1483s 6.7427 Ops/s 5.7716 Ops/s $\textbf{\color{#35bf28}+16.83\%}$
test_reshape_pytree 82.9550μs 27.1189μs 36.8746 KOps/s 35.5414 KOps/s $\color{#35bf28}+3.75\%$
test_reshape_td 0.1143ms 33.7350μs 29.6428 KOps/s 29.6952 KOps/s $\color{#d91a1a}-0.18\%$
test_view_pytree 68.8480μs 26.3047μs 38.0161 KOps/s 37.3338 KOps/s $\color{#35bf28}+1.83\%$
test_view_td 97.1810μs 37.6882μs 26.5335 KOps/s 26.2484 KOps/s $\color{#35bf28}+1.09\%$
test_unbind_pytree 0.1337ms 29.3525μs 34.0686 KOps/s 33.5973 KOps/s $\color{#35bf28}+1.40\%$
test_unbind_td 0.2998ms 39.5269μs 25.2992 KOps/s 24.0872 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_split_pytree 76.5730μs 29.9836μs 33.3516 KOps/s 32.6023 KOps/s $\color{#35bf28}+2.30\%$
test_split_td 0.2096ms 46.0725μs 21.7049 KOps/s 21.7644 KOps/s $\color{#d91a1a}-0.27\%$
test_add_pytree 91.7410μs 35.0414μs 28.5377 KOps/s 28.4649 KOps/s $\color{#35bf28}+0.26\%$
test_add_td 0.1197ms 51.9781μs 19.2389 KOps/s 18.1518 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_compile_add_one_nested[tensordict-compile] 0.1427ms 63.9311μs 15.6418 KOps/s 15.4302 KOps/s $\color{#35bf28}+1.37\%$
test_compile_add_one_nested[tensordict-eager] 0.3951ms 0.1725ms 5.7973 KOps/s 5.8903 KOps/s $\color{#d91a1a}-1.58\%$
test_compile_add_one_nested[pytree-compile] 0.1163ms 46.7561μs 21.3876 KOps/s 21.4626 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_add_one_nested[pytree-eager] 0.2781ms 0.1211ms 8.2590 KOps/s 8.5132 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_copy_nested[tensordict-compile] 69.3600μs 26.6830μs 37.4771 KOps/s 37.2219 KOps/s $\color{#35bf28}+0.69\%$
test_compile_copy_nested[tensordict-eager] 0.1588ms 58.9291μs 16.9695 KOps/s 16.9589 KOps/s $\color{#35bf28}+0.06\%$
test_compile_copy_nested[pytree-compile] 0.1653ms 77.5039μs 12.9026 KOps/s 12.7996 KOps/s $\color{#35bf28}+0.80\%$
test_compile_copy_nested[pytree-eager] 0.1392ms 66.6828μs 14.9964 KOps/s 15.0177 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensordict-compile] 0.2057ms 0.1056ms 9.4685 KOps/s 9.5696 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_add_one_flat[tensordict-eager] 0.4523ms 0.2151ms 4.6490 KOps/s 4.6489 KOps/s $+0.00\%$
test_compile_add_one_flat[tensorclass-compile] 0.1105ms 45.8140μs 21.8274 KOps/s 21.3218 KOps/s $\color{#35bf28}+2.37\%$
test_compile_add_one_flat[tensorclass-eager] 0.4830ms 65.8749μs 15.1803 KOps/s 14.7201 KOps/s $\color{#35bf28}+3.13\%$
test_compile_add_one_flat[pytree-compile] 0.2305ms 0.1037ms 9.6428 KOps/s 9.7033 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_add_one_flat[pytree-eager] 0.4559ms 0.2039ms 4.9055 KOps/s 4.9006 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_self_flat[tensordict-eager] 0.4371ms 0.2309ms 4.3318 KOps/s 4.3001 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_self_flat[tensordict-compile] 0.2156ms 0.1053ms 9.4923 KOps/s 9.5090 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_add_self_flat[tensorclass-eager] 0.1812ms 68.4727μs 14.6044 KOps/s 15.3914 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1116ms 48.5733μs 20.5875 KOps/s 20.6031 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_add_self_flat[pytree-eager] 0.3226ms 0.1580ms 6.3292 KOps/s 6.2630 KOps/s $\color{#35bf28}+1.06\%$
test_compile_add_self_flat[pytree-compile] 0.1835ms 0.1049ms 9.5311 KOps/s 9.6944 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_copy_flat[tensordict-compile] 66.7140μs 21.8504μs 45.7657 KOps/s 46.5514 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_copy_flat[tensordict-eager] 0.1863ms 68.1049μs 14.6832 KOps/s 15.0520 KOps/s $\color{#d91a1a}-2.45\%$
test_compile_copy_flat[pytree-compile] 0.1561ms 78.5271μs 12.7345 KOps/s 12.1371 KOps/s $\color{#35bf28}+4.92\%$
test_compile_copy_flat[pytree-eager] 0.1270ms 67.2880μs 14.8615 KOps/s 14.7081 KOps/s $\color{#35bf28}+1.04\%$
test_compile_assign_and_add[tensordict-compile] 0.3952ms 0.2068ms 4.8354 KOps/s 4.8203 KOps/s $\color{#35bf28}+0.31\%$
test_compile_assign_and_add[tensordict-eager] 1.5736ms 1.3274ms 753.3307 Ops/s 770.1640 Ops/s $\color{#d91a1a}-2.19\%$
test_compile_assign_and_add[pytree-compile] 0.2894ms 0.2023ms 4.9426 KOps/s 4.9639 KOps/s $\color{#d91a1a}-0.43\%$
test_compile_assign_and_add[pytree-eager] 1.4272ms 0.7890ms 1.2674 KOps/s 1.2882 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_assign_and_add_stack[compile] 0.6741ms 0.4498ms 2.2231 KOps/s 2.2283 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_assign_and_add_stack[eager] 2.8246ms 2.5675ms 389.4880 Ops/s 376.0270 Ops/s $\color{#35bf28}+3.58\%$
test_compile_indexing[tensor-tensordict-compile] 99.2450μs 36.4206μs 27.4570 KOps/s 26.6802 KOps/s $\color{#35bf28}+2.91\%$
test_compile_indexing[tensor-tensordict-eager] 0.5471ms 33.5497μs 29.8065 KOps/s 19.9655 KOps/s $\textbf{\color{#35bf28}+49.29\%}$
test_compile_indexing[tensor-tensorclass-compile] 98.1040μs 29.3943μs 34.0202 KOps/s 32.8909 KOps/s $\color{#35bf28}+3.43\%$
test_compile_indexing[tensor-tensorclass-eager] 64.9810μs 23.2803μs 42.9548 KOps/s 43.3159 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[tensor-pytree-compile] 80.6700μs 30.4827μs 32.8055 KOps/s 31.1862 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_compile_indexing[tensor-pytree-eager] 71.9940μs 23.0154μs 43.4491 KOps/s 43.7044 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_indexing[slice-tensordict-compile] 0.1165ms 53.5565μs 18.6719 KOps/s 18.7737 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_indexing[slice-tensordict-eager] 0.5461ms 20.4321μs 48.9425 KOps/s 48.4636 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[slice-tensorclass-compile] 0.1229ms 45.8028μs 21.8327 KOps/s 21.5894 KOps/s $\color{#35bf28}+1.13\%$
test_compile_indexing[slice-tensorclass-eager] 0.1009ms 18.4169μs 54.2979 KOps/s 52.8468 KOps/s $\color{#35bf28}+2.75\%$
test_compile_indexing[slice-pytree-compile] 0.1274ms 46.0485μs 21.7162 KOps/s 21.6110 KOps/s $\color{#35bf28}+0.49\%$
test_compile_indexing[slice-pytree-eager] 64.5200μs 18.5391μs 53.9402 KOps/s 53.1209 KOps/s $\color{#35bf28}+1.54\%$
test_compile_indexing[int-tensordict-compile] 0.1253ms 54.9443μs 18.2002 KOps/s 18.1334 KOps/s $\color{#35bf28}+0.37\%$
test_compile_indexing[int-tensordict-eager] 0.9674ms 20.7563μs 48.1781 KOps/s 48.8456 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_indexing[int-tensorclass-compile] 0.1154ms 46.3010μs 21.5978 KOps/s 21.5165 KOps/s $\color{#35bf28}+0.38\%$
test_compile_indexing[int-tensorclass-eager] 75.9410μs 18.6231μs 53.6969 KOps/s 51.9327 KOps/s $\color{#35bf28}+3.40\%$
test_compile_indexing[int-pytree-compile] 0.1594ms 45.9803μs 21.7484 KOps/s 21.5485 KOps/s $\color{#35bf28}+0.93\%$
test_compile_indexing[int-pytree-eager] 0.4666ms 18.6643μs 53.5781 KOps/s 53.3728 KOps/s $\color{#35bf28}+0.38\%$
test_mod_add[eager] 0.1060ms 33.1114μs 30.2010 KOps/s 30.2504 KOps/s $\color{#d91a1a}-0.16\%$
test_mod_add[compile] 0.1246ms 47.5639μs 21.0244 KOps/s 20.4335 KOps/s $\color{#35bf28}+2.89\%$
test_mod_add[compile-overhead] 0.1159ms 48.0872μs 20.7956 KOps/s 20.1383 KOps/s $\color{#35bf28}+3.26\%$
test_mod_wrap[eager] 0.3898ms 0.2184ms 4.5777 KOps/s 4.4456 KOps/s $\color{#35bf28}+2.97\%$
test_mod_wrap[compile] 0.3571ms 0.2052ms 4.8730 KOps/s 4.7115 KOps/s $\color{#35bf28}+3.43\%$
test_mod_wrap[compile-overhead] 0.3905ms 0.2058ms 4.8585 KOps/s 4.7323 KOps/s $\color{#35bf28}+2.67\%$
test_mod_wrap_and_backward[eager] 18.7614ms 12.3832ms 80.7548 Ops/s 89.1418 Ops/s $\textbf{\color{#d91a1a}-9.41\%}$
test_mod_wrap_and_backward[compile] 19.6880ms 13.2030ms 75.7404 Ops/s 91.5235 Ops/s $\textbf{\color{#d91a1a}-17.24\%}$
test_mod_wrap_and_backward[compile-overhead] 32.8897ms 14.6332ms 68.3376 Ops/s 90.7851 Ops/s $\textbf{\color{#d91a1a}-24.73\%}$
test_seq_add[eager] 0.2653ms 0.1157ms 8.6462 KOps/s 8.8955 KOps/s $\color{#d91a1a}-2.80\%$
test_seq_add[compile] 0.1473ms 64.2030μs 15.5756 KOps/s 15.9545 KOps/s $\color{#d91a1a}-2.37\%$
test_seq_add[compile-overhead] 0.1402ms 62.7606μs 15.9336 KOps/s 16.2125 KOps/s $\color{#d91a1a}-1.72\%$
test_seq_wrap[eager] 0.8292ms 0.4355ms 2.2963 KOps/s 2.3047 KOps/s $\color{#d91a1a}-0.36\%$
test_seq_wrap[compile] 0.3803ms 0.2360ms 4.2364 KOps/s 4.3707 KOps/s $\color{#d91a1a}-3.07\%$
test_seq_wrap[compile-overhead] 0.3280ms 0.2321ms 4.3078 KOps/s 4.3553 KOps/s $\color{#d91a1a}-1.09\%$
test_func_call_runtime[False-eager] 0.8185ms 0.5368ms 1.8629 KOps/s 1.8297 KOps/s $\color{#35bf28}+1.82\%$
test_func_call_runtime[False-compile] 1.1707ms 0.4470ms 2.2370 KOps/s 2.3095 KOps/s $\color{#d91a1a}-3.14\%$
test_func_call_runtime[False-compile-overhead] 0.5697ms 0.4324ms 2.3129 KOps/s 2.3704 KOps/s $\color{#d91a1a}-2.43\%$
test_func_call_runtime[True-eager] 0.9393ms 0.7570ms 1.3210 KOps/s 1.3250 KOps/s $\color{#d91a1a}-0.31\%$
test_func_call_runtime[True-compile] 1.1104ms 0.4798ms 2.0842 KOps/s 2.1578 KOps/s $\color{#d91a1a}-3.41\%$
test_func_call_runtime[True-compile-overhead] 0.6037ms 0.4726ms 2.1161 KOps/s 2.1532 KOps/s $\color{#d91a1a}-1.73\%$
test_func_call_cm_runtime[False-eager] 0.8614ms 0.5389ms 1.8557 KOps/s 1.8196 KOps/s $\color{#35bf28}+1.98\%$
test_func_call_cm_runtime[False-compile] 0.5710ms 0.4297ms 2.3271 KOps/s 2.3665 KOps/s $\color{#d91a1a}-1.67\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6586ms 0.4312ms 2.3193 KOps/s 2.3730 KOps/s $\color{#d91a1a}-2.26\%$
test_func_call_cm_runtime[True-eager] 1.8549ms 0.9123ms 1.0961 KOps/s 1.1153 KOps/s $\color{#d91a1a}-1.73\%$
test_func_call_cm_runtime[True-compile] 0.6230ms 0.4913ms 2.0352 KOps/s 2.0275 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_cm_runtime[True-compile-overhead] 0.7629ms 0.4935ms 2.0263 KOps/s 2.0186 KOps/s $\color{#35bf28}+0.38\%$
test_vmap_func_call_cm_runtime[eager] 2.4846ms 1.9114ms 523.1887 Ops/s 518.7763 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_func_call_cm_runtime[compile] 0.8113ms 0.5187ms 1.9281 KOps/s 1.9072 KOps/s $\color{#35bf28}+1.09\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7556ms 0.5203ms 1.9220 KOps/s 1.8951 KOps/s $\color{#35bf28}+1.42\%$
test_distributed 0.2965ms 0.1275ms 7.8425 KOps/s 7.5524 KOps/s $\color{#35bf28}+3.84\%$
test_tdmodule 0.1066ms 25.8931μs 38.6203 KOps/s 40.1292 KOps/s $\color{#d91a1a}-3.76\%$
test_tdmodule_dispatch 66.0230μs 44.8675μs 22.2878 KOps/s 22.1545 KOps/s $\color{#35bf28}+0.60\%$
test_tdseq 89.6770μs 27.3578μs 36.5527 KOps/s 36.4016 KOps/s $\color{#35bf28}+0.41\%$
test_tdseq_dispatch 73.1260μs 50.7502μs 19.7044 KOps/s 19.7891 KOps/s $\color{#d91a1a}-0.43\%$
test_instantiation_functorch 2.0496ms 1.5278ms 654.5528 Ops/s 647.6301 Ops/s $\color{#35bf28}+1.07\%$
test_exec_functorch 0.3088ms 0.1837ms 5.4443 KOps/s 5.5383 KOps/s $\color{#d91a1a}-1.70\%$
test_exec_functional_call 0.3283ms 0.1696ms 5.8950 KOps/s 5.7831 KOps/s $\color{#35bf28}+1.94\%$
test_exec_td_decorator 0.4835ms 0.2314ms 4.3219 KOps/s 4.3544 KOps/s $\color{#d91a1a}-0.75\%$
test_vmap_mlp_speed_decorator[True-True] 0.9365ms 0.6477ms 1.5439 KOps/s 1.5631 KOps/s $\color{#d91a1a}-1.22\%$
test_vmap_mlp_speed_decorator[True-False] 0.8869ms 0.6481ms 1.5429 KOps/s 1.5219 KOps/s $\color{#35bf28}+1.38\%$
test_vmap_mlp_speed_decorator[False-True] 0.9528ms 0.5315ms 1.8815 KOps/s 1.9026 KOps/s $\color{#d91a1a}-1.11\%$
test_vmap_mlp_speed_decorator[False-False] 0.8375ms 0.5266ms 1.8990 KOps/s 1.8994 KOps/s $\color{#d91a1a}-0.02\%$
test_to_module_speed[True] 1.7222ms 1.3022ms 767.9307 Ops/s 762.1016 Ops/s $\color{#35bf28}+0.76\%$
test_to_module_speed[False] 2.1433ms 1.3091ms 763.8774 Ops/s 774.7020 Ops/s $\color{#d91a1a}-1.40\%$
test_tc_init 76.9430μs 44.6756μs 22.3836 KOps/s 23.0957 KOps/s $\color{#d91a1a}-3.08\%$
test_tc_init_nested 0.1454ms 88.0696μs 11.3547 KOps/s 11.7584 KOps/s $\color{#d91a1a}-3.43\%$
test_tc_first_layer_tensor 26.2990μs 1.5299μs 653.6499 KOps/s 622.6048 KOps/s $\color{#35bf28}+4.99\%$
test_tc_first_layer_nontensor 44.6140μs 4.8414μs 206.5514 KOps/s 208.6584 KOps/s $\color{#d91a1a}-1.01\%$
test_tc_second_layer_tensor 26.5600μs 2.8651μs 349.0242 KOps/s 335.2882 KOps/s $\color{#35bf28}+4.10\%$
test_tc_second_layer_nontensor 51.0850μs 6.2551μs 159.8684 KOps/s 163.1778 KOps/s $\color{#d91a1a}-2.03\%$
test_unbind 0.2239s 13.2090ms 75.7059 Ops/s 72.3091 Ops/s $\color{#35bf28}+4.70\%$
test_full_like 17.5864ms 11.9493ms 83.6870 Ops/s 107.1523 Ops/s $\textbf{\color{#d91a1a}-21.90\%}$
test_zeros_like 13.7941ms 7.3238ms 136.5420 Ops/s 305.3244 Ops/s $\textbf{\color{#d91a1a}-55.28\%}$
test_ones_like 11.9725ms 7.9642ms 125.5623 Ops/s 245.7678 Ops/s $\textbf{\color{#d91a1a}-48.91\%}$
test_clone 16.6064ms 9.9518ms 100.4839 Ops/s 173.1246 Ops/s $\textbf{\color{#d91a1a}-41.96\%}$
test_squeeze 60.3430μs 12.0382μs 83.0692 KOps/s 81.6873 KOps/s $\color{#35bf28}+1.69\%$
test_unsqueeze 0.1564ms 91.7524μs 10.8989 KOps/s 10.6202 KOps/s $\color{#35bf28}+2.62\%$
test_split 0.4754ms 0.2006ms 4.9848 KOps/s 5.0834 KOps/s $\color{#d91a1a}-1.94\%$
test_permute 0.3376ms 0.2063ms 4.8474 KOps/s 4.8665 KOps/s $\color{#d91a1a}-0.39\%$
test_stack 28.2702ms 25.5462ms 39.1447 Ops/s 37.0507 Ops/s $\textbf{\color{#35bf28}+5.65\%}$
test_cat 33.5946ms 25.6949ms 38.9183 Ops/s 37.7872 Ops/s $\color{#35bf28}+2.99\%$
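
Judging from the rows above, the Ops column looks like the reciprocal of the Mean time, and the Change column looks like the relative difference between Ops and Ops on Repo HEAD. The sketch below checks that reading against one row. It is inferred from the table values only, not taken from the benchmark tooling, and the function names are hypothetical.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this run relative to repo HEAD (assumed formula)."""
    return (ops - ops_head) / ops_head * 100.0


# Values copied from the test_plain_set_nested row of the CPU table:
# Mean 19.2152 µs, Ops 52.0420 KOps/s, Ops on Repo HEAD 51.0316 KOps/s.
kops = ops_per_second(19.2152e-6) / 1e3
change = relative_change(52.0420, 51.0316)
print(f"{kops:.4f} KOps/s, {change:+.2f}%")
```

Under this reading, a positive Change means higher throughput than HEAD. Small values, roughly ±1 to 3% here, may just be run-to-run noise rather than an effect of the change.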

github-actions bot commented Jan 9, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 37.0010μs 12.5188μs 79.8801 KOps/s 80.0329 KOps/s $\color{#d91a1a}-0.19\%$
test_plain_set_stack_nested 41.5300μs 12.5566μs 79.6397 KOps/s 79.0067 KOps/s $\color{#35bf28}+0.80\%$
test_plain_set_nested_inplace 58.6810μs 13.6018μs 73.5195 KOps/s 72.8390 KOps/s $\color{#35bf28}+0.93\%$
test_plain_set_stack_nested_inplace 47.1010μs 13.5477μs 73.8132 KOps/s 72.7388 KOps/s $\color{#35bf28}+1.48\%$
test_items 27.7600μs 2.8990μs 344.9478 KOps/s 343.9594 KOps/s $\color{#35bf28}+0.29\%$
test_items_nested 0.3996ms 0.3688ms 2.7116 KOps/s 2.7296 KOps/s $\color{#d91a1a}-0.66\%$
test_items_nested_locked 0.4094ms 0.3685ms 2.7137 KOps/s 2.7247 KOps/s $\color{#d91a1a}-0.40\%$
test_items_nested_leaf 80.5710μs 57.9403μs 17.2592 KOps/s 17.1785 KOps/s $\color{#35bf28}+0.47\%$
test_items_stack_nested 0.4046ms 0.3655ms 2.7363 KOps/s 2.7583 KOps/s $\color{#d91a1a}-0.80\%$
test_items_stack_nested_leaf 99.3010μs 58.4658μs 17.1040 KOps/s 16.7118 KOps/s $\color{#35bf28}+2.35\%$
test_items_stack_nested_locked 0.3923ms 0.3643ms 2.7449 KOps/s 2.7560 KOps/s $\color{#d91a1a}-0.40\%$
test_keys 27.5300μs 3.4394μs 290.7523 KOps/s 292.9517 KOps/s $\color{#d91a1a}-0.75\%$
test_keys_nested 0.1235ms 81.4099μs 12.2835 KOps/s 12.3109 KOps/s $\color{#d91a1a}-0.22\%$
test_keys_nested_locked 0.8126ms 87.8899μs 11.3779 KOps/s 11.4434 KOps/s $\color{#d91a1a}-0.57\%$
test_keys_nested_leaf 0.1079ms 72.8155μs 13.7333 KOps/s 13.7388 KOps/s $\color{#d91a1a}-0.04\%$
test_keys_stack_nested 0.1334ms 81.6808μs 12.2428 KOps/s 11.9294 KOps/s $\color{#35bf28}+2.63\%$
test_keys_stack_nested_leaf 0.1013ms 73.4718μs 13.6107 KOps/s 13.4851 KOps/s $\color{#35bf28}+0.93\%$
test_keys_stack_nested_locked 0.1285ms 87.3851μs 11.4436 KOps/s 11.2448 KOps/s $\color{#35bf28}+1.77\%$
test_values 7.1633μs 0.8476μs 1.1798 MOps/s 1.1700 MOps/s $\color{#35bf28}+0.84\%$
test_values_nested 57.3810μs 34.7109μs 28.8094 KOps/s 29.4054 KOps/s $\color{#d91a1a}-2.03\%$
test_values_nested_locked 72.5910μs 36.6211μs 27.3066 KOps/s 27.6969 KOps/s $\color{#d91a1a}-1.41\%$
test_values_nested_leaf 64.6110μs 39.8739μs 25.0791 KOps/s 25.5029 KOps/s $\color{#d91a1a}-1.66\%$
test_values_stack_nested 63.1210μs 34.7498μs 28.7771 KOps/s 28.7484 KOps/s $\color{#35bf28}+0.10\%$
test_values_stack_nested_leaf 65.0610μs 40.3667μs 24.7729 KOps/s 25.1811 KOps/s $\color{#d91a1a}-1.62\%$
test_values_stack_nested_locked 78.5900μs 36.4779μs 27.4139 KOps/s 27.3318 KOps/s $\color{#35bf28}+0.30\%$
test_membership 2.0256μs 0.5003μs 1.9989 MOps/s 1.9700 MOps/s $\color{#35bf28}+1.47\%$
test_membership_nested 15.4205μs 1.9540μs 511.7581 KOps/s 505.3811 KOps/s $\color{#35bf28}+1.26\%$
test_membership_nested_leaf 20.3650μs 1.9827μs 504.3511 KOps/s 516.8094 KOps/s $\color{#d91a1a}-2.41\%$
test_membership_stacked_nested 21.7600μs 2.0349μs 491.4253 KOps/s 497.3713 KOps/s $\color{#d91a1a}-1.20\%$
test_membership_stacked_nested_leaf 35.6110μs 2.0236μs 494.1697 KOps/s 489.9206 KOps/s $\color{#35bf28}+0.87\%$
test_membership_nested_last 29.5810μs 3.0407μs 328.8701 KOps/s 328.5347 KOps/s $\color{#35bf28}+0.10\%$
test_membership_nested_leaf_last 28.5100μs 3.0561μs 327.2150 KOps/s 326.2976 KOps/s $\color{#35bf28}+0.28\%$
test_membership_stacked_nested_last 24.4410μs 3.0518μs 327.6714 KOps/s 284.4344 KOps/s $\textbf{\color{#35bf28}+15.20\%}$
test_membership_stacked_nested_leaf_last 23.3000μs 3.0357μs 329.4179 KOps/s 280.9972 KOps/s $\textbf{\color{#35bf28}+17.23\%}$
test_nested_getleaf 31.9410μs 6.1950μs 161.4198 KOps/s 160.7125 KOps/s $\color{#35bf28}+0.44\%$
test_nested_get 28.7500μs 5.8798μs 170.0734 KOps/s 166.9513 KOps/s $\color{#35bf28}+1.87\%$
test_stacked_getleaf 28.0900μs 6.1740μs 161.9707 KOps/s 161.9361 KOps/s $\color{#35bf28}+0.02\%$
test_stacked_get 34.9200μs 5.8458μs 171.0624 KOps/s 168.9279 KOps/s $\color{#35bf28}+1.26\%$
test_nested_getitemleaf 25.7800μs 6.2228μs 160.7005 KOps/s 156.0294 KOps/s $\color{#35bf28}+2.99\%$
test_nested_getitem 29.9100μs 5.9378μs 168.4119 KOps/s 162.6852 KOps/s $\color{#35bf28}+3.52\%$
test_stacked_getitemleaf 25.8000μs 6.2487μs 160.0342 KOps/s 159.6150 KOps/s $\color{#35bf28}+0.26\%$
test_stacked_getitem 33.2800μs 5.9520μs 168.0111 KOps/s 167.8567 KOps/s $\color{#35bf28}+0.09\%$
test_lock_nested 0.6962ms 0.3745ms 2.6705 KOps/s 2.7282 KOps/s $\color{#d91a1a}-2.11\%$
test_lock_stack_nested 0.4291ms 0.3398ms 2.9433 KOps/s 2.9261 KOps/s $\color{#35bf28}+0.59\%$
test_unlock_nested 0.7327ms 0.3099ms 3.2269 KOps/s 3.2000 KOps/s $\color{#35bf28}+0.84\%$
test_unlock_stack_nested 0.3148ms 0.2785ms 3.5908 KOps/s 3.5847 KOps/s $\color{#35bf28}+0.17\%$
test_flatten_speed 0.1179ms 76.4836μs 13.0747 KOps/s 13.4873 KOps/s $\color{#d91a1a}-3.06\%$
test_unflatten_speed 0.3796ms 0.3218ms 3.1077 KOps/s 3.0714 KOps/s $\color{#35bf28}+1.18\%$
test_common_ops 92.7383ms 0.6916ms 1.4460 KOps/s 1.6051 KOps/s $\textbf{\color{#d91a1a}-9.92\%}$
test_creation 0.1021ms 1.7092μs 585.0570 KOps/s 586.2526 KOps/s $\color{#d91a1a}-0.20\%$
test_creation_empty 41.4800μs 8.3503μs 119.7569 KOps/s 114.9500 KOps/s $\color{#35bf28}+4.18\%$
test_creation_nested_1 38.8910μs 10.0071μs 99.9289 KOps/s 97.3676 KOps/s $\color{#35bf28}+2.63\%$
test_creation_nested_2 35.9410μs 12.7866μs 78.2069 KOps/s 76.7018 KOps/s $\color{#35bf28}+1.96\%$
test_clone 84.9310μs 10.4684μs 95.5259 KOps/s 93.0631 KOps/s $\color{#35bf28}+2.65\%$
test_getitem[int] 1.3687ms 10.3089μs 97.0035 KOps/s 95.0068 KOps/s $\color{#35bf28}+2.10\%$
test_getitem[slice_int] 0.1059ms 20.1426μs 49.6459 KOps/s 48.9437 KOps/s $\color{#35bf28}+1.43\%$
test_getitem[range] 0.1290ms 36.0333μs 27.7521 KOps/s 27.3860 KOps/s $\color{#35bf28}+1.34\%$
test_getitem[tuple] 0.1255ms 17.7875μs 56.2192 KOps/s 55.7958 KOps/s $\color{#35bf28}+0.76\%$
test_getitem[list] 0.1568ms 31.6825μs 31.5632 KOps/s 31.2164 KOps/s $\color{#35bf28}+1.11\%$
test_setitem_dim[int] 41.0110μs 18.0088μs 55.5285 KOps/s 56.4482 KOps/s $\color{#d91a1a}-1.63\%$
test_setitem_dim[slice_int] 58.6900μs 37.3856μs 26.7483 KOps/s 26.6188 KOps/s $\color{#35bf28}+0.49\%$
test_setitem_dim[range] 74.9410μs 50.1389μs 19.9446 KOps/s 19.6010 KOps/s $\color{#35bf28}+1.75\%$
test_setitem_dim[tuple] 50.6300μs 30.4139μs 32.8797 KOps/s 31.4872 KOps/s $\color{#35bf28}+4.42\%$
test_setitem 59.8400μs 15.3885μs 64.9836 KOps/s 65.0240 KOps/s $\color{#d91a1a}-0.06\%$
test_set 60.0510μs 14.2099μs 70.3735 KOps/s 66.5673 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_set_shared 1.5586ms 0.1516ms 6.5962 KOps/s 6.6082 KOps/s $\color{#d91a1a}-0.18\%$
test_update 0.6421ms 17.7683μs 56.2801 KOps/s 54.6473 KOps/s $\color{#35bf28}+2.99\%$
test_update_nested 79.1200μs 23.2337μs 43.0410 KOps/s 42.0235 KOps/s $\color{#35bf28}+2.42\%$
test_update__nested 1.0325ms 25.3928μs 39.3812 KOps/s 39.1339 KOps/s $\color{#35bf28}+0.63\%$
test_set_nested 82.3210μs 16.0422μs 62.3356 KOps/s 61.0170 KOps/s $\color{#35bf28}+2.16\%$
test_set_nested_new 65.0310μs 18.4152μs 54.3028 KOps/s 54.1945 KOps/s $\color{#35bf28}+0.20\%$
test_select 84.5210μs 30.0055μs 33.3272 KOps/s 32.9180 KOps/s $\color{#35bf28}+1.24\%$
test_select_nested 71.6300μs 43.5088μs 22.9838 KOps/s 22.5932 KOps/s $\color{#35bf28}+1.73\%$
test_exclude_nested 91.7610μs 61.5148μs 16.2562 KOps/s 15.8058 KOps/s $\color{#35bf28}+2.85\%$
test_empty[True] 0.3477ms 0.2912ms 3.4339 KOps/s 3.4240 KOps/s $\color{#35bf28}+0.29\%$
test_empty[False] 4.7260μs 0.8287μs 1.2067 MOps/s 1.2032 MOps/s $\color{#35bf28}+0.30\%$
test_to 86.0510μs 55.9379μs 17.8770 KOps/s 17.1777 KOps/s $\color{#35bf28}+4.07\%$
test_to_nonblocking 86.3410μs 47.1685μs 21.2006 KOps/s 21.3889 KOps/s $\color{#d91a1a}-0.88\%$
test_unbind_speed 0.7915ms 0.2322ms 4.3070 KOps/s 4.2830 KOps/s $\color{#35bf28}+0.56\%$
test_unbind_speed_stack0 0.2876ms 0.2313ms 4.3229 KOps/s 4.2208 KOps/s $\color{#35bf28}+2.42\%$
test_unbind_speed_stack1 92.7435ms 0.6701ms 1.4924 KOps/s 1.4872 KOps/s $\color{#35bf28}+0.35\%$
test_split 93.3258ms 1.5670ms 638.1446 Ops/s 641.8231 Ops/s $\color{#d91a1a}-0.57\%$
test_chunk 95.6269ms 1.5843ms 631.1746 Ops/s 584.8946 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
test_consolidate[False-None] 96.1606ms 2.8759ms 347.7146 Ops/s 377.6750 Ops/s $\textbf{\color{#d91a1a}-7.93\%}$
test_consolidate[default-None] 1.7414ms 1.6607ms 602.1547 Ops/s 601.3845 Ops/s $\color{#35bf28}+0.13\%$
test_consolidate[reduce-overhead-None] 1.7980ms 1.7067ms 585.9203 Ops/s 589.3471 Ops/s $\color{#d91a1a}-0.58\%$
test_consolidate_njt[False-None] 6.7077ms 6.4625ms 154.7397 Ops/s 156.6123 Ops/s $\color{#d91a1a}-1.20\%$
test_to[False-False-None] 1.8166ms 1.7281ms 578.6705 Ops/s 580.7515 Ops/s $\color{#d91a1a}-0.36\%$
test_to[True-False-None] 1.4801ms 1.2614ms 792.7999 Ops/s 771.2170 Ops/s $\color{#35bf28}+2.80\%$
test_to[within-False-None] 4.1484ms 4.0275ms 248.2939 Ops/s 244.5723 Ops/s $\color{#35bf28}+1.52\%$
test_to[True-default-None] 5.3955ms 5.1616ms 193.7388 Ops/s 193.5386 Ops/s $\color{#35bf28}+0.10\%$
test_to_njt[False-False-None] 7.0021ms 6.8110ms 146.8205 Ops/s 143.7690 Ops/s $\color{#35bf28}+2.12\%$
test_to_njt[True-False-None] 5.5646ms 5.4007ms 185.1612 Ops/s 184.2728 Ops/s $\color{#35bf28}+0.48\%$
test_to_njt[within-False-None] 12.0056ms 11.8612ms 84.3084 Ops/s 83.9617 Ops/s $\color{#35bf28}+0.41\%$
test_creation[device0] 0.4702ms 78.3042μs 12.7707 KOps/s 11.9571 KOps/s $\textbf{\color{#35bf28}+6.80\%}$
test_creation_from_tensor 0.4567ms 82.3682μs 12.1406 KOps/s 12.0119 KOps/s $\color{#35bf28}+1.07\%$
test_add_one[memmap_tensor0] 0.4235ms 6.5889μs 151.7711 KOps/s 151.0365 KOps/s $\color{#35bf28}+0.49\%$
test_contiguous[memmap_tensor0] 2.4956μs 0.4159μs 2.4044 MOps/s 2.4234 MOps/s $\color{#d91a1a}-0.78\%$
test_stack[memmap_tensor0] 48.7410μs 4.2584μs 234.8301 KOps/s 232.6309 KOps/s $\color{#35bf28}+0.95\%$
test_memmaptd_index 1.9028ms 0.2443ms 4.0931 KOps/s 4.1558 KOps/s $\color{#d91a1a}-1.51\%$
test_memmaptd_index_astensor 0.5841ms 0.3063ms 3.2648 KOps/s 3.3038 KOps/s $\color{#d91a1a}-1.18\%$
test_memmaptd_index_op 1.0517ms 0.5800ms 1.7242 KOps/s 1.7296 KOps/s $\color{#d91a1a}-0.32\%$
test_serialize_model 0.1312s 0.1299s 7.7000 Ops/s 7.6000 Ops/s $\color{#35bf28}+1.32\%$
test_serialize_model_pickle 1.3500s 1.1897s 0.8406 Ops/s 0.8206 Ops/s $\color{#35bf28}+2.43\%$
test_serialize_weights 0.1306s 0.1300s 7.6947 Ops/s 7.6648 Ops/s $\color{#35bf28}+0.39\%$
test_serialize_weights_returnearly 0.3257s 54.7666ms 18.2593 Ops/s 23.0654 Ops/s $\textbf{\color{#d91a1a}-20.84\%}$
test_serialize_weights_pickle 1.3912s 1.2224s 0.8181 Ops/s 0.8223 Ops/s $\color{#d91a1a}-0.51\%$
test_reshape_pytree 52.3910μs 21.5158μs 46.4775 KOps/s 45.3733 KOps/s $\color{#35bf28}+2.43\%$
test_reshape_td 61.6600μs 25.7712μs 38.8030 KOps/s 37.0026 KOps/s $\color{#35bf28}+4.87\%$
test_view_pytree 55.2100μs 21.5056μs 46.4996 KOps/s 46.3725 KOps/s $\color{#35bf28}+0.27\%$
test_view_td 63.2310μs 30.9787μs 32.2802 KOps/s 31.0953 KOps/s $\color{#35bf28}+3.81\%$
test_unbind_pytree 58.9310μs 27.4867μs 36.3813 KOps/s 35.9661 KOps/s $\color{#35bf28}+1.15\%$
test_unbind_td 0.8216ms 35.9159μs 27.8428 KOps/s 27.1941 KOps/s $\color{#35bf28}+2.39\%$
test_split_pytree 80.2510μs 29.5137μs 33.8826 KOps/s 32.7789 KOps/s $\color{#35bf28}+3.37\%$
test_split_td 1.0025ms 37.8828μs 26.3972 KOps/s 25.8480 KOps/s $\color{#35bf28}+2.12\%$
test_add_pytree 60.8510μs 34.0447μs 29.3731 KOps/s 29.1834 KOps/s $\color{#35bf28}+0.65\%$
test_add_td 86.0010μs 48.5788μs 20.5851 KOps/s 20.3708 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_one_nested[tensordict-compile] 0.1707ms 0.1195ms 8.3689 KOps/s 8.0915 KOps/s $\color{#35bf28}+3.43\%$
test_compile_add_one_nested[tensordict-eager] 0.2253ms 0.1306ms 7.6571 KOps/s 7.6286 KOps/s $\color{#35bf28}+0.37\%$
test_compile_add_one_nested[pytree-compile] 0.1341ms 95.2122μs 10.5029 KOps/s 10.4605 KOps/s $\color{#35bf28}+0.41\%$
test_compile_add_one_nested[pytree-eager] 0.1959ms 0.1488ms 6.7198 KOps/s 6.3551 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_compile_copy_nested[tensordict-compile] 61.5310μs 22.0593μs 45.3323 KOps/s 42.5100 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_compile_copy_nested[tensordict-eager] 63.3210μs 28.8569μs 34.6537 KOps/s 34.0985 KOps/s $\color{#35bf28}+1.63\%$
test_compile_copy_nested[pytree-compile] 0.1549ms 61.5226μs 16.2542 KOps/s 15.5906 KOps/s $\color{#35bf28}+4.26\%$
test_compile_copy_nested[pytree-eager] 79.5210μs 48.2092μs 20.7429 KOps/s 20.4843 KOps/s $\color{#35bf28}+1.26\%$
test_compile_add_one_flat[tensordict-compile] 0.1809ms 0.1409ms 7.0951 KOps/s 6.9981 KOps/s $\color{#35bf28}+1.39\%$
test_compile_add_one_flat[tensordict-eager] 0.3160ms 0.2176ms 4.5953 KOps/s 4.6168 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_add_one_flat[tensorclass-compile] 0.1390ms 96.4056μs 10.3728 KOps/s 10.2063 KOps/s $\color{#35bf28}+1.63\%$
test_compile_add_one_flat[tensorclass-eager] 0.1139ms 54.5201μs 18.3419 KOps/s 18.0735 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_flat[pytree-compile] 0.1838ms 0.1346ms 7.4319 KOps/s 7.3202 KOps/s $\color{#35bf28}+1.53\%$
test_compile_add_one_flat[pytree-eager] 0.5358ms 0.4798ms 2.0840 KOps/s 2.0722 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_self_flat[tensordict-eager] 0.4291ms 0.2624ms 3.8107 KOps/s 3.8166 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_add_self_flat[tensordict-compile] 0.1866ms 0.1425ms 7.0153 KOps/s 6.9515 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_self_flat[tensorclass-eager] 0.1699ms 68.2294μs 14.6564 KOps/s 14.7873 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_add_self_flat[tensorclass-compile] 0.1379ms 99.8503μs 10.0150 KOps/s 10.0425 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_add_self_flat[pytree-eager] 0.4744ms 0.4092ms 2.4440 KOps/s 2.4196 KOps/s $\color{#35bf28}+1.01\%$
test_compile_add_self_flat[pytree-compile] 0.1735ms 0.1340ms 7.4630 KOps/s 7.4516 KOps/s $\color{#35bf28}+0.15\%$
test_compile_copy_flat[tensordict-compile] 61.1110μs 18.5762μs 53.8325 KOps/s 54.9604 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_copy_flat[tensordict-eager] 0.1356ms 31.2685μs 31.9810 KOps/s 31.5902 KOps/s $\color{#35bf28}+1.24\%$
test_compile_copy_flat[pytree-compile] 0.1018ms 69.8348μs 14.3195 KOps/s 14.3618 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_copy_flat[pytree-eager] 0.1574ms 50.8125μs 19.6802 KOps/s 19.4560 KOps/s $\color{#35bf28}+1.15\%$
test_compile_assign_and_add[tensordict-compile] 1.6466ms 0.3932ms 2.5435 KOps/s 2.2166 KOps/s $\textbf{\color{#35bf28}+14.74\%}$
test_compile_assign_and_add[tensordict-eager] 2.8041ms 2.6182ms 381.9356 Ops/s 385.3983 Ops/s $\color{#d91a1a}-0.90\%$
test_compile_assign_and_add[pytree-compile] 1.5581ms 0.4226ms 2.3666 KOps/s 2.1802 KOps/s $\textbf{\color{#35bf28}+8.55\%}$
test_compile_assign_and_add[pytree-eager] 2.8240ms 2.6402ms 378.7528 Ops/s 368.5434 Ops/s $\color{#35bf28}+2.77\%$
test_compile_indexing[tensor-tensordict-compile] 0.1685ms 0.1166ms 8.5780 KOps/s 8.3635 KOps/s $\color{#35bf28}+2.57\%$
test_compile_indexing[tensor-tensordict-eager] 0.5671ms 81.7840μs 12.2273 KOps/s 12.0907 KOps/s $\color{#35bf28}+1.13\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1653ms 0.1119ms 8.9347 KOps/s 9.0228 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1310ms 70.9641μs 14.0916 KOps/s 14.0076 KOps/s $\color{#35bf28}+0.60\%$
test_compile_indexing[tensor-pytree-compile] 0.1586ms 0.1130ms 8.8502 KOps/s 8.9395 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_indexing[tensor-pytree-eager] 0.1578ms 69.6447μs 14.3586 KOps/s 14.3437 KOps/s $\color{#35bf28}+0.10\%$
test_compile_indexing[slice-tensordict-compile] 0.1726ms 99.2066μs 10.0800 KOps/s 10.0267 KOps/s $\color{#35bf28}+0.53\%$
test_compile_indexing[slice-tensordict-eager] 0.1451ms 16.8482μs 59.3534 KOps/s 59.2278 KOps/s $\color{#35bf28}+0.21\%$
test_compile_indexing[slice-tensorclass-compile] 0.1408ms 95.0165μs 10.5245 KOps/s 10.4107 KOps/s $\color{#35bf28}+1.09\%$
test_compile_indexing[slice-tensorclass-eager] 45.4210μs 15.4719μs 64.6333 KOps/s 54.3181 KOps/s $\textbf{\color{#35bf28}+18.99\%}$
test_compile_indexing[slice-pytree-compile] 0.1634ms 97.2179μs 10.2862 KOps/s 10.4021 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_indexing[slice-pytree-eager] 42.5500μs 15.3754μs 65.0391 KOps/s 64.6313 KOps/s $\color{#35bf28}+0.63\%$
test_compile_indexing[int-tensordict-compile] 0.1742ms 0.1002ms 9.9829 KOps/s 9.9988 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_indexing[int-tensordict-eager] 0.6538ms 16.3449μs 61.1812 KOps/s 59.0405 KOps/s $\color{#35bf28}+3.63\%$
test_compile_indexing[int-tensorclass-compile] 0.1368ms 95.2816μs 10.4952 KOps/s 10.4152 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[int-tensorclass-eager] 44.6300μs 15.4021μs 64.9264 KOps/s 64.7556 KOps/s $\color{#35bf28}+0.26\%$
test_compile_indexing[int-pytree-compile] 0.1367ms 95.4694μs 10.4746 KOps/s 10.4304 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[int-pytree-eager] 0.4078ms 15.2432μs 65.6031 KOps/s 64.7530 KOps/s $\color{#35bf28}+1.31\%$
test_mod_add[eager] 83.4710μs 37.2062μs 26.8772 KOps/s 26.2211 KOps/s $\color{#35bf28}+2.50\%$
test_mod_add[compile] 0.1409ms 77.3299μs 12.9316 KOps/s 12.7840 KOps/s $\color{#35bf28}+1.15\%$
test_mod_add[compile-overhead] 0.3165ms 0.1627ms 6.1466 KOps/s 5.7768 KOps/s $\textbf{\color{#35bf28}+6.40\%}$
test_mod_wrap[eager] 0.3259ms 0.2458ms 4.0682 KOps/s 3.9393 KOps/s $\color{#35bf28}+3.27\%$
test_mod_wrap[compile] 0.3301ms 0.2758ms 3.6253 KOps/s 3.5398 KOps/s $\color{#35bf28}+2.42\%$
test_mod_wrap[compile-overhead] 7.2760ms 3.8186ms 261.8732 Ops/s 266.2174 Ops/s $\color{#d91a1a}-1.63\%$
test_mod_wrap_and_backward[eager] 1.4728ms 1.3591ms 735.7725 Ops/s 704.1571 Ops/s $\color{#35bf28}+4.49\%$
test_mod_wrap_and_backward[compile] 1.3444ms 1.2423ms 804.9840 Ops/s 793.3029 Ops/s $\color{#35bf28}+1.47\%$
test_mod_wrap_and_backward[compile-overhead] 1.3542ms 0.9182ms 1.0891 KOps/s 1.0911 KOps/s $\color{#d91a1a}-0.18\%$
test_seq_add[eager] 0.1716ms 0.1190ms 8.4015 KOps/s 8.5123 KOps/s $\color{#d91a1a}-1.30\%$
test_seq_add[compile] 0.1577ms 90.2273μs 11.0831 KOps/s 11.6253 KOps/s $\color{#d91a1a}-4.66\%$
test_seq_add[compile-overhead] 0.2082ms 0.1295ms 7.7248 KOps/s 7.8448 KOps/s $\color{#d91a1a}-1.53\%$
test_seq_wrap[eager] 0.4847ms 0.4297ms 2.3273 KOps/s 2.3588 KOps/s $\color{#d91a1a}-1.33\%$
test_seq_wrap[compile] 0.3785ms 0.3016ms 3.3159 KOps/s 3.3539 KOps/s $\color{#d91a1a}-1.13\%$
test_seq_wrap[compile-overhead] 0.3216ms 0.2234ms 4.4768 KOps/s 4.4706 KOps/s $\color{#35bf28}+0.14\%$
test_func_call_runtime[False-eager] 0.8457ms 0.7410ms 1.3494 KOps/s 1.3447 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[False-compile] 1.3366ms 0.7313ms 1.3675 KOps/s 1.3595 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_runtime[False-compile-overhead] 0.4008ms 0.3560ms 2.8089 KOps/s 2.8130 KOps/s $\color{#d91a1a}-0.15\%$
test_func_call_runtime[True-eager] 0.9374ms 0.8911ms 1.1223 KOps/s 1.0983 KOps/s $\color{#35bf28}+2.19\%$
test_func_call_runtime[True-compile] 0.8025ms 0.7453ms 1.3418 KOps/s 1.3295 KOps/s $\color{#35bf28}+0.92\%$
test_func_call_runtime[True-compile-overhead] 0.4624ms 0.3772ms 2.6508 KOps/s 2.6598 KOps/s $\color{#d91a1a}-0.34\%$
test_func_call_cm_runtime[False-eager] 0.7811ms 0.7323ms 1.3656 KOps/s 1.3567 KOps/s $\color{#35bf28}+0.65\%$
test_func_call_cm_runtime[False-compile] 0.8046ms 0.7305ms 1.3689 KOps/s 1.3523 KOps/s $\color{#35bf28}+1.23\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5019ms 0.3590ms 2.7854 KOps/s 2.8007 KOps/s $\color{#d91a1a}-0.54\%$
test_func_call_cm_runtime[True-eager] 1.1180ms 0.9910ms 1.0091 KOps/s 977.6131 Ops/s $\color{#35bf28}+3.22\%$
test_func_call_cm_runtime[True-compile] 0.8819ms 0.7775ms 1.2862 KOps/s 1.2815 KOps/s $\color{#35bf28}+0.36\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4809ms 0.4042ms 2.4738 KOps/s 2.4694 KOps/s $\color{#35bf28}+0.18\%$
test_vmap_func_call_cm_runtime[eager] 2.5696ms 2.0890ms 478.7003 Ops/s 474.8063 Ops/s $\color{#35bf28}+0.82\%$
test_vmap_func_call_cm_runtime[compile] 0.8802ms 0.7888ms 1.2678 KOps/s 1.2500 KOps/s $\color{#35bf28}+1.42\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4528ms 0.4051ms 2.4685 KOps/s 2.4547 KOps/s $\color{#35bf28}+0.56\%$
test_distributed 2.1695ms 0.1810ms 5.5247 KOps/s 8.3600 KOps/s $\textbf{\color{#d91a1a}-33.92\%}$
test_tdmodule 62.1410μs 20.5289μs 48.7117 KOps/s 49.2836 KOps/s $\color{#d91a1a}-1.16\%$
test_tdmodule_dispatch 75.3510μs 36.9682μs 27.0502 KOps/s 27.4286 KOps/s $\color{#d91a1a}-1.38\%$
test_tdseq 42.9210μs 21.2880μs 46.9748 KOps/s 47.1724 KOps/s $\color{#d91a1a}-0.42\%$
test_tdseq_dispatch 61.6110μs 38.9192μs 25.6943 KOps/s 25.8499 KOps/s $\color{#d91a1a}-0.60\%$
test_instantiation_functorch 1.5959ms 1.5113ms 661.6825 Ops/s 657.5310 Ops/s $\color{#35bf28}+0.63\%$
test_exec_functorch 0.1854ms 0.1425ms 7.0168 KOps/s 6.8406 KOps/s $\color{#35bf28}+2.58\%$
test_exec_functional_call 0.1761ms 0.1339ms 7.4701 KOps/s 7.1643 KOps/s $\color{#35bf28}+4.27\%$
test_exec_td_decorator 0.3725ms 0.1830ms 5.4656 KOps/s 5.3771 KOps/s $\color{#35bf28}+1.65\%$
test_vmap_mlp_speed_decorator[True-True] 0.7487ms 0.6829ms 1.4643 KOps/s 1.4469 KOps/s $\color{#35bf28}+1.20\%$
test_vmap_mlp_speed_decorator[True-False] 0.8644ms 0.6803ms 1.4700 KOps/s 1.4379 KOps/s $\color{#35bf28}+2.23\%$
test_vmap_mlp_speed_decorator[False-True] 0.7318ms 0.5986ms 1.6707 KOps/s 1.6046 KOps/s $\color{#35bf28}+4.11\%$
test_vmap_mlp_speed_decorator[False-False] 0.6989ms 0.5953ms 1.6797 KOps/s 1.6515 KOps/s $\color{#35bf28}+1.71\%$
test_vmap_transformer_speed_decorator[True-True] 19.2204ms 19.1402ms 52.2461 Ops/s 51.7358 Ops/s $\color{#35bf28}+0.99\%$
test_vmap_transformer_speed_decorator[True-False] 19.3373ms 19.2150ms 52.0426 Ops/s 51.6924 Ops/s $\color{#35bf28}+0.68\%$
test_vmap_transformer_speed_decorator[False-True] 19.1593ms 19.0805ms 52.4096 Ops/s 52.0831 Ops/s $\color{#35bf28}+0.63\%$
test_vmap_transformer_speed_decorator[False-False] 19.1679ms 19.0853ms 52.3965 Ops/s 52.2161 Ops/s $\color{#35bf28}+0.35\%$
test_to_module_speed[True] 1.0717ms 0.9827ms 1.0176 KOps/s 1.0429 KOps/s $\color{#d91a1a}-2.42\%$
test_to_module_speed[False] 1.3528ms 0.9639ms 1.0375 KOps/s 1.0537 KOps/s $\color{#d91a1a}-1.54\%$
test_tc_init 63.0100μs 37.0954μs 26.9575 KOps/s 28.1417 KOps/s $\color{#d91a1a}-4.21\%$
test_tc_init_nested 0.1116ms 72.5524μs 13.7831 KOps/s 13.9964 KOps/s $\color{#d91a1a}-1.52\%$
test_tc_first_layer_tensor 5.6471μs 0.7026μs 1.4233 MOps/s 1.4415 MOps/s $\color{#d91a1a}-1.26\%$
test_tc_first_layer_nontensor 21.0500μs 2.2485μs 444.7352 KOps/s 449.8133 KOps/s $\color{#d91a1a}-1.13\%$
test_tc_second_layer_tensor 30.9927μs 1.4328μs 697.9476 KOps/s 705.8981 KOps/s $\color{#d91a1a}-1.13\%$
test_tc_second_layer_nontensor 38.6400μs 2.9789μs 335.6970 KOps/s 337.0006 KOps/s $\color{#d91a1a}-0.39\%$
test_unbind 0.2324s 10.1457ms 98.5643 Ops/s 145.0287 Ops/s $\textbf{\color{#d91a1a}-32.04\%}$
test_full_like 9.5783ms 9.0913ms 109.9953 Ops/s 109.7728 Ops/s $\color{#35bf28}+0.20\%$
test_zeros_like 4.9642ms 4.3162ms 231.6855 Ops/s 137.3891 Ops/s $\textbf{\color{#35bf28}+68.63\%}$
test_ones_like 4.4086ms 4.3160ms 231.6938 Ops/s 231.1312 Ops/s $\color{#35bf28}+0.24\%$
test_clone 6.4009ms 6.3020ms 158.6794 Ops/s 157.4017 Ops/s $\color{#35bf28}+0.81\%$
test_squeeze 58.7500μs 10.3570μs 96.5527 KOps/s 105.0259 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_unsqueeze 0.1212ms 71.5098μs 13.9841 KOps/s 13.7398 KOps/s $\color{#35bf28}+1.78\%$
test_split 0.3898ms 0.1639ms 6.1001 KOps/s 6.1645 KOps/s $\color{#d91a1a}-1.05\%$
test_permute 0.2449ms 0.1811ms 5.5222 KOps/s 5.4594 KOps/s $\color{#35bf28}+1.15\%$
test_stack 50.9052ms 50.3111ms 19.8763 Ops/s 19.8402 Ops/s $\color{#35bf28}+0.18\%$
test_cat 50.3740ms 50.1434ms 19.9428 Ops/s 23.7783 Ops/s $\textbf{\color{#d91a1a}-16.13\%}$

Development

Successfully merging this pull request may close these issues.

[BUG] pad_sequence with batch_size=1 does not encapsulate non-tensor data into an iterable