[BugFix] Use correct default cuda device #1161
Merged
vmoens added a commit that referenced this pull request on Jan 7, 2025
ghstack-source-id: 9afb5b03ddf75afec357e9e54caadfc92ebf4ded Pull Request resolved: #1161
facebook-github-bot added the CLA Signed label on Jan 7, 2025
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 51.1560μs | 22.5351μs | 44.3752 KOps/s | 47.5697 KOps/s | |
test_plain_set_stack_nested | 66.8650μs | 22.4793μs | 44.4854 KOps/s | 46.7984 KOps/s | |
test_plain_set_nested_inplace | 56.7560μs | 24.0769μs | 41.5336 KOps/s | 41.0582 KOps/s | |
test_plain_set_stack_nested_inplace | 79.6300μs | 24.1452μs | 41.4161 KOps/s | 42.8944 KOps/s | |
test_items | 25.1470μs | 4.1751μs | 239.5155 KOps/s | 237.6470 KOps/s | |
test_items_nested | 0.6204ms | 0.4024ms | 2.4849 KOps/s | 2.4560 KOps/s | |
test_items_nested_locked | 0.8290ms | 0.4041ms | 2.4748 KOps/s | 2.4364 KOps/s | |
test_items_nested_leaf | 0.1474ms | 77.3365μs | 12.9305 KOps/s | 12.8379 KOps/s | |
test_items_stack_nested | 0.8339ms | 0.4048ms | 2.4702 KOps/s | 2.4538 KOps/s | |
test_items_stack_nested_leaf | 0.1500ms | 78.8436μs | 12.6833 KOps/s | 12.5081 KOps/s | |
test_items_stack_nested_locked | 0.5260ms | 0.4059ms | 2.4636 KOps/s | 2.4313 KOps/s | |
test_keys | 23.7340μs | 3.4975μs | 285.9178 KOps/s | 279.5141 KOps/s | |
test_keys_nested | 0.2266ms | 0.1636ms | 6.1133 KOps/s | 6.0532 KOps/s | |
test_keys_nested_locked | 0.8332ms | 0.1710ms | 5.8479 KOps/s | 5.7794 KOps/s | |
test_keys_nested_leaf | 0.2238ms | 0.1431ms | 6.9858 KOps/s | 6.8072 KOps/s | |
test_keys_stack_nested | 0.3428ms | 0.1650ms | 6.0609 KOps/s | 5.9797 KOps/s | |
test_keys_stack_nested_leaf | 0.2068ms | 0.1425ms | 7.0188 KOps/s | 6.9653 KOps/s | |
test_keys_stack_nested_locked | 0.3122ms | 0.1708ms | 5.8545 KOps/s | 5.7856 KOps/s | |
test_values | 9.8466μs | 1.0625μs | 941.1791 KOps/s | 914.7289 KOps/s | |
test_values_nested | 0.1354ms | 62.3268μs | 16.0445 KOps/s | 15.8808 KOps/s | |
test_values_nested_locked | 0.1198ms | 62.1124μs | 16.0999 KOps/s | 15.8111 KOps/s | |
test_values_nested_leaf | 0.1583ms | 71.7873μs | 13.9300 KOps/s | 13.3383 KOps/s | |
test_values_stack_nested | 0.1215ms | 62.5333μs | 15.9915 KOps/s | 15.7583 KOps/s | |
test_values_stack_nested_leaf | 0.1268ms | 71.3636μs | 14.0128 KOps/s | 13.5710 KOps/s | |
test_values_stack_nested_locked | 0.1249ms | 63.4397μs | 15.7630 KOps/s | 15.6838 KOps/s | |
test_membership | 2.5688μs | 0.6980μs | 1.4327 MOps/s | 1.0864 MOps/s | |
test_membership_nested | 41.9290μs | 2.9709μs | 336.6016 KOps/s | 335.6417 KOps/s | |
test_membership_nested_leaf | 28.4840μs | 2.9429μs | 339.8024 KOps/s | 337.0828 KOps/s | |
test_membership_stacked_nested | 44.7940μs | 2.9201μs | 342.4536 KOps/s | 331.6355 KOps/s | |
test_membership_stacked_nested_leaf | 27.2510μs | 2.9032μs | 344.4430 KOps/s | 334.0766 KOps/s | |
test_membership_nested_last | 35.9270μs | 4.3820μs | 228.2047 KOps/s | 224.2816 KOps/s | |
test_membership_nested_leaf_last | 29.4050μs | 4.4280μs | 225.8373 KOps/s | 222.7880 KOps/s | |
test_membership_stacked_nested_last | 33.1320μs | 4.4470μs | 224.8683 KOps/s | 224.7291 KOps/s | |
test_membership_stacked_nested_leaf_last | 47.1780μs | 4.4900μs | 222.7186 KOps/s | 226.3257 KOps/s | |
test_nested_getleaf | 53.2200μs | 10.8259μs | 92.3707 KOps/s | 92.6302 KOps/s | |
test_nested_get | 36.9690μs | 10.2930μs | 97.1532 KOps/s | 99.4984 KOps/s | |
test_stacked_getleaf | 35.4460μs | 10.6700μs | 93.7207 KOps/s | 93.4567 KOps/s | |
test_stacked_get | 55.4740μs | 10.2958μs | 97.1273 KOps/s | 97.8267 KOps/s | |
test_nested_getitemleaf | 61.7690μs | 11.1832μs | 89.4201 KOps/s | 89.7020 KOps/s | |
test_nested_getitem | 31.8500μs | 10.4766μs | 95.4510 KOps/s | 96.2504 KOps/s | |
test_stacked_getitemleaf | 61.7850μs | 10.7860μs | 92.7124 KOps/s | 88.8494 KOps/s | |
test_stacked_getitem | 35.3260μs | 10.4255μs | 95.9189 KOps/s | 96.1860 KOps/s | |
test_lock_nested | 6.8719ms | 0.4735ms | 2.1121 KOps/s | 2.1847 KOps/s | |
test_lock_stack_nested | 0.7278ms | 0.4364ms | 2.2916 KOps/s | 2.3239 KOps/s | |
test_unlock_nested | 0.9233ms | 0.3875ms | 2.5806 KOps/s | 2.6492 KOps/s | |
test_unlock_stack_nested | 0.6580ms | 0.3554ms | 2.8135 KOps/s | 2.8729 KOps/s | |
test_flatten_speed | 0.1757ms | 0.1005ms | 9.9536 KOps/s | 9.9474 KOps/s | |
test_unflatten_speed | 0.6815ms | 0.5323ms | 1.8787 KOps/s | 1.8383 KOps/s | |
test_common_ops | 1.7740ms | 0.8290ms | 1.2063 KOps/s | 1.2919 KOps/s | |
test_creation | 19.7370μs | 2.4873μs | 402.0424 KOps/s | 394.1277 KOps/s | |
test_creation_empty | 35.8970μs | 13.2192μs | 75.6478 KOps/s | 91.8668 KOps/s | |
test_creation_nested_1 | 44.7940μs | 16.1109μs | 62.0697 KOps/s | 72.3997 KOps/s | |
test_creation_nested_2 | 74.1560μs | 20.5064μs | 48.7654 KOps/s | 54.8130 KOps/s | |
test_clone | 57.2670μs | 13.4182μs | 74.5254 KOps/s | 72.6537 KOps/s | |
test_getitem[int] | 1.2903ms | 13.2139μs | 75.6780 KOps/s | 77.5645 KOps/s | |
test_getitem[slice_int] | 0.1468ms | 25.1435μs | 39.7718 KOps/s | 41.4749 KOps/s | |
test_getitem[range] | 0.1781ms | 48.9485μs | 20.4296 KOps/s | 20.6459 KOps/s | |
test_getitem[tuple] | 0.1391ms | 20.8199μs | 48.0310 KOps/s | 49.1941 KOps/s | |
test_getitem[list] | 0.3046ms | 44.4635μs | 22.4904 KOps/s | 23.5438 KOps/s | |
test_setitem_dim[int] | 64.4510μs | 25.0852μs | 39.8641 KOps/s | 39.9649 KOps/s | |
test_setitem_dim[slice_int] | 93.0350μs | 51.6956μs | 19.3440 KOps/s | 19.9817 KOps/s | |
test_setitem_dim[range] | 0.1264ms | 74.5407μs | 13.4155 KOps/s | 13.7392 KOps/s | |
test_setitem_dim[tuple] | 88.9870μs | 40.6694μs | 24.5885 KOps/s | 25.4883 KOps/s | |
test_setitem | 0.1426ms | 21.7150μs | 46.0511 KOps/s | 49.3700 KOps/s | |
test_set | 0.1177ms | 21.3153μs | 46.9146 KOps/s | 50.3841 KOps/s | |
test_set_shared | 3.7082ms | 0.1734ms | 5.7678 KOps/s | 5.7862 KOps/s | |
test_update | 0.1763ms | 25.5489μs | 39.1406 KOps/s | 45.7166 KOps/s | |
test_update_nested | 0.1518ms | 35.6299μs | 28.0663 KOps/s | 30.9256 KOps/s | |
test_update__nested | 0.5394ms | 34.6944μs | 28.8231 KOps/s | 29.3627 KOps/s | |
test_set_nested | 81.9420μs | 23.6226μs | 42.3323 KOps/s | 45.6352 KOps/s | |
test_set_nested_new | 0.1195ms | 28.4904μs | 35.0995 KOps/s | 37.1322 KOps/s | |
test_select | 0.1443ms | 45.0277μs | 22.2086 KOps/s | 22.8965 KOps/s | |
test_select_nested | 0.1253ms | 62.6160μs | 15.9704 KOps/s | 15.3775 KOps/s | |
test_exclude_nested | 0.1485ms | 81.2071μs | 12.3142 KOps/s | 12.1456 KOps/s | |
test_empty[True] | 0.6916ms | 0.4083ms | 2.4491 KOps/s | 2.3996 KOps/s | |
test_empty[False] | 13.7358μs | 1.4261μs | 701.1886 KOps/s | 713.3399 KOps/s | |
test_unbind_speed | 0.4804ms | 0.2787ms | 3.5878 KOps/s | 3.6946 KOps/s | |
test_unbind_speed_stack0 | 0.4461ms | 0.2775ms | 3.6039 KOps/s | 3.7021 KOps/s | |
test_unbind_speed_stack1 | 0.1129s | 0.8364ms | 1.1957 KOps/s | 1.4921 KOps/s | |
test_split | 2.5217ms | 1.6137ms | 619.6866 Ops/s | 560.4651 Ops/s | |
test_chunk | 0.1126s | 1.9692ms | 507.8129 Ops/s | 561.5805 Ops/s | |
test_consolidate_njt[False-None] | 9.5909ms | 8.3621ms | 119.5873 Ops/s | 120.8374 Ops/s | |
test_creation[device0] | 0.2866ms | 91.4746μs | 10.9320 KOps/s | 10.7942 KOps/s | |
test_creation_from_tensor | 4.7156ms | 95.8346μs | 10.4346 KOps/s | 10.0323 KOps/s | |
test_add_one[memmap_tensor0] | 0.1503ms | 5.2123μs | 191.8542 KOps/s | 202.2188 KOps/s | |
test_contiguous[memmap_tensor0] | 12.4330μs | 0.5309μs | 1.8836 MOps/s | 1.8836 MOps/s | |
test_stack[memmap_tensor0] | 60.0620μs | 3.5750μs | 279.7236 KOps/s | 281.2384 KOps/s | |
test_memmaptd_index | 1.1016ms | 0.2450ms | 4.0824 KOps/s | 4.1706 KOps/s | |
test_memmaptd_index_astensor | 0.6109ms | 0.3337ms | 2.9964 KOps/s | 3.0596 KOps/s | |
test_memmaptd_index_op | 1.0577ms | 0.6416ms | 1.5587 KOps/s | 1.6816 KOps/s | |
test_serialize_model | 0.1267s | 0.1190s | 8.4027 Ops/s | 7.3436 Ops/s | |
test_serialize_model_pickle | 0.4438s | 0.3859s | 2.5915 Ops/s | 2.5160 Ops/s | |
test_serialize_weights | 0.1276s | 0.1173s | 8.5260 Ops/s | 8.5367 Ops/s | |
test_serialize_weights_returnearly | 0.1821s | 0.1630s | 6.1334 Ops/s | 6.3767 Ops/s | |
test_serialize_weights_pickle | 0.5383s | 0.4430s | 2.2575 Ops/s | 2.4767 Ops/s | |
test_serialize_weights_filesystem | 0.1524s | 0.1460s | 6.8482 Ops/s | 6.9318 Ops/s | |
test_serialize_model_filesystem | 0.1627s | 0.1521s | 6.5742 Ops/s | 6.3939 Ops/s | |
test_reshape_pytree | 57.9790μs | 26.7320μs | 37.4083 KOps/s | 37.3594 KOps/s | |
test_reshape_td | 83.9470μs | 33.7528μs | 29.6271 KOps/s | 30.3985 KOps/s | |
test_view_pytree | 70.3820μs | 26.9967μs | 37.0415 KOps/s | 37.3066 KOps/s | |
test_view_td | 95.9700μs | 38.0381μs | 26.2895 KOps/s | 25.6340 KOps/s | |
test_unbind_pytree | 92.3330μs | 29.7503μs | 33.6132 KOps/s | 33.3326 KOps/s | |
test_unbind_td | 0.3678ms | 41.2075μs | 24.2674 KOps/s | 25.5747 KOps/s | |
test_split_pytree | 61.7150μs | 29.7283μs | 33.6379 KOps/s | 34.1518 KOps/s | |
test_split_td | 0.6003ms | 45.4562μs | 21.9992 KOps/s | 22.3536 KOps/s | |
test_add_pytree | 88.6460μs | 35.6102μs | 28.0819 KOps/s | 28.0746 KOps/s | |
test_add_td | 0.1467ms | 60.9259μs | 16.4134 KOps/s | 18.0976 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1265ms | 63.2435μs | 15.8119 KOps/s | 15.7418 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4521ms | 0.1751ms | 5.7105 KOps/s | 5.9110 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1598ms | 46.1125μs | 21.6861 KOps/s | 22.0663 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2556ms | 0.1200ms | 8.3317 KOps/s | 8.4309 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 86.8830μs | 25.7203μs | 38.8799 KOps/s | 37.9865 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1292ms | 59.1181μs | 16.9153 KOps/s | 16.9534 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1658ms | 77.7741μs | 12.8577 KOps/s | 12.5977 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1422ms | 67.4701μs | 14.8214 KOps/s | 14.6598 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1900ms | 0.1064ms | 9.3986 KOps/s | 9.4933 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4609ms | 0.2220ms | 4.5047 KOps/s | 4.7070 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 99.0150μs | 45.3961μs | 22.0283 KOps/s | 22.1877 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5749ms | 68.2118μs | 14.6602 KOps/s | 15.7102 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1949ms | 0.1053ms | 9.4965 KOps/s | 9.8233 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4097ms | 0.2046ms | 4.8880 KOps/s | 4.9453 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4853ms | 0.2332ms | 4.2880 KOps/s | 4.3105 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2004ms | 0.1058ms | 9.4494 KOps/s | 9.4713 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1555ms | 60.7217μs | 16.4686 KOps/s | 17.1479 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1033ms | 49.0355μs | 20.3934 KOps/s | 21.0376 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2514ms | 0.1611ms | 6.2068 KOps/s | 6.3457 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1943ms | 0.1076ms | 9.2958 KOps/s | 9.5102 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 66.7760μs | 22.6185μs | 44.2116 KOps/s | 46.4033 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1462ms | 66.1071μs | 15.1270 KOps/s | 15.3560 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1748ms | 80.7990μs | 12.3764 KOps/s | 12.3707 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1566ms | 68.8963μs | 14.5146 KOps/s | 14.5927 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4221ms | 0.2158ms | 4.6341 KOps/s | 4.7858 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.4794ms | 1.3418ms | 745.2697 Ops/s | 762.9671 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3074ms | 0.2070ms | 4.8311 KOps/s | 4.9290 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0370ms | 0.7967ms | 1.2552 KOps/s | 1.2854 KOps/s | |
test_compile_assign_and_add_stack[compile] | 1.0400ms | 0.4693ms | 2.1307 KOps/s | 2.1852 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.9153ms | 2.9981ms | 333.5419 Ops/s | 379.7694 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 90.1890μs | 37.3387μs | 26.7819 KOps/s | 27.9863 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5922ms | 35.0066μs | 28.5661 KOps/s | 30.5827 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 98.7750μs | 30.3378μs | 32.9621 KOps/s | 34.7493 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1106ms | 23.4013μs | 42.7326 KOps/s | 40.2818 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 81.7230μs | 31.3010μs | 31.9479 KOps/s | 34.1422 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 62.9980μs | 23.4776μs | 42.5938 KOps/s | 43.3736 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1472ms | 53.8731μs | 18.5621 KOps/s | 19.9375 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5607ms | 20.0367μs | 49.9083 KOps/s | 50.2563 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1355ms | 45.5639μs | 21.9472 KOps/s | 23.2529 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 57.5480μs | 19.0411μs | 52.5178 KOps/s | 53.0936 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1036ms | 47.1113μs | 21.2263 KOps/s | 22.9501 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 67.9180μs | 18.8082μs | 53.1684 KOps/s | 53.1754 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1354ms | 55.0499μs | 18.1653 KOps/s | 19.6731 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0639ms | 20.1245μs | 49.6907 KOps/s | 50.3605 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1570ms | 46.4096μs | 21.5473 KOps/s | 23.0145 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 65.9240μs | 18.6845μs | 53.5202 KOps/s | 53.2914 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1522ms | 46.2260μs | 21.6328 KOps/s | 22.9859 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.5755ms | 18.8678μs | 53.0003 KOps/s | 53.0434 KOps/s | |
test_mod_add[eager] | 94.4570μs | 36.8455μs | 27.1404 KOps/s | 29.8416 KOps/s | |
test_mod_add[compile] | 99.8670μs | 49.8559μs | 20.0578 KOps/s | 20.8412 KOps/s | |
test_mod_add[compile-overhead] | 0.1357ms | 50.2008μs | 19.9200 KOps/s | 20.7435 KOps/s | |
test_mod_wrap[eager] | 0.4473ms | 0.2334ms | 4.2845 KOps/s | 4.4548 KOps/s | |
test_mod_wrap[compile] | 0.3487ms | 0.2114ms | 4.7304 KOps/s | 4.7903 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3884ms | 0.2098ms | 4.7665 KOps/s | 4.8886 KOps/s | |
test_mod_wrap_and_backward[eager] | 22.4251ms | 13.5377ms | 73.8677 Ops/s | 83.0659 Ops/s | |
test_mod_wrap_and_backward[compile] | 22.5372ms | 14.2634ms | 70.1097 Ops/s | 82.3573 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 15.3021ms | 12.6371ms | 79.1319 Ops/s | 72.1689 Ops/s | |
test_seq_add[eager] | 0.2914ms | 0.1201ms | 8.3268 KOps/s | 8.7889 KOps/s | |
test_seq_add[compile] | 0.1310ms | 64.9783μs | 15.3898 KOps/s | 16.0996 KOps/s | |
test_seq_add[compile-overhead] | 0.1404ms | 62.8732μs | 15.9050 KOps/s | 16.6964 KOps/s | |
test_seq_wrap[eager] | 0.6339ms | 0.4651ms | 2.1502 KOps/s | 2.2655 KOps/s | |
test_seq_wrap[compile] | 0.4797ms | 0.2369ms | 4.2215 KOps/s | 4.3921 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4058ms | 0.2362ms | 4.2345 KOps/s | 4.4192 KOps/s | |
test_func_call_runtime[False-eager] | 0.8034ms | 0.5599ms | 1.7859 KOps/s | 1.8637 KOps/s | |
test_func_call_runtime[False-compile] | 0.8874ms | 0.4474ms | 2.2350 KOps/s | 2.3437 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5714ms | 0.4363ms | 2.2920 KOps/s | 2.3395 KOps/s | |
test_func_call_runtime[True-eager] | 1.5906ms | 0.8049ms | 1.2424 KOps/s | 1.3138 KOps/s | |
test_func_call_runtime[True-compile] | 0.6362ms | 0.4814ms | 2.0775 KOps/s | 2.1385 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6165ms | 0.4825ms | 2.0725 KOps/s | 2.1404 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9684ms | 0.5686ms | 1.7587 KOps/s | 1.8389 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5519ms | 0.4386ms | 2.2799 KOps/s | 2.3421 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6015ms | 0.4352ms | 2.2977 KOps/s | 2.3479 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1695ms | 0.9301ms | 1.0752 KOps/s | 1.0866 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8407ms | 0.5093ms | 1.9635 KOps/s | 2.0281 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6378ms | 0.5050ms | 1.9802 KOps/s | 2.0373 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.3414ms | 2.0166ms | 495.8930 Ops/s | 516.2709 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9404ms | 0.5368ms | 1.8630 KOps/s | 1.8899 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6429ms | 0.5307ms | 1.8844 KOps/s | 1.9149 KOps/s | |
test_distributed | 0.3771ms | 0.1260ms | 7.9393 KOps/s | 7.8254 KOps/s | |
test_tdmodule | 44.9340μs | 26.9087μs | 37.1628 KOps/s | 38.3178 KOps/s | |
test_tdmodule_dispatch | 82.9250μs | 50.0307μs | 19.9877 KOps/s | 19.2552 KOps/s | |
test_tdseq | 51.7570μs | 30.3095μs | 32.9930 KOps/s | 35.3183 KOps/s | |
test_tdseq_dispatch | 81.3530μs | 56.2087μs | 17.7909 KOps/s | 19.0982 KOps/s | |
test_instantiation_functorch | 3.2451ms | 1.5565ms | 642.4553 Ops/s | 657.5907 Ops/s | |
test_exec_functorch | 0.3565ms | 0.1835ms | 5.4509 KOps/s | 5.5725 KOps/s | |
test_exec_functional_call | 0.2902ms | 0.1746ms | 5.7277 KOps/s | 5.9685 KOps/s | |
test_exec_td_decorator | 0.5495ms | 0.2381ms | 4.2002 KOps/s | 4.3441 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8649ms | 0.6620ms | 1.5106 KOps/s | 1.5120 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 5.8787ms | 0.6685ms | 1.4959 KOps/s | 1.4905 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.0137ms | 0.5410ms | 1.8483 KOps/s | 1.8684 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8861ms | 0.5354ms | 1.8678 KOps/s | 1.8738 KOps/s | |
test_to_module_speed[True] | 1.9430ms | 1.3529ms | 739.1685 Ops/s | 738.1826 Ops/s | |
test_to_module_speed[False] | 1.7953ms | 1.3340ms | 749.6432 Ops/s | 762.2430 Ops/s | |
test_tc_init | 87.7650μs | 47.8508μs | 20.8983 KOps/s | 20.2492 KOps/s | |
test_tc_init_nested | 0.1683ms | 94.2051μs | 10.6151 KOps/s | 10.4695 KOps/s | |
test_tc_first_layer_tensor | 13.1650μs | 1.5503μs | 645.0163 KOps/s | 620.3138 KOps/s | |
test_tc_first_layer_nontensor | 34.0840μs | 4.7412μs | 210.9162 KOps/s | 209.9560 KOps/s | |
test_tc_second_layer_tensor | 32.2310μs | 2.8773μs | 347.5485 KOps/s | 336.3533 KOps/s | |
test_tc_second_layer_nontensor | 37.9710μs | 6.0931μs | 164.1213 KOps/s | 163.2234 KOps/s | |
test_unbind | 0.2322s | 14.7087ms | 67.9871 Ops/s | 74.5443 Ops/s | |
test_full_like | 11.1449ms | 9.3367ms | 107.1043 Ops/s | 82.8222 Ops/s | |
test_zeros_like | 4.0458ms | 3.4781ms | 287.5105 Ops/s | 134.4528 Ops/s | |
test_ones_like | 4.3997ms | 3.8858ms | 257.3488 Ops/s | 119.6149 Ops/s | |
test_clone | 9.1585ms | 5.7859ms | 172.8333 Ops/s | 97.7328 Ops/s | |
test_squeeze | 70.8930μs | 12.3729μs | 80.8219 KOps/s | 80.6264 KOps/s | |
test_unsqueeze | 0.1967ms | 94.9608μs | 10.5307 KOps/s | 10.6672 KOps/s | |
test_split | 0.5240ms | 0.1979ms | 5.0521 KOps/s | 5.0933 KOps/s | |
test_permute | 0.4082ms | 0.2131ms | 4.6916 KOps/s | 4.7196 KOps/s | |
test_stack | 28.3458ms | 24.0717ms | 41.5426 Ops/s | 36.9314 Ops/s | |
test_cat | 29.1371ms | 23.6444ms | 42.2934 Ops/s | 38.0193 Ops/s |
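
The Change column is empty in the table above. As a reading aid, the following is a minimal sketch (not part of the PR or of the benchmark tooling) of how the columns relate to each other: Ops appears to be the reciprocal of Mean, and a relative change can be derived from Ops and Ops on Repo HEAD. The helper names `parse_time`, `parse_ops`, and `relative_change` are hypothetical, and the sign convention (positive means faster than HEAD) is an assumption.

```python
# Hypothetical helpers for reading the benchmark table; not taken from the PR.

# Seconds per time unit, and operations per second per throughput unit.
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "ns": 1e-9}
OPS_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_time(cell: str) -> float:
    """Convert a cell such as '22.5351μs' or '0.4024ms' to seconds."""
    # Normalise the micro sign (U+00B5) to the Greek letter mu (U+03BC).
    cell = cell.strip().replace("\u00b5", "\u03bc")
    # Check the two-character units before the bare 's' suffix.
    for unit in ("ms", "\u03bcs", "ns", "s"):
        if cell.endswith(unit):
            return float(cell[: -len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError(f"unrecognised time cell: {cell!r}")


def parse_ops(cell: str) -> float:
    """Convert a cell such as '44.3752 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * OPS_SCALE[unit]


def relative_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus HEAD (assumed: positive = faster)."""
    return (ops - ops_head) / ops_head


# First row of the table above: test_plain_set_nested.
mean = parse_time("22.5351μs")
ops = parse_ops("44.3752 KOps/s")
ops_head = parse_ops("47.5697 KOps/s")

print(f"1 / mean = {1 / mean:.1f} ops/s, reported = {ops:.1f} ops/s")
print(f"change vs HEAD = {relative_change(ops, ops_head):+.2%}")
```

Differences of a few percent between Ops and Ops on Repo HEAD are common run-to-run noise on shared CI machines, so a single row is weak evidence of a regression or a speed-up.
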

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.4500μs | 11.3558μs | 88.0608 KOps/s | 76.9000 KOps/s | |
test_plain_set_stack_nested | 34.7110μs | 11.4686μs | 87.1944 KOps/s | 74.7964 KOps/s | |
test_plain_set_nested_inplace | 44.2710μs | 12.5857μs | 79.4551 KOps/s | 69.4498 KOps/s | |
test_plain_set_stack_nested_inplace | 43.8610μs | 12.5094μs | 79.9396 KOps/s | 69.2494 KOps/s | |
test_items | 37.1300μs | 2.8507μs | 350.7877 KOps/s | 346.2051 KOps/s | |
test_items_nested | 0.4137ms | 0.3539ms | 2.8253 KOps/s | 2.7584 KOps/s | |
test_items_nested_locked | 0.3947ms | 0.3575ms | 2.7972 KOps/s | 2.7601 KOps/s | |
test_items_nested_leaf | 79.8820μs | 58.0874μs | 17.2155 KOps/s | 16.9634 KOps/s | |
test_items_stack_nested | 0.4154ms | 0.3580ms | 2.7929 KOps/s | 2.7742 KOps/s | |
test_items_stack_nested_leaf | 86.4620μs | 59.7525μs | 16.7357 KOps/s | 16.7515 KOps/s | |
test_items_stack_nested_locked | 0.4082ms | 0.3593ms | 2.7828 KOps/s | 2.7632 KOps/s | |
test_keys | 36.9210μs | 3.4440μs | 290.3596 KOps/s | 289.5189 KOps/s | |
test_keys_nested | 0.1148ms | 82.5243μs | 12.1176 KOps/s | 11.9565 KOps/s | |
test_keys_nested_locked | 0.7744ms | 88.0926μs | 11.3517 KOps/s | 11.1141 KOps/s | |
test_keys_nested_leaf | 97.7920μs | 72.8485μs | 13.7271 KOps/s | 13.5217 KOps/s | |
test_keys_stack_nested | 0.1165ms | 84.0121μs | 11.9031 KOps/s | 11.7067 KOps/s | |
test_keys_stack_nested_leaf | 0.1112ms | 75.1910μs | 13.2995 KOps/s | 13.2321 KOps/s | |
test_keys_stack_nested_locked | 0.1390ms | 90.3628μs | 11.0665 KOps/s | 10.9628 KOps/s | |
test_values | 5.5868μs | 0.8479μs | 1.1794 MOps/s | 1.1742 MOps/s | |
test_values_nested | 57.1220μs | 34.1095μs | 29.3174 KOps/s | 28.8130 KOps/s | |
test_values_nested_locked | 58.0110μs | 35.5759μs | 28.1089 KOps/s | 27.5937 KOps/s | |
test_values_nested_leaf | 64.2410μs | 38.5986μs | 25.9076 KOps/s | 25.6502 KOps/s | |
test_values_stack_nested | 65.0810μs | 34.7936μs | 28.7409 KOps/s | 28.2639 KOps/s | |
test_values_stack_nested_leaf | 71.8910μs | 39.3061μs | 25.4414 KOps/s | 25.1332 KOps/s | |
test_values_stack_nested_locked | 66.9110μs | 36.2481μs | 27.5876 KOps/s | 27.0235 KOps/s | |
test_membership | 1.8090μs | 0.5066μs | 1.9738 MOps/s | 1.9865 MOps/s | |
test_membership_nested | 22.0755μs | 2.0019μs | 499.5249 KOps/s | 496.2401 KOps/s | |
test_membership_nested_leaf | 17.2205μs | 2.0111μs | 497.2368 KOps/s | 501.4104 KOps/s | |
test_membership_stacked_nested | 29.7710μs | 2.1176μs | 472.2414 KOps/s | 473.1745 KOps/s | |
test_membership_stacked_nested_leaf | 33.7810μs | 2.0983μs | 476.5803 KOps/s | 470.5908 KOps/s | |
test_membership_nested_last | 27.7410μs | 3.1257μs | 319.9289 KOps/s | 319.1881 KOps/s | |
test_membership_nested_leaf_last | 44.3710μs | 3.0674μs | 326.0037 KOps/s | 315.3605 KOps/s | |
test_membership_stacked_nested_last | 42.9410μs | 4.2987μs | 232.6282 KOps/s | 264.2891 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.1700μs | 4.2660μs | 234.4135 KOps/s | 269.6416 KOps/s | |
test_nested_getleaf | 25.0010μs | 6.2616μs | 159.7034 KOps/s | 160.0593 KOps/s | |
test_nested_get | 32.7510μs | 5.8615μs | 170.6048 KOps/s | 170.8117 KOps/s | |
test_stacked_getleaf | 28.1610μs | 6.1612μs | 162.3054 KOps/s | 161.4050 KOps/s | |
test_stacked_get | 39.1310μs | 5.8676μs | 170.4286 KOps/s | 171.2666 KOps/s | |
test_nested_getitemleaf | 26.8400μs | 6.2606μs | 159.7280 KOps/s | 157.6464 KOps/s | |
test_nested_getitem | 29.4810μs | 5.9110μs | 169.1756 KOps/s | 165.8801 KOps/s | |
test_stacked_getitemleaf | 31.3200μs | 6.2431μs | 160.1773 KOps/s | 159.8281 KOps/s | |
test_stacked_getitem | 28.6000μs | 5.9203μs | 168.9106 KOps/s | 167.2234 KOps/s | |
test_lock_nested | 9.1786ms | 0.3887ms | 2.5724 KOps/s | 2.6062 KOps/s | |
test_lock_stack_nested | 0.4024ms | 0.3483ms | 2.8712 KOps/s | 2.7934 KOps/s | |
test_unlock_nested | 0.7284ms | 0.3209ms | 3.1159 KOps/s | 3.0679 KOps/s | |
test_unlock_stack_nested | 0.3453ms | 0.2864ms | 3.4912 KOps/s | 3.4089 KOps/s | |
test_flatten_speed | 0.1321ms | 75.8709μs | 13.1803 KOps/s | 13.1941 KOps/s | |
test_unflatten_speed | 0.3707ms | 0.3224ms | 3.1021 KOps/s | 3.0476 KOps/s | |
test_common_ops | 92.5933ms | 0.6343ms | 1.5765 KOps/s | 1.5312 KOps/s | |
test_creation | 21.5210μs | 1.7240μs | 580.0603 KOps/s | 574.1353 KOps/s | |
test_creation_empty | 35.5700μs | 6.4745μs | 154.4520 KOps/s | 102.5737 KOps/s | |
test_creation_nested_1 | 19.8300μs | 8.3774μs | 119.3695 KOps/s | 87.1599 KOps/s | |
test_creation_nested_2 | 0.1175ms | 10.8855μs | 91.8649 KOps/s | 70.1057 KOps/s | |
test_clone | 88.6610μs | 10.3506μs | 96.6132 KOps/s | 91.2443 KOps/s | |
test_getitem[int] | 0.9843ms | 10.7888μs | 92.6885 KOps/s | 85.7663 KOps/s | |
test_getitem[slice_int] | 0.1028ms | 20.9990μs | 47.6214 KOps/s | 45.0288 KOps/s | |
test_getitem[range] | 0.1229ms | 36.4124μs | 27.4632 KOps/s | 26.7345 KOps/s | |
test_getitem[tuple] | 0.1029ms | 18.3883μs | 54.3824 KOps/s | 50.5092 KOps/s | |
test_getitem[list] | 0.2052ms | 32.2017μs | 31.0543 KOps/s | 30.1793 KOps/s | |
test_setitem_dim[int] | 39.7110μs | 18.3593μs | 54.4683 KOps/s | 53.2331 KOps/s | |
test_setitem_dim[slice_int] | 70.7710μs | 37.8523μs | 26.4185 KOps/s | 26.3013 KOps/s | |
test_setitem_dim[range] | 83.5020μs | 50.4730μs | 19.8126 KOps/s | 19.1274 KOps/s | |
test_setitem_dim[tuple] | 63.3310μs | 30.5977μs | 32.6822 KOps/s | 30.7832 KOps/s | |
test_setitem | 70.1720μs | 13.4603μs | 74.2928 KOps/s | 62.1920 KOps/s | |
test_set | 96.9020μs | 13.0624μs | 76.5555 KOps/s | 63.7627 KOps/s | |
test_set_shared | 1.5588ms | 0.1510ms | 6.6208 KOps/s | 6.4339 KOps/s | |
test_update | 0.5644ms | 14.7856μs | 67.6334 KOps/s | 50.9841 KOps/s | |
test_update_nested | 1.1266ms | 20.2293μs | 49.4333 KOps/s | 39.4320 KOps/s | |
test_update__nested | 72.7120μs | 24.5746μs | 40.6924 KOps/s | 37.9223 KOps/s | |
test_set_nested | 73.7410μs | 14.4379μs | 69.2623 KOps/s | 59.7168 KOps/s | |
test_set_nested_new | 84.0610μs | 16.9422μs | 59.0242 KOps/s | 52.2162 KOps/s | |
test_select | 84.4020μs | 28.7850μs | 34.7403 KOps/s | 32.0038 KOps/s | |
test_select_nested | 68.8820μs | 45.0094μs | 22.2176 KOps/s | 22.1125 KOps/s | |
test_exclude_nested | 90.5120μs | 62.5920μs | 15.9765 KOps/s | 15.7258 KOps/s | |
test_empty[True] | 0.3242ms | 0.2915ms | 3.4309 KOps/s | 3.4327 KOps/s | |
test_empty[False] | 8.4431μs | 0.8286μs | 1.2069 MOps/s | 1.1827 MOps/s | |
test_to | 86.0020μs | 56.6783μs | 17.6434 KOps/s | 17.4669 KOps/s | |
test_to_nonblocking | 80.1520μs | 47.3804μs | 21.1058 KOps/s | 19.3275 KOps/s | |
test_unbind_speed | 1.3581ms | 0.2410ms | 4.1494 KOps/s | 4.0301 KOps/s | |
test_unbind_speed_stack0 | 0.2739ms | 0.2428ms | 4.1178 KOps/s | 4.0081 KOps/s | |
test_unbind_speed_stack1 | 92.3831ms | 0.6651ms | 1.5036 KOps/s | 1.4641 KOps/s | |
test_split | 93.2546ms | 1.5927ms | 627.8688 Ops/s | 593.2662 Ops/s | |
test_chunk | 95.5949ms | 1.6186ms | 617.8090 Ops/s | 596.3680 Ops/s | |
test_consolidate[False-None] | 95.8422ms | 2.9972ms | 333.6476 Ops/s | 318.8718 Ops/s | |
test_consolidate[default-None] | 1.9223ms | 1.6891ms | 592.0276 Ops/s | 565.0876 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8497ms | 1.7271ms | 579.0140 Ops/s | 548.8652 Ops/s | |
test_consolidate_njt[False-None] | 7.5035ms | 6.8714ms | 145.5303 Ops/s | 145.9922 Ops/s | |
test_to[False-False-None] | 2.1966ms | 1.7315ms | 577.5344 Ops/s | 554.3052 Ops/s | |
test_to[True-False-None] | 1.6149ms | 1.3701ms | 729.8822 Ops/s | 715.0407 Ops/s | |
test_to[within-False-None] | 4.4118ms | 4.1696ms | 239.8299 Ops/s | 227.2525 Ops/s | |
test_to[True-default-None] | 5.9816ms | 5.7196ms | 174.8374 Ops/s | 181.4840 Ops/s | |
test_to_njt[False-False-None] | 7.4593ms | 7.0543ms | 141.7576 Ops/s | 141.3078 Ops/s | |
test_to_njt[True-False-None] | 6.5128ms | 5.8143ms | 171.9911 Ops/s | 174.7113 Ops/s | |
test_to_njt[within-False-None] | 13.3855ms | 13.1587ms | 75.9951 Ops/s | 79.0094 Ops/s | |
test_creation[device0] | 0.5525ms | 85.7165μs | 11.6664 KOps/s | 12.4050 KOps/s | |
test_creation_from_tensor | 0.5386ms | 87.3424μs | 11.4492 KOps/s | 11.9036 KOps/s | |
test_add_one[memmap_tensor0] | 0.2387ms | 6.3427μs | 157.6623 KOps/s | 152.7964 KOps/s | |
test_contiguous[memmap_tensor0] | 2.1505μs | 0.3982μs | 2.5111 MOps/s | 2.4950 MOps/s | |
test_stack[memmap_tensor0] | 30.1710μs | 4.5478μs | 219.8881 KOps/s | 200.7368 KOps/s | |
test_memmaptd_index | 1.3982ms | 0.2506ms | 3.9912 KOps/s | 3.6932 KOps/s | |
test_memmaptd_index_astensor | 0.5691ms | 0.3090ms | 3.2368 KOps/s | 2.9564 KOps/s | |
test_memmaptd_index_op | 0.9965ms | 0.5464ms | 1.8300 KOps/s | 1.6002 KOps/s | |
test_serialize_model | 0.1318s | 0.1311s | 7.6279 Ops/s | 7.5810 Ops/s | |
test_serialize_model_pickle | 1.3462s | 1.2129s | 0.8245 Ops/s | 0.8260 Ops/s | |
test_serialize_weights | 0.4228s | 0.1725s | 5.7965 Ops/s | 7.6057 Ops/s | |
test_serialize_weights_returnearly | 0.3270s | 56.0382ms | 17.8450 Ops/s | 22.9116 Ops/s | |
test_serialize_weights_pickle | 1.3789s | 1.2266s | 0.8152 Ops/s | 0.8401 Ops/s | |
test_reshape_pytree | 68.4510μs | 23.3998μs | 42.7353 KOps/s | 43.4002 KOps/s | |
test_reshape_td | 78.3120μs | 31.4405μs | 31.8061 KOps/s | 35.4129 KOps/s | |
test_view_pytree | 67.3320μs | 23.8026μs | 42.0122 KOps/s | 44.2706 KOps/s | |
test_view_td | 76.7310μs | 34.8799μs | 28.6698 KOps/s | 31.3985 KOps/s | |
test_unbind_pytree | 58.1610μs | 29.7103μs | 33.6584 KOps/s | 34.4897 KOps/s | |
test_unbind_td | 0.7799ms | 38.9448μs | 25.6774 KOps/s | 26.4355 KOps/s | |
test_split_pytree | 62.7410μs | 32.8287μs | 30.4612 KOps/s | 31.5781 KOps/s | |
test_split_td | 0.9033ms | 42.0729μs | 23.7683 KOps/s | 23.6284 KOps/s | |
test_add_pytree | 73.6510μs | 34.5845μs | 28.9147 KOps/s | 29.1753 KOps/s | |
test_add_td | 85.5210μs | 45.0435μs | 22.2008 KOps/s | 19.7494 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1749ms | 0.1215ms | 8.2326 KOps/s | 7.4296 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2252ms | 0.1312ms | 7.6191 KOps/s | 7.1241 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1423ms | 97.4993μs | 10.2565 KOps/s | 9.6458 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.4158ms | 0.1493ms | 6.7000 KOps/s | 6.1575 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 52.5010μs | 23.6029μs | 42.3677 KOps/s | 44.5295 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 60.6610μs | 29.2167μs | 34.2270 KOps/s | 34.0333 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3153ms | 64.5449μs | 15.4931 KOps/s | 15.3326 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1192ms | 48.5559μs | 20.5948 KOps/s | 20.0716 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2150ms | 0.1468ms | 6.8142 KOps/s | 6.8384 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3096ms | 0.2146ms | 4.6600 KOps/s | 4.6512 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1614ms | 98.3378μs | 10.1690 KOps/s | 10.0243 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1095ms | 52.1866μs | 19.1620 KOps/s | 18.8257 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2019ms | 0.1365ms | 7.3253 KOps/s | 7.1256 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5292ms | 0.4750ms | 2.1052 KOps/s | 2.0027 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3649ms | 0.2588ms | 3.8635 KOps/s | 3.8750 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1843ms | 0.1438ms | 6.9520 KOps/s | 6.8862 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1451ms | 64.1698μs | 15.5836 KOps/s | 15.3764 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1376ms | 99.2934μs | 10.0712 KOps/s | 10.0285 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4551ms | 0.4072ms | 2.4555 KOps/s | 2.3480 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1861ms | 0.1394ms | 7.1743 KOps/s | 7.2401 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 57.0210μs | 19.2837μs | 51.8572 KOps/s | 55.0735 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 63.3810μs | 32.0126μs | 31.2377 KOps/s | 31.8030 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1449ms | 71.0364μs | 14.0773 KOps/s | 13.9710 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1134ms | 52.5160μs | 19.0418 KOps/s | 18.8360 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6182ms | 0.3894ms | 2.5681 KOps/s | 2.1823 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6378ms | 2.5687ms | 389.3041 Ops/s | 350.2480 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5921ms | 0.3808ms | 2.6262 KOps/s | 2.2414 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.8579ms | 2.6273ms | 380.6175 Ops/s | 366.1222 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1652ms | 0.1128ms | 8.8655 KOps/s | 8.3925 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5588ms | 82.6665μs | 12.0968 KOps/s | 12.3067 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1669ms | 0.1143ms | 8.7500 KOps/s | 9.3946 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1182ms | 71.9232μs | 13.9037 KOps/s | 14.3003 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1675ms | 0.1143ms | 8.7515 KOps/s | 8.9058 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1553ms | 68.3241μs | 14.6361 KOps/s | 14.5792 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1575ms | 0.1016ms | 9.8460 KOps/s | 9.7309 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1390ms | 17.6077μs | 56.7932 KOps/s | 49.3673 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1440ms | 97.2085μs | 10.2872 KOps/s | 10.1311 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 58.3810μs | 15.8815μs | 62.9665 KOps/s | 60.6739 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1541ms | 0.1008ms | 9.9246 KOps/s | 10.0258 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 49.1410μs | 16.0135μs | 62.4474 KOps/s | 59.3583 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1650ms | 0.1088ms | 9.1880 KOps/s | 9.6119 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5992ms | 18.5970μs | 53.7720 KOps/s | 53.9282 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1546ms | 0.1063ms | 9.4116 KOps/s | 9.9945 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 38.9810μs | 17.1412μs | 58.3390 KOps/s | 60.0756 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1512ms | 0.1042ms | 9.6005 KOps/s | 9.9726 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.3010μs | 17.5552μs | 56.9631 KOps/s | 60.1543 KOps/s | |
test_mod_add[eager] | 84.6320μs | 36.5466μs | 27.3624 KOps/s | 25.8516 KOps/s | |
test_mod_add[compile] | 0.1157ms | 81.3666μs | 12.2901 KOps/s | 12.0922 KOps/s | |
test_mod_add[compile-overhead] | 0.3328ms | 0.1690ms | 5.9174 KOps/s | 5.5289 KOps/s | |
test_mod_wrap[eager] | 0.3335ms | 0.2422ms | 4.1289 KOps/s | 3.9947 KOps/s | |
test_mod_wrap[compile] | 0.3519ms | 0.2832ms | 3.5308 KOps/s | 3.4635 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0198ms | 3.7440ms | 267.0957 Ops/s | 271.4366 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4322ms | 1.3278ms | 753.1200 Ops/s | 693.8119 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.6717ms | 1.2611ms | 792.9655 Ops/s | 714.8099 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3588ms | 0.9300ms | 1.0753 KOps/s | 960.8276 Ops/s | |
test_seq_add[eager] | 0.1804ms | 0.1174ms | 8.5182 KOps/s | 8.3718 KOps/s | |
test_seq_add[compile] | 0.2229ms | 92.6358μs | 10.7950 KOps/s | 11.0663 KOps/s | |
test_seq_add[compile-overhead] | 0.1718ms | 0.1306ms | 7.6588 KOps/s | 7.4920 KOps/s | |
test_seq_wrap[eager] | 0.5385ms | 0.4310ms | 2.3199 KOps/s | 2.3250 KOps/s | |
test_seq_wrap[compile] | 0.3975ms | 0.2990ms | 3.3448 KOps/s | 3.1797 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2922ms | 0.2255ms | 4.4339 KOps/s | 4.3925 KOps/s | |
test_func_call_runtime[False-eager] | 0.7593ms | 0.7145ms | 1.3996 KOps/s | 1.3106 KOps/s | |
test_func_call_runtime[False-compile] | 0.9726ms | 0.7413ms | 1.3489 KOps/s | 1.3161 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4660ms | 0.3674ms | 2.7220 KOps/s | 2.7340 KOps/s | |
test_func_call_runtime[True-eager] | 0.9273ms | 0.8734ms | 1.1450 KOps/s | 1.1084 KOps/s | |
test_func_call_runtime[True-compile] | 0.9921ms | 0.7647ms | 1.3077 KOps/s | 1.2806 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5036ms | 0.3872ms | 2.5827 KOps/s | 2.6006 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8479ms | 0.7698ms | 1.2990 KOps/s | 1.3873 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8064ms | 0.7449ms | 1.3424 KOps/s | 1.3113 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4137ms | 0.3658ms | 2.7339 KOps/s | 2.7357 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2142ms | 1.0326ms | 968.4758 Ops/s | 997.3958 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.0435ms | 0.7882ms | 1.2688 KOps/s | 1.2313 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4866ms | 0.4097ms | 2.4409 KOps/s | 2.4213 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5390ms | 2.0431ms | 489.4572 Ops/s | 489.5119 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8766ms | 0.7993ms | 1.2511 KOps/s | 1.2070 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4879ms | 0.4134ms | 2.4189 KOps/s | 2.4067 KOps/s | |
test_distributed | 4.2159ms | 0.1779ms | 5.6218 KOps/s | 8.3170 KOps/s | |
test_tdmodule | 48.1110μs | 18.6723μs | 53.5552 KOps/s | 47.5289 KOps/s | |
test_tdmodule_dispatch | 66.2310μs | 33.6452μs | 29.7219 KOps/s | 26.0775 KOps/s | |
test_tdseq | 48.9110μs | 19.8365μs | 50.4122 KOps/s | 44.3841 KOps/s | |
test_tdseq_dispatch | 66.7920μs | 36.8634μs | 27.1272 KOps/s | 23.9029 KOps/s | |
test_instantiation_functorch | 1.6662ms | 1.5607ms | 640.7190 Ops/s | 626.1298 Ops/s | |
test_exec_functorch | 0.2026ms | 0.1433ms | 6.9784 KOps/s | 7.0126 KOps/s | |
test_exec_functional_call | 0.1873ms | 0.1350ms | 7.4065 KOps/s | 7.4140 KOps/s | |
test_exec_td_decorator | 0.3791ms | 0.1842ms | 5.4282 KOps/s | 5.3907 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7819ms | 0.6724ms | 1.4872 KOps/s | 1.4184 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8033ms | 0.6761ms | 1.4792 KOps/s | 1.4087 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7103ms | 0.5999ms | 1.6669 KOps/s | 1.6449 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7376ms | 0.6014ms | 1.6627 KOps/s | 1.6289 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1489ms | 19.2757ms | 51.8788 Ops/s | 52.8742 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.0146ms | 19.2249ms | 52.0158 Ops/s | 52.8257 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.8753ms | 18.8729ms | 52.9860 Ops/s | 53.5567 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.5490ms | 18.7192ms | 53.4212 Ops/s | 53.2767 Ops/s | |
test_to_module_speed[True] | 1.0713ms | 0.9781ms | 1.0223 KOps/s | 989.3886 Ops/s | |
test_to_module_speed[False] | 1.1040ms | 0.9583ms | 1.0435 KOps/s | 1.0224 KOps/s | |
test_tc_init | 54.6310μs | 34.1426μs | 29.2889 KOps/s | 24.8283 KOps/s | |
test_tc_init_nested | 0.1028ms | 69.7310μs | 14.3408 KOps/s | 11.9598 KOps/s | |
test_tc_first_layer_tensor | 32.3100μs | 0.8328μs | 1.2008 MOps/s | 1.1817 MOps/s | |
test_tc_first_layer_nontensor | 23.6700μs | 2.2541μs | 443.6354 KOps/s | 437.5243 KOps/s | |
test_tc_second_layer_tensor | 22.5855μs | 1.4045μs | 711.9792 KOps/s | 644.6392 KOps/s | |
test_tc_second_layer_nontensor | 38.8210μs | 3.0115μs | 332.0595 KOps/s | 331.1266 KOps/s | |
test_unbind | 0.2195s | 11.6751ms | 85.6522 Ops/s | 139.7307 Ops/s | |
test_full_like | 9.4380ms | 9.1173ms | 109.6818 Ops/s | 108.7795 Ops/s | |
test_zeros_like | 5.3904ms | 4.3279ms | 231.0610 Ops/s | 234.0943 Ops/s | |
test_ones_like | 4.9204ms | 4.2151ms | 237.2421 Ops/s | 236.4597 Ops/s | |
test_clone | 6.5047ms | 6.3529ms | 157.4075 Ops/s | 109.5662 Ops/s | |
test_squeeze | 59.2510μs | 10.0455μs | 99.5475 KOps/s | 97.5498 KOps/s | |
test_unsqueeze | 0.1236ms | 71.7382μs | 13.9396 KOps/s | 12.8491 KOps/s | |
test_split | 0.3952ms | 0.1776ms | 5.6293 KOps/s | 5.7615 KOps/s | |
test_permute | 0.2421ms | 0.1851ms | 5.4023 KOps/s | 5.3762 KOps/s | |
test_stack | 50.6066ms | 50.3276ms | 19.8698 Ops/s | 19.7511 Ops/s | |
test_cat | 50.5404ms | 50.3180ms | 19.8736 Ops/s | 19.7697 Ops/s |
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):