[Feature] TensorDict.logsumexp #1162
Merged
facebook-github-bot added the CLA Signed label on Jan 7, 2025
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 50.4340μs | 20.4703μs | 48.8513 KOps/s | 49.3359 KOps/s | |
test_plain_set_stack_nested | 71.9850μs | 20.7550μs | 48.1811 KOps/s | 48.9888 KOps/s | |
test_plain_set_nested_inplace | 70.0810μs | 22.3978μs | 44.6473 KOps/s | 45.3499 KOps/s | |
test_plain_set_stack_nested_inplace | 62.7380μs | 22.3389μs | 44.7649 KOps/s | 45.3778 KOps/s | |
test_items | 41.9590μs | 4.2437μs | 235.6446 KOps/s | 233.6879 KOps/s | |
test_items_nested | 0.5891ms | 0.4059ms | 2.4637 KOps/s | 2.5035 KOps/s | |
test_items_nested_locked | 0.5983ms | 0.4044ms | 2.4726 KOps/s | 2.4820 KOps/s | |
test_items_nested_leaf | 0.1444ms | 75.6109μs | 13.2256 KOps/s | 13.0567 KOps/s | |
test_items_stack_nested | 0.5601ms | 0.4084ms | 2.4485 KOps/s | 2.3384 KOps/s | |
test_items_stack_nested_leaf | 0.1514ms | 79.9433μs | 12.5089 KOps/s | 12.9124 KOps/s | |
test_items_stack_nested_locked | 0.6408ms | 0.4082ms | 2.4499 KOps/s | 2.4772 KOps/s | |
test_keys | 35.6260μs | 3.6883μs | 271.1303 KOps/s | 286.7006 KOps/s | |
test_keys_nested | 0.2694ms | 0.1654ms | 6.0442 KOps/s | 6.1159 KOps/s | |
test_keys_nested_locked | 0.8289ms | 0.1712ms | 5.8422 KOps/s | 5.8866 KOps/s | |
test_keys_nested_leaf | 1.9405ms | 0.1434ms | 6.9749 KOps/s | 7.0659 KOps/s | |
test_keys_stack_nested | 0.2713ms | 0.1621ms | 6.1681 KOps/s | 6.1285 KOps/s | |
test_keys_stack_nested_leaf | 0.2706ms | 0.1411ms | 7.0857 KOps/s | 7.0084 KOps/s | |
test_keys_stack_nested_locked | 0.3015ms | 0.1680ms | 5.9532 KOps/s | 5.8985 KOps/s | |
test_values | 8.3776μs | 1.0678μs | 936.5387 KOps/s | 970.4972 KOps/s | |
test_values_nested | 0.1156ms | 63.2229μs | 15.8170 KOps/s | 16.1697 KOps/s | |
test_values_nested_locked | 0.1278ms | 63.2148μs | 15.8191 KOps/s | 15.7226 KOps/s | |
test_values_nested_leaf | 0.1296ms | 71.3069μs | 14.0239 KOps/s | 13.9983 KOps/s | |
test_values_stack_nested | 0.1093ms | 63.0715μs | 15.8550 KOps/s | 16.1473 KOps/s | |
test_values_stack_nested_leaf | 0.1370ms | 71.6848μs | 13.9500 KOps/s | 13.8904 KOps/s | |
test_values_stack_nested_locked | 0.1279ms | 63.0124μs | 15.8699 KOps/s | 16.0729 KOps/s | |
test_membership | 90.2490μs | 0.9374μs | 1.0667 MOps/s | 1.0763 MOps/s | |
test_membership_nested | 24.0950μs | 2.9700μs | 336.7004 KOps/s | 343.3502 KOps/s | |
test_membership_nested_leaf | 34.1940μs | 3.0246μs | 330.6174 KOps/s | 335.2337 KOps/s | |
test_membership_stacked_nested | 27.2910μs | 2.9329μs | 340.9614 KOps/s | 345.4431 KOps/s | |
test_membership_stacked_nested_leaf | 37.8210μs | 2.9459μs | 339.4570 KOps/s | 347.1989 KOps/s | |
test_membership_nested_last | 63.4290μs | 4.4898μs | 222.7277 KOps/s | 232.4412 KOps/s | |
test_membership_nested_leaf_last | 34.1140μs | 4.5082μs | 221.8201 KOps/s | 233.4658 KOps/s | |
test_membership_stacked_nested_last | 34.6050μs | 7.0489μs | 141.8656 KOps/s | 231.0250 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.9260μs | 7.0732μs | 141.3782 KOps/s | 230.8089 KOps/s | |
test_nested_getleaf | 0.1080ms | 11.0052μs | 90.8665 KOps/s | 91.9170 KOps/s | |
test_nested_get | 58.2590μs | 10.4240μs | 95.9326 KOps/s | 96.2426 KOps/s | |
test_stacked_getleaf | 39.7850μs | 10.9492μs | 91.3309 KOps/s | 92.1377 KOps/s | |
test_stacked_get | 35.8270μs | 10.4723μs | 95.4904 KOps/s | 96.5395 KOps/s | |
test_nested_getitemleaf | 45.3750μs | 11.5202μs | 86.8037 KOps/s | 90.1878 KOps/s | |
test_nested_getitem | 43.4510μs | 10.8116μs | 92.4931 KOps/s | 96.2714 KOps/s | |
test_stacked_getitemleaf | 67.8570μs | 11.3644μs | 87.9940 KOps/s | 90.6273 KOps/s | |
test_stacked_getitem | 52.8700μs | 10.8774μs | 91.9340 KOps/s | 95.2799 KOps/s | |
test_lock_nested | 1.9022ms | 0.4659ms | 2.1464 KOps/s | 2.1805 KOps/s | |
test_lock_stack_nested | 0.9231ms | 0.4322ms | 2.3138 KOps/s | 2.3233 KOps/s | |
test_unlock_nested | 0.9308ms | 0.3839ms | 2.6052 KOps/s | 2.6004 KOps/s | |
test_unlock_stack_nested | 0.4515ms | 0.3492ms | 2.8636 KOps/s | 2.8578 KOps/s | |
test_flatten_speed | 0.1775ms | 0.1005ms | 9.9529 KOps/s | 10.1095 KOps/s | |
test_unflatten_speed | 0.6379ms | 0.5319ms | 1.8801 KOps/s | 1.9262 KOps/s | |
test_common_ops | 4.0306ms | 0.8343ms | 1.1985 KOps/s | 1.2836 KOps/s | |
test_creation | 39.5050μs | 2.5197μs | 396.8779 KOps/s | 409.9017 KOps/s | |
test_creation_empty | 36.2070μs | 11.4103μs | 87.6399 KOps/s | 97.3942 KOps/s | |
test_creation_nested_1 | 1.3290ms | 14.1622μs | 70.6104 KOps/s | 75.0557 KOps/s | |
test_creation_nested_2 | 44.3940μs | 18.8613μs | 53.0187 KOps/s | 55.3232 KOps/s | |
test_clone | 0.3547ms | 13.9531μs | 71.6686 KOps/s | 72.0804 KOps/s | |
test_getitem[int] | 0.7651ms | 12.9733μs | 77.0815 KOps/s | 75.9413 KOps/s | |
test_getitem[slice_int] | 0.1641ms | 24.5352μs | 40.7577 KOps/s | 39.6311 KOps/s | |
test_getitem[range] | 0.2064ms | 48.6305μs | 20.5632 KOps/s | 20.3704 KOps/s | |
test_getitem[tuple] | 0.1350ms | 20.0197μs | 49.9507 KOps/s | 48.5602 KOps/s | |
test_getitem[list] | 0.9478ms | 44.9454μs | 22.2492 KOps/s | 22.3017 KOps/s | |
test_setitem_dim[int] | 51.3160μs | 26.9956μs | 37.0431 KOps/s | 37.0793 KOps/s | |
test_setitem_dim[slice_int] | 0.1061ms | 55.2237μs | 18.1082 KOps/s | 19.3797 KOps/s | |
test_setitem_dim[range] | 0.1311ms | 75.4050μs | 13.2617 KOps/s | 13.6884 KOps/s | |
test_setitem_dim[tuple] | 75.1410μs | 41.8945μs | 23.8695 KOps/s | 24.7841 KOps/s | |
test_setitem | 0.3708ms | 21.1655μs | 47.2467 KOps/s | 49.6354 KOps/s | |
test_set | 0.4471ms | 20.5694μs | 48.6158 KOps/s | 50.1295 KOps/s | |
test_set_shared | 1.4198ms | 0.1805ms | 5.5396 KOps/s | 5.8032 KOps/s | |
test_update | 0.3904ms | 23.6808μs | 42.2282 KOps/s | 45.5893 KOps/s | |
test_update_nested | 0.4018ms | 33.6478μs | 29.7196 KOps/s | 31.1285 KOps/s | |
test_update__nested | 0.5997ms | 33.8521μs | 29.5403 KOps/s | 29.8669 KOps/s | |
test_set_nested | 0.3881ms | 22.8092μs | 43.8420 KOps/s | 45.7666 KOps/s | |
test_set_nested_new | 0.3940ms | 27.4287μs | 36.4581 KOps/s | 37.6232 KOps/s | |
test_select | 0.3885ms | 44.6601μs | 22.3914 KOps/s | 23.5763 KOps/s | |
test_select_nested | 0.1249ms | 63.2436μs | 15.8119 KOps/s | 15.8704 KOps/s | |
test_exclude_nested | 0.1503ms | 82.1797μs | 12.1685 KOps/s | 12.2927 KOps/s | |
test_empty[True] | 0.6855ms | 0.4133ms | 2.4193 KOps/s | 2.4537 KOps/s | |
test_empty[False] | 11.5517μs | 1.4180μs | 705.2149 KOps/s | 724.9780 KOps/s | |
test_unbind_speed | 0.4113ms | 0.2707ms | 3.6942 KOps/s | 3.5704 KOps/s | |
test_unbind_speed_stack0 | 0.4734ms | 0.2655ms | 3.7662 KOps/s | 3.7134 KOps/s | |
test_unbind_speed_stack1 | 0.1199s | 0.8203ms | 1.2190 KOps/s | 1.3253 KOps/s | |
test_split | 0.1183s | 1.7918ms | 558.1097 Ops/s | 565.2210 Ops/s | |
test_chunk | 0.1195s | 1.8082ms | 553.0427 Ops/s | 560.0776 Ops/s | |
test_consolidate_njt[False-None] | 9.1403ms | 8.4921ms | 117.7564 Ops/s | 118.2509 Ops/s | |
test_creation[device0] | 0.2431ms | 93.3784μs | 10.7091 KOps/s | 10.7414 KOps/s | |
test_creation_from_tensor | 3.4649ms | 95.8525μs | 10.4327 KOps/s | 10.2171 KOps/s | |
test_add_one[memmap_tensor0] | 0.4983ms | 4.9439μs | 202.2687 KOps/s | 205.9986 KOps/s | |
test_contiguous[memmap_tensor0] | 13.8550μs | 0.5394μs | 1.8539 MOps/s | 1.9377 MOps/s | |
test_stack[memmap_tensor0] | 0.1726ms | 3.4805μs | 287.3183 KOps/s | 300.2551 KOps/s | |
test_memmaptd_index | 1.1262ms | 0.2416ms | 4.1383 KOps/s | 3.9818 KOps/s | |
test_memmaptd_index_astensor | 0.7206ms | 0.3332ms | 3.0008 KOps/s | 2.9628 KOps/s | |
test_memmaptd_index_op | 1.1251ms | 0.6152ms | 1.6254 KOps/s | 1.7065 KOps/s | |
test_serialize_model | 0.1359s | 0.1224s | 8.1678 Ops/s | 7.1137 Ops/s | |
test_serialize_model_pickle | 0.4585s | 0.4052s | 2.4680 Ops/s | 2.5573 Ops/s | |
test_serialize_weights | 0.1220s | 0.1168s | 8.5647 Ops/s | 8.4934 Ops/s | |
test_serialize_weights_returnearly | 0.1621s | 0.1583s | 6.3153 Ops/s | 6.3165 Ops/s | |
test_serialize_weights_pickle | 0.4725s | 0.4084s | 2.4483 Ops/s | 2.5275 Ops/s | |
test_serialize_weights_filesystem | 0.1537s | 0.1494s | 6.6956 Ops/s | 6.2451 Ops/s | |
test_serialize_model_filesystem | 0.1651s | 0.1564s | 6.3942 Ops/s | 6.5506 Ops/s | |
test_reshape_pytree | 65.9640μs | 25.9567μs | 38.5257 KOps/s | 37.1023 KOps/s | |
test_reshape_td | 73.8480μs | 33.0534μs | 30.2541 KOps/s | 29.5653 KOps/s | |
test_view_pytree | 0.1119ms | 26.4459μs | 37.8131 KOps/s | 36.5494 KOps/s | |
test_view_td | 0.1058ms | 39.7578μs | 25.1523 KOps/s | 23.7717 KOps/s | |
test_unbind_pytree | 65.8740μs | 29.2884μs | 34.1432 KOps/s | 33.0713 KOps/s | |
test_unbind_td | 0.3726ms | 39.7241μs | 25.1736 KOps/s | 24.4951 KOps/s | |
test_split_pytree | 72.5670μs | 29.3769μs | 34.0403 KOps/s | 33.7233 KOps/s | |
test_split_td | 0.5821ms | 44.9875μs | 22.2284 KOps/s | 22.2477 KOps/s | |
test_add_pytree | 0.1097ms | 34.8597μs | 28.6864 KOps/s | 27.9083 KOps/s | |
test_add_td | 0.1224ms | 57.5083μs | 17.3888 KOps/s | 17.7053 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1398ms | 62.2987μs | 16.0517 KOps/s | 15.7510 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4260ms | 0.1677ms | 5.9630 KOps/s | 5.7960 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1171ms | 45.4652μs | 21.9948 KOps/s | 21.3580 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2316ms | 0.1188ms | 8.4188 KOps/s | 8.2771 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 97.7210μs | 26.0365μs | 38.4076 KOps/s | 37.1172 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1095ms | 58.8363μs | 16.9963 KOps/s | 16.9170 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1958ms | 80.5254μs | 12.4184 KOps/s | 12.5442 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1322ms | 68.6821μs | 14.5598 KOps/s | 14.4788 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2124ms | 0.1041ms | 9.6063 KOps/s | 9.3012 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3912ms | 0.2111ms | 4.7378 KOps/s | 4.6495 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1230ms | 43.8671μs | 22.7961 KOps/s | 21.4819 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5664ms | 63.4410μs | 15.7627 KOps/s | 14.3758 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1860ms | 0.1033ms | 9.6792 KOps/s | 9.4717 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3336ms | 0.1983ms | 5.0422 KOps/s | 4.8421 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3630ms | 0.2282ms | 4.3830 KOps/s | 4.3037 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2295ms | 0.1058ms | 9.4502 KOps/s | 9.2731 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1458ms | 56.7768μs | 17.6128 KOps/s | 17.1374 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1404ms | 46.0671μs | 21.7075 KOps/s | 21.4310 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6607ms | 0.1606ms | 6.2266 KOps/s | 6.1915 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2020ms | 0.1029ms | 9.7196 KOps/s | 9.1673 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 77.0840μs | 21.1570μs | 47.2656 KOps/s | 46.6406 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1326ms | 65.8634μs | 15.1829 KOps/s | 14.9418 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1509ms | 81.6367μs | 12.2494 KOps/s | 11.9518 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1349ms | 68.7606μs | 14.5432 KOps/s | 14.3154 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4154ms | 0.2044ms | 4.8933 KOps/s | 4.5875 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.0685ms | 1.2980ms | 770.3903 Ops/s | 759.5388 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4026ms | 0.2031ms | 4.9227 KOps/s | 4.8005 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9660ms | 0.7694ms | 1.2997 KOps/s | 1.2603 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8046ms | 0.4525ms | 2.2097 KOps/s | 2.1247 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.9676ms | 2.6565ms | 376.4364 Ops/s | 372.7363 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1320ms | 36.6255μs | 27.3033 KOps/s | 27.1747 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6567ms | 33.4535μs | 29.8922 KOps/s | 28.3626 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 91.5510μs | 29.0925μs | 34.3732 KOps/s | 32.8393 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1088ms | 23.0196μs | 43.4412 KOps/s | 41.5721 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 87.1330μs | 28.6970μs | 34.8469 KOps/s | 32.0809 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 85.9280μs | 22.8917μs | 43.6840 KOps/s | 42.0259 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1336ms | 50.8754μs | 19.6559 KOps/s | 19.1742 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6796ms | 20.1305μs | 49.6759 KOps/s | 48.1774 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 96.9820μs | 43.7743μs | 22.8445 KOps/s | 22.4228 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 79.2680μs | 18.7443μs | 53.3496 KOps/s | 52.4578 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1229ms | 44.6937μs | 22.3745 KOps/s | 21.8632 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 2.2601ms | 18.8128μs | 53.1553 KOps/s | 52.4800 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1229ms | 52.3812μs | 19.0908 KOps/s | 18.6831 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0326ms | 20.0080μs | 49.9799 KOps/s | 48.4772 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1286ms | 44.8290μs | 22.3070 KOps/s | 21.8069 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 85.4400μs | 18.7280μs | 53.3960 KOps/s | 52.8610 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1294ms | 44.4885μs | 22.4777 KOps/s | 22.0389 KOps/s | |
test_compile_indexing[int-pytree-eager] | 54.1120μs | 18.9228μs | 52.8462 KOps/s | 52.7665 KOps/s | |
test_mod_add[eager] | 0.1076ms | 33.9195μs | 29.4816 KOps/s | 29.4878 KOps/s | |
test_mod_add[compile] | 0.1071ms | 46.9330μs | 21.3070 KOps/s | 20.7292 KOps/s | |
test_mod_add[compile-overhead] | 0.1232ms | 46.4324μs | 21.5367 KOps/s | 20.5773 KOps/s | |
test_mod_wrap[eager] | 0.3664ms | 0.2271ms | 4.4040 KOps/s | 4.3256 KOps/s | |
test_mod_wrap[compile] | 0.3778ms | 0.2099ms | 4.7646 KOps/s | 4.5985 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3312ms | 0.2052ms | 4.8729 KOps/s | 4.7202 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.7403ms | 11.2672ms | 88.7533 Ops/s | 79.7529 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.6312ms | 11.1992ms | 89.2920 Ops/s | 72.9906 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.4840ms | 10.9795ms | 91.0789 Ops/s | 73.3478 Ops/s | |
test_seq_add[eager] | 0.2578ms | 0.1168ms | 8.5621 KOps/s | 8.7431 KOps/s | |
test_seq_add[compile] | 0.1648ms | 62.3807μs | 16.0306 KOps/s | 15.6364 KOps/s | |
test_seq_add[compile-overhead] | 0.1781ms | 60.1429μs | 16.6271 KOps/s | 15.9882 KOps/s | |
test_seq_wrap[eager] | 0.6625ms | 0.4403ms | 2.2711 KOps/s | 2.1511 KOps/s | |
test_seq_wrap[compile] | 0.4621ms | 0.2336ms | 4.2804 KOps/s | 4.2052 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3551ms | 0.2274ms | 4.3966 KOps/s | 4.1933 KOps/s | |
test_func_call_runtime[False-eager] | 0.8229ms | 0.5497ms | 1.8191 KOps/s | 1.7570 KOps/s | |
test_func_call_runtime[False-compile] | 0.5572ms | 0.4311ms | 2.3195 KOps/s | 2.2871 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5378ms | 0.4300ms | 2.3255 KOps/s | 2.2661 KOps/s | |
test_func_call_runtime[True-eager] | 1.0708ms | 0.7636ms | 1.3096 KOps/s | 1.2791 KOps/s | |
test_func_call_runtime[True-compile] | 0.7853ms | 0.4686ms | 2.1340 KOps/s | 2.0899 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.7884ms | 0.4762ms | 2.0998 KOps/s | 2.0615 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0559ms | 0.5486ms | 1.8228 KOps/s | 1.7729 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5316ms | 0.4305ms | 2.3231 KOps/s | 2.2550 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5468ms | 0.4317ms | 2.3164 KOps/s | 2.2787 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1840ms | 0.9039ms | 1.1063 KOps/s | 1.0671 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.7390ms | 0.4979ms | 2.0083 KOps/s | 2.0006 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8383ms | 0.4993ms | 2.0026 KOps/s | 1.9985 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.0998ms | 1.9568ms | 511.0473 Ops/s | 511.7675 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9784ms | 0.5252ms | 1.9041 KOps/s | 1.8894 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.9154ms | 0.5281ms | 1.8935 KOps/s | 1.9009 KOps/s | |
test_distributed | 0.3863ms | 0.1260ms | 7.9358 KOps/s | 7.7266 KOps/s | |
test_tdmodule | 57.6080μs | 25.7706μs | 38.8039 KOps/s | 39.0288 KOps/s | |
test_tdmodule_dispatch | 0.1050ms | 47.3603μs | 21.1147 KOps/s | 21.6366 KOps/s | |
test_tdseq | 64.2500μs | 28.3856μs | 35.2291 KOps/s | 33.9091 KOps/s | |
test_tdseq_dispatch | 0.1041ms | 53.0676μs | 18.8439 KOps/s | 18.8984 KOps/s | |
test_instantiation_functorch | 2.2491ms | 1.5316ms | 652.9192 Ops/s | 651.2219 Ops/s | |
test_exec_functorch | 0.4045ms | 0.1833ms | 5.4552 KOps/s | 5.5549 KOps/s | |
test_exec_functional_call | 0.3188ms | 0.1760ms | 5.6807 KOps/s | 5.7040 KOps/s | |
test_exec_td_decorator | 0.5072ms | 0.2417ms | 4.1372 KOps/s | 4.2408 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2421ms | 0.6710ms | 1.4903 KOps/s | 1.4843 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1854ms | 0.6644ms | 1.5052 KOps/s | 1.4820 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7880ms | 0.5336ms | 1.8739 KOps/s | 1.8054 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7456ms | 0.5320ms | 1.8795 KOps/s | 1.8321 KOps/s | |
test_to_module_speed[True] | 2.1819ms | 1.3388ms | 746.9411 Ops/s | 734.7716 Ops/s | |
test_to_module_speed[False] | 1.7902ms | 1.3178ms | 758.8141 Ops/s | 749.6107 Ops/s | |
test_tc_init | 80.7410μs | 46.2578μs | 21.6180 KOps/s | 22.4629 KOps/s | |
test_tc_init_nested | 0.1705ms | 91.5840μs | 10.9189 KOps/s | 10.9079 KOps/s | |
test_tc_first_layer_tensor | 44.6530μs | 1.5848μs | 630.9985 KOps/s | 633.9302 KOps/s | |
test_tc_first_layer_nontensor | 38.6130μs | 4.7425μs | 210.8600 KOps/s | 209.2564 KOps/s | |
test_tc_second_layer_tensor | 48.5010μs | 2.9213μs | 342.3165 KOps/s | 346.0670 KOps/s | |
test_tc_second_layer_nontensor | 39.7050μs | 6.2125μs | 160.9649 KOps/s | 164.1140 KOps/s | |
test_unbind | 0.2471s | 14.3891ms | 69.4969 Ops/s | 71.2481 Ops/s | |
test_full_like | 18.0294ms | 14.8669ms | 67.2634 Ops/s | 65.8478 Ops/s | |
test_zeros_like | 15.5734ms | 7.9803ms | 125.3086 Ops/s | 122.6833 Ops/s | |
test_ones_like | 12.7969ms | 8.0294ms | 124.5422 Ops/s | 115.6956 Ops/s | |
test_clone | 13.7325ms | 9.8813ms | 101.2010 Ops/s | 95.1072 Ops/s | |
test_squeeze | 62.9180μs | 12.7914μs | 78.1773 KOps/s | 82.1939 KOps/s | |
test_unsqueeze | 0.3395ms | 95.7065μs | 10.4486 KOps/s | 10.7924 KOps/s | |
test_split | 0.3366ms | 0.1931ms | 5.1784 KOps/s | 5.0584 KOps/s | |
test_permute | 0.3493ms | 0.2146ms | 4.6604 KOps/s | 4.7719 KOps/s | |
test_stack | 35.7260ms | 28.1859ms | 35.4787 Ops/s | 36.8882 Ops/s | |
test_cat | 31.4363ms | 27.2144ms | 36.7452 Ops/s | 36.9260 Ops/s |
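
The Change column is empty in the table above. As a rough sketch only, the relative change could be derived from the Ops and Ops on Repo HEAD columns as shown below. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling. The formula (PR throughput minus HEAD throughput, divided by HEAD throughput) is an assumption about what the column is meant to show.

```python
# Hypothetical helpers for reading the table; not part of the benchmark tooling.
UNIT_FACTORS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '141.8656 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_FACTORS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Signed fractional change of the PR throughput relative to repo HEAD.

    Assumed definition: (ops - ops_head) / ops_head.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


# Values copied from the test_membership_stacked_nested_last row above.
change = relative_change("141.8656 KOps/s", "231.0250 KOps/s")
print(f"{change:+.1%}")
```

A negative value means the PR branch ran slower than repo HEAD for that test. Single-run differences of a few percent are commonly within benchmark noise.
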

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.7010μs | 11.1826μs | 89.4243 KOps/s | 74.5600 KOps/s | |
test_plain_set_stack_nested | 34.4510μs | 11.3817μs | 87.8606 KOps/s | 74.4123 KOps/s | |
test_plain_set_nested_inplace | 39.9820μs | 12.3112μs | 81.2270 KOps/s | 69.2793 KOps/s | |
test_plain_set_stack_nested_inplace | 38.3010μs | 12.3162μs | 81.1939 KOps/s | 69.7329 KOps/s | |
test_items | 31.3110μs | 2.8790μs | 347.3403 KOps/s | 342.7070 KOps/s | |
test_items_nested | 0.4050ms | 0.3517ms | 2.8437 KOps/s | 2.8021 KOps/s | |
test_items_nested_locked | 0.4303ms | 0.3551ms | 2.8160 KOps/s | 2.8027 KOps/s | |
test_items_nested_leaf | 87.7530μs | 58.0541μs | 17.2253 KOps/s | 17.1160 KOps/s | |
test_items_stack_nested | 0.4092ms | 0.3544ms | 2.8220 KOps/s | 2.8087 KOps/s | |
test_items_stack_nested_leaf | 93.0630μs | 57.8895μs | 17.2743 KOps/s | 17.0672 KOps/s | |
test_items_stack_nested_locked | 0.5241ms | 0.3539ms | 2.8256 KOps/s | 2.8141 KOps/s | |
test_keys | 39.8220μs | 3.4304μs | 291.5071 KOps/s | 291.1401 KOps/s | |
test_keys_nested | 0.1145ms | 80.9869μs | 12.3477 KOps/s | 12.4072 KOps/s | |
test_keys_nested_locked | 0.7588ms | 86.5035μs | 11.5602 KOps/s | 11.4967 KOps/s | |
test_keys_nested_leaf | 2.6453ms | 72.5452μs | 13.7845 KOps/s | 14.0064 KOps/s | |
test_keys_stack_nested | 0.1498ms | 81.2697μs | 12.3047 KOps/s | 12.2993 KOps/s | |
test_keys_stack_nested_leaf | 0.1074ms | 72.3857μs | 13.8149 KOps/s | 13.9152 KOps/s | |
test_keys_stack_nested_locked | 0.1557ms | 87.0401μs | 11.4890 KOps/s | 11.3864 KOps/s | |
test_values | 5.7802μs | 0.8529μs | 1.1725 MOps/s | 1.1827 MOps/s | |
test_values_nested | 69.8030μs | 34.3595μs | 29.1041 KOps/s | 29.2888 KOps/s | |
test_values_nested_locked | 73.9820μs | 36.2205μs | 27.6086 KOps/s | 27.9115 KOps/s | |
test_values_nested_leaf | 91.2640μs | 38.5109μs | 25.9667 KOps/s | 25.6846 KOps/s | |
test_values_stack_nested | 67.8230μs | 34.0764μs | 29.3459 KOps/s | 28.9468 KOps/s | |
test_values_stack_nested_leaf | 75.6520μs | 38.6954μs | 25.8429 KOps/s | 25.3450 KOps/s | |
test_values_stack_nested_locked | 68.3020μs | 35.7107μs | 28.0028 KOps/s | 27.8549 KOps/s | |
test_membership | 1.9050μs | 0.5365μs | 1.8641 MOps/s | 1.8813 MOps/s | |
test_membership_nested | 15.8005μs | 1.9657μs | 508.7137 KOps/s | 492.0330 KOps/s | |
test_membership_nested_leaf | 13.8355μs | 1.9329μs | 517.3530 KOps/s | 509.4365 KOps/s | |
test_membership_stacked_nested | 26.5710μs | 2.0250μs | 493.8366 KOps/s | 485.9877 KOps/s | |
test_membership_stacked_nested_leaf | 25.5710μs | 2.0177μs | 495.6146 KOps/s | 487.4319 KOps/s | |
test_membership_nested_last | 37.8510μs | 3.0524μs | 327.6153 KOps/s | 338.4295 KOps/s | |
test_membership_nested_leaf_last | 25.8010μs | 3.0225μs | 330.8509 KOps/s | 330.2875 KOps/s | |
test_membership_stacked_nested_last | 35.4720μs | 3.0187μs | 331.2659 KOps/s | 327.3293 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.8810μs | 3.0532μs | 327.5245 KOps/s | 325.8161 KOps/s | |
test_nested_getleaf | 31.5910μs | 6.1891μs | 161.5746 KOps/s | 163.6444 KOps/s | |
test_nested_get | 31.4520μs | 5.7901μs | 172.7077 KOps/s | 172.5397 KOps/s | |
test_stacked_getleaf | 29.8710μs | 6.1744μs | 161.9603 KOps/s | 161.7827 KOps/s | |
test_stacked_get | 30.2010μs | 5.8194μs | 171.8389 KOps/s | 172.6538 KOps/s | |
test_nested_getitemleaf | 30.6810μs | 6.2491μs | 160.0230 KOps/s | 160.8836 KOps/s | |
test_nested_getitem | 29.8210μs | 5.9463μs | 168.1707 KOps/s | 169.9736 KOps/s | |
test_stacked_getitemleaf | 37.0010μs | 6.2359μs | 160.3611 KOps/s | 160.5497 KOps/s | |
test_stacked_getitem | 31.4820μs | 5.9519μs | 168.0125 KOps/s | 170.3483 KOps/s | |
test_lock_nested | 2.3440ms | 0.3686ms | 2.7129 KOps/s | 2.6286 KOps/s | |
test_lock_stack_nested | 0.3902ms | 0.3385ms | 2.9543 KOps/s | 2.9384 KOps/s | |
test_unlock_nested | 0.6362ms | 0.3056ms | 3.2728 KOps/s | 3.2349 KOps/s | |
test_unlock_stack_nested | 0.3395ms | 0.2789ms | 3.5859 KOps/s | 3.5890 KOps/s | |
test_flatten_speed | 0.1170ms | 73.8946μs | 13.5328 KOps/s | 13.5342 KOps/s | |
test_unflatten_speed | 0.3898ms | 0.3143ms | 3.1813 KOps/s | 3.1380 KOps/s | |
test_common_ops | 1.5298ms | 0.5662ms | 1.7660 KOps/s | 1.5782 KOps/s | |
test_creation | 0.1003ms | 1.7115μs | 584.2768 KOps/s | 583.7447 KOps/s | |
test_creation_empty | 33.2310μs | 6.2092μs | 161.0509 KOps/s | 95.9365 KOps/s | |
test_creation_nested_1 | 82.5430μs | 7.7759μs | 128.6030 KOps/s | 83.6483 KOps/s | |
test_creation_nested_2 | 32.4920μs | 10.4917μs | 95.3131 KOps/s | 67.7429 KOps/s | |
test_clone | 33.7310μs | 10.7975μs | 92.6141 KOps/s | 93.9472 KOps/s | |
test_getitem[int] | 1.9934ms | 10.4663μs | 95.5446 KOps/s | 95.2679 KOps/s | |
test_getitem[slice_int] | 0.1069ms | 20.6486μs | 48.4294 KOps/s | 49.1041 KOps/s | |
test_getitem[range] | 0.1323ms | 35.5880μs | 28.0994 KOps/s | 27.9438 KOps/s | |
test_getitem[tuple] | 0.1159ms | 17.8381μs | 56.0599 KOps/s | 56.3356 KOps/s | |
test_getitem[list] | 0.1306ms | 32.1403μs | 31.1136 KOps/s | 30.5611 KOps/s | |
test_setitem_dim[int] | 36.6320μs | 18.2385μs | 54.8289 KOps/s | 53.1392 KOps/s | |
test_setitem_dim[slice_int] | 68.3920μs | 37.9782μs | 26.3309 KOps/s | 26.1562 KOps/s | |
test_setitem_dim[range] | 73.8320μs | 51.0880μs | 19.5741 KOps/s | 19.4625 KOps/s | |
test_setitem_dim[tuple] | 53.0020μs | 30.9979μs | 32.2603 KOps/s | 31.4887 KOps/s | |
test_setitem | 50.0420μs | 14.2077μs | 70.3845 KOps/s | 62.3595 KOps/s | |
test_set | 46.0010μs | 13.4858μs | 74.1521 KOps/s | 63.7782 KOps/s | |
test_set_shared | 1.6336ms | 0.1504ms | 6.6509 KOps/s | 6.6563 KOps/s | |
test_update | 0.3329ms | 15.6165μs | 64.0347 KOps/s | 50.9493 KOps/s | |
test_update_nested | 0.1308ms | 20.7009μs | 48.3072 KOps/s | 39.0852 KOps/s | |
test_update__nested | 0.5866ms | 25.0753μs | 39.8799 KOps/s | 40.6615 KOps/s | |
test_set_nested | 0.1222ms | 15.0166μs | 66.5931 KOps/s | 59.3573 KOps/s | |
test_set_nested_new | 0.1353ms | 17.3399μs | 57.6703 KOps/s | 52.3297 KOps/s | |
test_select | 59.0120μs | 29.5560μs | 33.8341 KOps/s | 32.9088 KOps/s | |
test_select_nested | 75.9630μs | 43.3134μs | 23.0875 KOps/s | 22.9035 KOps/s | |
test_exclude_nested | 98.8540μs | 61.5568μs | 16.2452 KOps/s | 15.8880 KOps/s | |
test_empty[True] | 0.3559ms | 0.2841ms | 3.5204 KOps/s | 3.4973 KOps/s | |
test_empty[False] | 4.4312μs | 0.8354μs | 1.1970 MOps/s | 1.1827 MOps/s | |
test_to | 86.0830μs | 58.4019μs | 17.1227 KOps/s | 18.0035 KOps/s | |
test_to_nonblocking | 94.8240μs | 46.4267μs | 21.5394 KOps/s | 21.1403 KOps/s | |
test_unbind_speed | 1.5072ms | 0.2317ms | 4.3166 KOps/s | 4.2537 KOps/s | |
test_unbind_speed_stack0 | 0.2771ms | 0.2290ms | 4.3671 KOps/s | 4.2382 KOps/s | |
test_unbind_speed_stack1 | 92.6569ms | 0.6656ms | 1.5023 KOps/s | 1.4762 KOps/s | |
test_split | 94.2614ms | 1.7219ms | 580.7513 Ops/s | 581.8324 Ops/s | |
test_chunk | 95.9325ms | 1.5889ms | 629.3794 Ops/s | 693.5531 Ops/s | |
test_consolidate[False-None] | 2.8052ms | 2.6543ms | 376.7537 Ops/s | 374.1646 Ops/s | |
test_consolidate[default-None] | 1.7215ms | 1.6386ms | 610.2632 Ops/s | 610.4653 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7668ms | 1.6926ms | 590.7919 Ops/s | 597.5385 Ops/s | |
test_consolidate_njt[False-None] | 6.6304ms | 6.4047ms | 156.1358 Ops/s | 158.2437 Ops/s | |
test_to[False-False-None] | 1.8453ms | 1.7685ms | 565.4575 Ops/s | 574.5291 Ops/s | |
test_to[True-False-None] | 1.5164ms | 1.2951ms | 772.1285 Ops/s | 784.4977 Ops/s | |
test_to[within-False-None] | 4.3522ms | 4.0257ms | 248.4029 Ops/s | 245.7512 Ops/s | |
test_to[True-default-None] | 5.4757ms | 5.2565ms | 190.2392 Ops/s | 192.0730 Ops/s | |
test_to_njt[False-False-None] | 7.0400ms | 6.8207ms | 146.6131 Ops/s | 145.3592 Ops/s | |
test_to_njt[True-False-None] | 5.6598ms | 5.4945ms | 182.0007 Ops/s | 185.1377 Ops/s | |
test_to_njt[within-False-None] | 12.2565ms | 12.0042ms | 83.3040 Ops/s | 84.1532 Ops/s | |
test_creation[device0] | 0.4572ms | 79.9583μs | 12.5065 KOps/s | 12.6041 KOps/s | |
test_creation_from_tensor | 0.5317ms | 83.3568μs | 11.9966 KOps/s | 12.1425 KOps/s | |
test_add_one[memmap_tensor0] | 0.3927ms | 6.9646μs | 143.5825 KOps/s | 147.8943 KOps/s | |
test_contiguous[memmap_tensor0] | 1.8321μs | 0.4083μs | 2.4494 MOps/s | 2.4281 MOps/s | |
test_stack[memmap_tensor0] | 43.1810μs | 4.3410μs | 230.3610 KOps/s | 234.4584 KOps/s | |
test_memmaptd_index | 1.5756ms | 0.2459ms | 4.0670 KOps/s | 4.0748 KOps/s | |
test_memmaptd_index_astensor | 0.6015ms | 0.3069ms | 3.2589 KOps/s | 3.2845 KOps/s | |
test_memmaptd_index_op | 1.0354ms | 0.5523ms | 1.8107 KOps/s | 1.6524 KOps/s | |
test_serialize_model | 0.1319s | 0.1307s | 7.6523 Ops/s | 7.6656 Ops/s | |
test_serialize_model_pickle | 1.3491s | 1.2114s | 0.8255 Ops/s | 0.8243 Ops/s | |
test_serialize_weights | 0.1304s | 0.1298s | 7.7018 Ops/s | 5.4363 Ops/s | |
test_serialize_weights_returnearly | 0.3271s | 55.2289ms | 18.1065 Ops/s | 23.8738 Ops/s | |
test_serialize_weights_pickle | 1.3690s | 1.2147s | 0.8233 Ops/s | 0.8374 Ops/s | |
test_reshape_pytree | 52.7520μs | 21.6759μs | 46.1342 KOps/s | 45.1361 KOps/s | |
test_reshape_td | 56.1820μs | 25.8947μs | 38.6180 KOps/s | 36.5800 KOps/s | |
test_view_pytree | 54.4320μs | 21.6531μs | 46.1828 KOps/s | 46.1698 KOps/s | |
test_view_td | 63.2620μs | 28.9946μs | 34.4892 KOps/s | 32.3249 KOps/s | |
test_unbind_pytree | 59.3220μs | 27.6500μs | 36.1664 KOps/s | 36.3687 KOps/s | |
test_unbind_td | 0.7705ms | 36.2153μs | 27.6126 KOps/s | 28.4786 KOps/s | |
test_split_pytree | 63.7020μs | 29.8625μs | 33.4868 KOps/s | 34.0048 KOps/s | |
test_split_td | 0.9908ms | 37.8714μs | 26.4051 KOps/s | 26.6327 KOps/s | |
test_add_pytree | 77.5620μs | 34.3422μs | 29.1187 KOps/s | 29.8698 KOps/s | |
test_add_td | 0.1920ms | 44.8118μs | 22.3156 KOps/s | 20.1993 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1741ms | 0.1191ms | 8.3986 KOps/s | 8.2696 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2225ms | 0.1289ms | 7.7583 KOps/s | 7.8420 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1882ms | 96.8632μs | 10.3238 KOps/s | 10.6706 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2072ms | 0.1512ms | 6.6128 KOps/s | 6.6108 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 54.1610μs | 22.2418μs | 44.9603 KOps/s | 43.6363 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 56.7820μs | 28.6312μs | 34.9270 KOps/s | 34.0963 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2207ms | 64.4382μs | 15.5187 KOps/s | 15.4522 KOps/s | |
test_compile_copy_nested[pytree-eager] | 85.0030μs | 49.1937μs | 20.3278 KOps/s | 20.3331 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1812ms | 0.1409ms | 7.0958 KOps/s | 7.0778 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3014ms | 0.2121ms | 4.7150 KOps/s | 4.7333 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2207ms | 99.3796μs | 10.0624 KOps/s | 10.3871 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1577ms | 56.8304μs | 17.5962 KOps/s | 19.3316 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1782ms | 0.1347ms | 7.4259 KOps/s | 7.3976 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5863ms | 0.4939ms | 2.0246 KOps/s | 2.0623 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3698ms | 0.2557ms | 3.9111 KOps/s | 3.8854 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1925ms | 0.1425ms | 7.0175 KOps/s | 7.0685 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1412ms | 65.0329μs | 15.3768 KOps/s | 15.8480 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1505ms | 97.9901μs | 10.2051 KOps/s | 10.4407 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4677ms | 0.4203ms | 2.3793 KOps/s | 2.4545 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1735ms | 0.1345ms | 7.4325 KOps/s | 7.3305 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 60.1030μs | 19.1064μs | 52.3385 KOps/s | 55.1272 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 74.1630μs | 31.1171μs | 32.1367 KOps/s | 31.8551 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1829ms | 70.7968μs | 14.1249 KOps/s | 14.5590 KOps/s | |
test_compile_copy_flat[pytree-eager] | 86.1030μs | 51.8485μs | 19.2870 KOps/s | 19.7155 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6089ms | 0.3874ms | 2.5816 KOps/s | 2.1419 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8407ms | 2.6319ms | 379.9570 Ops/s | 380.4529 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5685ms | 0.3756ms | 2.6622 KOps/s | 2.2558 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1570ms | 2.6933ms | 371.2954 Ops/s | 370.2126 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6352ms | 0.1157ms | 8.6452 KOps/s | 8.8079 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5720ms | 79.7242μs | 12.5432 KOps/s | 11.9920 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5608ms | 0.1074ms | 9.3153 KOps/s | 9.5119 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.4954ms | 68.5071μs | 14.5970 KOps/s | 14.3774 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.5540ms | 0.1088ms | 9.1871 KOps/s | 9.4391 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.5008ms | 68.8363μs | 14.5272 KOps/s | 14.5160 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1678ms | 0.1023ms | 9.7769 KOps/s | 10.1436 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4451ms | 16.4343μs | 60.8483 KOps/s | 58.7826 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.5261ms | 98.7430μs | 10.1273 KOps/s | 10.3157 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.4312ms | 15.6806μs | 63.7732 KOps/s | 65.0979 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5152ms | 98.2987μs | 10.1731 KOps/s | 10.4326 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 51.6320μs | 15.2441μs | 65.5993 KOps/s | 64.4069 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5144ms | 0.1035ms | 9.6598 KOps/s | 10.0376 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5931ms | 16.2417μs | 61.5698 KOps/s | 60.1354 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5194ms | 98.3690μs | 10.1658 KOps/s | 10.5240 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4165ms | 15.4372μs | 64.7784 KOps/s | 64.3365 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5028ms | 96.8096μs | 10.3295 KOps/s | 10.5176 KOps/s | |
test_compile_indexing[int-pytree-eager] | 46.6620μs | 15.5614μs | 64.2614 KOps/s | 65.0505 KOps/s | |
test_mod_add[eager] | 83.1030μs | 36.7518μs | 27.2095 KOps/s | 25.7176 KOps/s | |
test_mod_add[compile] | 0.2975ms | 77.3196μs | 12.9333 KOps/s | 12.4937 KOps/s | |
test_mod_add[compile-overhead] | 0.3303ms | 0.1669ms | 5.9932 KOps/s | 5.7870 KOps/s | |
test_mod_wrap[eager] | 0.3298ms | 0.2538ms | 3.9400 KOps/s | 3.9625 KOps/s | |
test_mod_wrap[compile] | 0.4726ms | 0.2784ms | 3.5913 KOps/s | 3.5537 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0615ms | 3.7643ms | 265.6532 Ops/s | 250.2545 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4833ms | 1.3637ms | 733.3172 Ops/s | 686.0149 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3215ms | 1.2512ms | 799.2266 Ops/s | 730.8608 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3897ms | 0.9228ms | 1.0836 KOps/s | 968.5119 Ops/s | |
test_seq_add[eager] | 0.1628ms | 0.1131ms | 8.8427 KOps/s | 8.4423 KOps/s | |
test_seq_add[compile] | 0.1375ms | 90.9918μs | 10.9900 KOps/s | 11.4806 KOps/s | |
test_seq_add[compile-overhead] | 0.1679ms | 0.1317ms | 7.5921 KOps/s | 7.8908 KOps/s | |
test_seq_wrap[eager] | 0.4868ms | 0.4224ms | 2.3675 KOps/s | 2.3620 KOps/s | |
test_seq_wrap[compile] | 0.3628ms | 0.3094ms | 3.2320 KOps/s | 3.3443 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2736ms | 0.2216ms | 4.5125 KOps/s | 4.3243 KOps/s | |
test_func_call_runtime[False-eager] | 0.8045ms | 0.7315ms | 1.3670 KOps/s | 1.2790 KOps/s | |
test_func_call_runtime[False-compile] | 0.7858ms | 0.7375ms | 1.3559 KOps/s | 1.3534 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4176ms | 0.3572ms | 2.7993 KOps/s | 2.7789 KOps/s | |
test_func_call_runtime[True-eager] | 0.9695ms | 0.8973ms | 1.1145 KOps/s | 1.0994 KOps/s | |
test_func_call_runtime[True-compile] | 0.8338ms | 0.7580ms | 1.3192 KOps/s | 1.3195 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4362ms | 0.3803ms | 2.6296 KOps/s | 2.6600 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8704ms | 0.7689ms | 1.3005 KOps/s | 1.3607 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8278ms | 0.7406ms | 1.3502 KOps/s | 1.3100 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4097ms | 0.3630ms | 2.7551 KOps/s | 2.7550 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1095ms | 0.9982ms | 1.0018 KOps/s | 989.3443 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8504ms | 0.7875ms | 1.2698 KOps/s | 1.2805 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5761ms | 0.4059ms | 2.4639 KOps/s | 2.4713 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5738ms | 2.0913ms | 478.1667 Ops/s | 473.4929 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9267ms | 0.8380ms | 1.1933 KOps/s | 1.2412 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4686ms | 0.4185ms | 2.3897 KOps/s | 2.4529 KOps/s | |
test_distributed | 0.5918ms | 0.1253ms | 7.9825 KOps/s | 8.6264 KOps/s | |
test_tdmodule | 0.2565ms | 18.0765μs | 55.3204 KOps/s | 46.8007 KOps/s | |
test_tdmodule_dispatch | 69.8620μs | 31.7954μs | 31.4511 KOps/s | 26.6273 KOps/s | |
test_tdseq | 42.2910μs | 19.7659μs | 50.5921 KOps/s | 45.4771 KOps/s | |
test_tdseq_dispatch | 51.3720μs | 35.4612μs | 28.1998 KOps/s | 24.3406 KOps/s | |
test_instantiation_functorch | 1.6656ms | 1.5198ms | 657.9986 Ops/s | 660.2155 Ops/s | |
test_exec_functorch | 0.1886ms | 0.1425ms | 7.0192 KOps/s | 7.0142 KOps/s | |
test_exec_functional_call | 0.2278ms | 0.1377ms | 7.2645 KOps/s | 7.3401 KOps/s | |
test_exec_td_decorator | 0.3752ms | 0.1828ms | 5.4714 KOps/s | 5.4387 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7900ms | 0.6765ms | 1.4781 KOps/s | 1.4493 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8047ms | 0.6755ms | 1.4805 KOps/s | 1.4487 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7408ms | 0.6067ms | 1.6483 KOps/s | 1.6771 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7323ms | 0.5935ms | 1.6849 KOps/s | 1.6678 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.6044ms | 19.2990ms | 51.8162 Ops/s | 51.8509 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.0805ms | 19.3339ms | 51.7225 Ops/s | 51.7737 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.4061ms | 19.1917ms | 52.1058 Ops/s | 52.3766 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3446ms | 19.2441ms | 51.9640 Ops/s | 52.3606 Ops/s | |
test_to_module_speed[True] | 1.0556ms | 0.9564ms | 1.0456 KOps/s | 1.0318 KOps/s | |
test_to_module_speed[False] | 1.3241ms | 0.9357ms | 1.0687 KOps/s | 1.0439 KOps/s | |
test_tc_init | 70.3320μs | 33.9453μs | 29.4592 KOps/s | 26.0696 KOps/s | |
test_tc_init_nested | 0.1064ms | 67.6032μs | 14.7922 KOps/s | 12.9944 KOps/s | |
test_tc_first_layer_tensor | 20.8900μs | 0.8026μs | 1.2459 MOps/s | 1.4388 MOps/s | |
test_tc_first_layer_nontensor | 20.8010μs | 2.2244μs | 449.5570 KOps/s | 450.5018 KOps/s | |
test_tc_second_layer_tensor | 10.7630μs | 1.4094μs | 709.5031 KOps/s | 690.2949 KOps/s | |
test_tc_second_layer_nontensor | 88.9830μs | 2.9526μs | 338.6878 KOps/s | 334.5468 KOps/s | |
test_unbind | 0.2239s | 10.1016ms | 98.9942 Ops/s | 141.4013 Ops/s | |
test_full_like | 11.7459ms | 9.1392ms | 109.4191 Ops/s | 107.0710 Ops/s | |
test_zeros_like | 5.2098ms | 4.3183ms | 231.5724 Ops/s | 237.0540 Ops/s | |
test_ones_like | 4.9650ms | 4.2770ms | 233.8070 Ops/s | 236.9089 Ops/s | |
test_clone | 6.7667ms | 6.3594ms | 157.2478 Ops/s | 109.9461 Ops/s | |
test_squeeze | 58.6620μs | 9.7305μs | 102.7697 KOps/s | 108.4870 KOps/s | |
test_unsqueeze | 0.1527ms | 70.6127μs | 14.1618 KOps/s | 14.0393 KOps/s | |
test_split | 0.4242ms | 0.1567ms | 6.3811 KOps/s | 6.2762 KOps/s | |
test_permute | 0.3054ms | 0.1775ms | 5.6341 KOps/s | 5.6821 KOps/s | |
test_stack | 50.4909ms | 50.1878ms | 19.9252 Ops/s | 19.8953 Ops/s | |
test_cat | 50.4904ms | 50.1247ms | 19.9503 Ops/s | 19.7388 Ops/s |
This was referenced Jan 7, 2025
vmoens added a commit that referenced this pull request on Jan 7, 2025:
ghstack-source-id: 84148ad9c701029db6d02dfb84ddb0a9b26c9ab7
Pull Request resolved: #1162
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):