[BugFix] Fix unitary ops for tensorclass #1164
Merged
This was referenced Jan 7, 2025
facebook-github-bot added the CLA Signed label on Jan 7, 2025. This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 45.9160μs | 21.5347μs | 46.4366 KOps/s | 48.7287 KOps/s | |
test_plain_set_stack_nested | 57.0570μs | 21.5723μs | 46.3558 KOps/s | 48.6383 KOps/s | |
test_plain_set_nested_inplace | 73.6680μs | 23.5956μs | 42.3808 KOps/s | 44.9699 KOps/s | |
test_plain_set_stack_nested_inplace | 70.0580μs | 23.3657μs | 42.7978 KOps/s | 45.5762 KOps/s | |
test_items | 23.6150μs | 4.2643μs | 234.5036 KOps/s | 242.1309 KOps/s | |
test_items_nested | 0.8422ms | 0.4019ms | 2.4881 KOps/s | 2.4851 KOps/s | |
test_items_nested_locked | 0.5562ms | 0.4017ms | 2.4891 KOps/s | 2.4955 KOps/s | |
test_items_nested_leaf | 0.1374ms | 77.7329μs | 12.8646 KOps/s | 12.5505 KOps/s | |
test_items_stack_nested | 0.5415ms | 0.4040ms | 2.4751 KOps/s | 2.4692 KOps/s | |
test_items_stack_nested_leaf | 0.1439ms | 80.4928μs | 12.4235 KOps/s | 12.6349 KOps/s | |
test_items_stack_nested_locked | 0.5771ms | 0.4034ms | 2.4788 KOps/s | 2.4689 KOps/s | |
test_keys | 42.4890μs | 3.9302μs | 254.4393 KOps/s | 283.4070 KOps/s | |
test_keys_nested | 0.2673ms | 0.1645ms | 6.0809 KOps/s | 5.9793 KOps/s | |
test_keys_nested_locked | 0.7049ms | 0.1728ms | 5.7879 KOps/s | 5.7798 KOps/s | |
test_keys_nested_leaf | 0.2002ms | 0.1439ms | 6.9486 KOps/s | 6.8030 KOps/s | |
test_keys_stack_nested | 0.2955ms | 0.1635ms | 6.1167 KOps/s | 5.9742 KOps/s | |
test_keys_stack_nested_leaf | 0.2386ms | 0.1421ms | 7.0358 KOps/s | 6.9128 KOps/s | |
test_keys_stack_nested_locked | 0.3280ms | 0.1689ms | 5.9198 KOps/s | 5.6847 KOps/s | |
test_values | 8.3556μs | 1.0398μs | 961.7021 KOps/s | 956.4555 KOps/s | |
test_values_nested | 0.1470ms | 64.3737μs | 15.5343 KOps/s | 15.8201 KOps/s | |
test_values_nested_locked | 0.1154ms | 63.7035μs | 15.6977 KOps/s | 15.3685 KOps/s | |
test_values_nested_leaf | 0.1599ms | 73.0381μs | 13.6915 KOps/s | 13.8578 KOps/s | |
test_values_stack_nested | 0.1109ms | 64.9679μs | 15.3922 KOps/s | 15.7209 KOps/s | |
test_values_stack_nested_leaf | 0.1385ms | 73.0556μs | 13.6882 KOps/s | 13.7864 KOps/s | |
test_values_stack_nested_locked | 0.1163ms | 64.8984μs | 15.4087 KOps/s | 15.7904 KOps/s | |
test_membership | 4.7160μs | 0.7192μs | 1.3904 MOps/s | 1.3574 MOps/s | |
test_membership_nested | 33.2420μs | 2.8777μs | 347.5045 KOps/s | 335.4113 KOps/s | |
test_membership_nested_leaf | 42.7600μs | 2.9247μs | 341.9186 KOps/s | 346.2525 KOps/s | |
test_membership_stacked_nested | 27.8220μs | 2.9174μs | 342.7656 KOps/s | 342.9114 KOps/s | |
test_membership_stacked_nested_leaf | 33.4030μs | 2.9000μs | 344.8218 KOps/s | 340.0150 KOps/s | |
test_membership_nested_last | 31.5600μs | 4.3187μs | 231.5511 KOps/s | 231.6628 KOps/s | |
test_membership_nested_leaf_last | 30.8780μs | 4.3681μs | 228.9329 KOps/s | 230.5081 KOps/s | |
test_membership_stacked_nested_last | 33.5730μs | 7.0679μs | 141.4845 KOps/s | 194.0921 KOps/s | |
test_membership_stacked_nested_leaf_last | 37.7710μs | 7.0140μs | 142.5720 KOps/s | 195.4212 KOps/s | |
test_nested_getleaf | 47.0180μs | 10.7790μs | 92.7729 KOps/s | 93.0242 KOps/s | |
test_nested_get | 42.7400μs | 10.1160μs | 98.8537 KOps/s | 98.4610 KOps/s | |
test_stacked_getleaf | 41.2870μs | 10.8084μs | 92.5209 KOps/s | 92.9894 KOps/s | |
test_stacked_get | 33.5230μs | 10.1399μs | 98.6203 KOps/s | 98.8342 KOps/s | |
test_nested_getitemleaf | 50.3640μs | 11.1105μs | 90.0047 KOps/s | 88.7372 KOps/s | |
test_nested_getitem | 36.6390μs | 10.4825μs | 95.3966 KOps/s | 94.6494 KOps/s | |
test_stacked_getitemleaf | 36.1080μs | 11.0526μs | 90.4768 KOps/s | 89.2396 KOps/s | |
test_stacked_getitem | 45.6660μs | 10.4680μs | 95.5296 KOps/s | 96.5254 KOps/s | |
test_lock_nested | 0.9208ms | 0.4573ms | 2.1866 KOps/s | 2.2306 KOps/s | |
test_lock_stack_nested | 0.6606ms | 0.4273ms | 2.3405 KOps/s | 2.3594 KOps/s | |
test_unlock_nested | 0.7811ms | 0.3789ms | 2.6393 KOps/s | 2.6971 KOps/s | |
test_unlock_stack_nested | 0.5276ms | 0.3432ms | 2.9136 KOps/s | 2.9307 KOps/s | |
test_flatten_speed | 0.3060ms | 0.1059ms | 9.4438 KOps/s | 10.0688 KOps/s | |
test_unflatten_speed | 1.2180ms | 0.5331ms | 1.8757 KOps/s | 1.8836 KOps/s | |
test_common_ops | 2.0489ms | 0.8411ms | 1.1889 KOps/s | 1.3270 KOps/s | |
test_creation | 19.3860μs | 2.5396μs | 393.7627 KOps/s | 354.3792 KOps/s | |
test_creation_empty | 41.2870μs | 13.0814μs | 76.4442 KOps/s | 90.8394 KOps/s | |
test_creation_nested_1 | 60.9240μs | 16.0096μs | 62.4627 KOps/s | 71.4434 KOps/s | |
test_creation_nested_2 | 60.4330μs | 20.3441μs | 49.1543 KOps/s | 54.7174 KOps/s | |
test_clone | 68.9590μs | 13.9708μs | 71.5777 KOps/s | 75.0323 KOps/s | |
test_getitem[int] | 1.2968ms | 13.0549μs | 76.5998 KOps/s | 77.4360 KOps/s | |
test_getitem[slice_int] | 0.1400ms | 25.1611μs | 39.7439 KOps/s | 40.9997 KOps/s | |
test_getitem[range] | 0.1716ms | 51.2746μs | 19.5029 KOps/s | 20.8608 KOps/s | |
test_getitem[tuple] | 0.1528ms | 20.5354μs | 48.6965 KOps/s | 49.7644 KOps/s | |
test_getitem[list] | 0.2454ms | 46.7496μs | 21.3906 KOps/s | 22.9559 KOps/s | |
test_setitem_dim[int] | 58.0390μs | 26.3207μs | 37.9929 KOps/s | 39.8511 KOps/s | |
test_setitem_dim[slice_int] | 90.2690μs | 52.1139μs | 19.1888 KOps/s | 19.7096 KOps/s | |
test_setitem_dim[range] | 0.1330ms | 74.8685μs | 13.3567 KOps/s | 13.7820 KOps/s | |
test_setitem_dim[tuple] | 77.3650μs | 41.2586μs | 24.2374 KOps/s | 24.9901 KOps/s | |
test_setitem | 68.9490μs | 22.3723μs | 44.6980 KOps/s | 51.1501 KOps/s | |
test_set | 0.2297ms | 21.8401μs | 45.7874 KOps/s | 51.9641 KOps/s | |
test_set_shared | 1.1692ms | 0.1686ms | 5.9303 KOps/s | 5.8341 KOps/s | |
test_update | 0.2100ms | 25.4277μs | 39.3273 KOps/s | 45.4886 KOps/s | |
test_update_nested | 0.1857ms | 36.5535μs | 27.3571 KOps/s | 31.1857 KOps/s | |
test_update__nested | 0.7188ms | 35.5680μs | 28.1152 KOps/s | 29.7958 KOps/s | |
test_set_nested | 0.1318ms | 23.9003μs | 41.8405 KOps/s | 47.2281 KOps/s | |
test_set_nested_new | 0.1547ms | 28.6634μs | 34.8876 KOps/s | 38.4418 KOps/s | |
test_select | 95.0180μs | 44.2952μs | 22.5758 KOps/s | 23.7434 KOps/s | |
test_select_nested | 0.1472ms | 62.5114μs | 15.9971 KOps/s | 15.9159 KOps/s | |
test_exclude_nested | 0.1715ms | 80.5936μs | 12.4079 KOps/s | 12.2701 KOps/s | |
test_empty[True] | 0.7411ms | 0.4126ms | 2.4237 KOps/s | 2.4340 KOps/s | |
test_empty[False] | 12.9197μs | 1.3802μs | 724.5389 KOps/s | 713.9013 KOps/s | |
test_unbind_speed | 0.5727ms | 0.2679ms | 3.7323 KOps/s | 3.6984 KOps/s | |
test_unbind_speed_stack0 | 0.4098ms | 0.2636ms | 3.7932 KOps/s | 3.7603 KOps/s | |
test_unbind_speed_stack1 | 97.1881ms | 0.7752ms | 1.2899 KOps/s | 1.3741 KOps/s | |
test_split | 1.7245ms | 1.5980ms | 625.7881 Ops/s | 570.7525 Ops/s | |
test_chunk | 0.1011s | 1.9399ms | 515.5011 Ops/s | 566.9413 Ops/s | |
test_consolidate_njt[False-None] | 8.7853ms | 8.1935ms | 122.0475 Ops/s | 122.9360 Ops/s | |
test_creation[device0] | 4.1043ms | 92.3630μs | 10.8268 KOps/s | 11.0654 KOps/s | |
test_creation_from_tensor | 0.2553ms | 92.9740μs | 10.7557 KOps/s | 10.5423 KOps/s | |
test_add_one[memmap_tensor0] | 0.2091ms | 5.1132μs | 195.5711 KOps/s | 209.3116 KOps/s | |
test_contiguous[memmap_tensor0] | 19.4560μs | 0.5160μs | 1.9378 MOps/s | 1.9454 MOps/s | |
test_stack[memmap_tensor0] | 28.0930μs | 3.4068μs | 293.5309 KOps/s | 303.7055 KOps/s | |
test_memmaptd_index | 0.9609ms | 0.2478ms | 4.0348 KOps/s | 4.1473 KOps/s | |
test_memmaptd_index_astensor | 0.8328ms | 0.3370ms | 2.9675 KOps/s | 3.0311 KOps/s | |
test_memmaptd_index_op | 1.1723ms | 0.6387ms | 1.5657 KOps/s | 1.7584 KOps/s | |
test_serialize_model | 0.1247s | 0.1142s | 8.7548 Ops/s | 8.5664 Ops/s | |
test_serialize_model_pickle | 0.4430s | 0.3858s | 2.5921 Ops/s | 2.5505 Ops/s | |
test_serialize_weights | 0.2137s | 0.1282s | 7.7978 Ops/s | 7.7421 Ops/s | |
test_serialize_weights_returnearly | 0.1733s | 0.1593s | 6.2769 Ops/s | 6.6238 Ops/s | |
test_serialize_weights_pickle | 0.5405s | 0.4513s | 2.2158 Ops/s | 1.1100 Ops/s | |
test_serialize_weights_filesystem | 0.1431s | 0.1393s | 7.1795 Ops/s | 7.1615 Ops/s | |
test_serialize_model_filesystem | 0.2337s | 0.1557s | 6.4207 Ops/s | 6.4001 Ops/s | |
test_reshape_pytree | 60.5230μs | 26.3098μs | 38.0086 KOps/s | 37.8387 KOps/s | |
test_reshape_td | 82.3840μs | 33.7398μs | 29.6386 KOps/s | 30.4706 KOps/s | |
test_view_pytree | 82.0640μs | 26.4710μs | 37.7772 KOps/s | 37.6683 KOps/s | |
test_view_td | 82.0440μs | 39.3637μs | 25.4041 KOps/s | 25.9240 KOps/s | |
test_unbind_pytree | 83.3960μs | 29.9709μs | 33.3657 KOps/s | 33.8430 KOps/s | |
test_unbind_td | 0.3551ms | 39.4115μs | 25.3733 KOps/s | 25.3580 KOps/s | |
test_split_pytree | 75.6910μs | 29.1812μs | 34.2687 KOps/s | 34.3841 KOps/s | |
test_split_td | 0.5176ms | 45.3547μs | 22.0484 KOps/s | 21.9783 KOps/s | |
test_add_pytree | 79.8390μs | 35.7160μs | 27.9987 KOps/s | 28.8904 KOps/s | |
test_add_td | 0.1300ms | 62.2248μs | 16.0708 KOps/s | 17.8986 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1207ms | 61.7820μs | 16.1859 KOps/s | 16.7744 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3621ms | 0.1707ms | 5.8580 KOps/s | 5.8768 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1025ms | 45.3846μs | 22.0339 KOps/s | 22.5176 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.4079ms | 0.1201ms | 8.3291 KOps/s | 8.4607 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 66.3540μs | 25.7245μs | 38.8734 KOps/s | 39.7519 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1098ms | 58.6877μs | 17.0393 KOps/s | 17.3810 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1651ms | 78.0993μs | 12.8042 KOps/s | 12.6305 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.2620ms | 67.2934μs | 14.8603 KOps/s | 14.5013 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1823ms | 0.1040ms | 9.6143 KOps/s | 9.6180 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4827ms | 0.2165ms | 4.6182 KOps/s | 4.6168 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 97.7740μs | 44.6002μs | 22.4214 KOps/s | 23.3097 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5078ms | 66.1122μs | 15.1258 KOps/s | 15.7837 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2197ms | 0.1032ms | 9.6907 KOps/s | 9.8218 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6620ms | 0.2004ms | 4.9909 KOps/s | 5.0155 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4371ms | 0.2352ms | 4.2509 KOps/s | 4.2811 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4004ms | 0.1111ms | 9.0031 KOps/s | 9.6507 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2694ms | 61.7303μs | 16.1995 KOps/s | 17.4790 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1699ms | 45.5328μs | 21.9622 KOps/s | 22.7918 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6018ms | 0.1566ms | 6.3864 KOps/s | 6.2409 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4189ms | 0.1032ms | 9.6859 KOps/s | 9.6991 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 56.7070μs | 21.1680μs | 47.2411 KOps/s | 46.2371 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1505ms | 66.7893μs | 14.9725 KOps/s | 15.1008 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1595ms | 78.5067μs | 12.7378 KOps/s | 12.6608 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1405ms | 66.5822μs | 15.0190 KOps/s | 14.6611 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3063ms | 0.2043ms | 4.8942 KOps/s | 4.8532 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5127ms | 1.3095ms | 763.6507 Ops/s | 747.6470 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2843ms | 0.2023ms | 4.9429 KOps/s | 4.9467 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3826ms | 0.7722ms | 1.2951 KOps/s | 1.3008 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8197ms | 0.4654ms | 2.1489 KOps/s | 2.1825 KOps/s | |
test_compile_assign_and_add_stack[eager] | 2.9731ms | 2.8395ms | 352.1694 Ops/s | 380.1681 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 93.2450μs | 35.9194μs | 27.8401 KOps/s | 29.1301 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.7802ms | 34.6507μs | 28.8594 KOps/s | 30.5642 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 74.4990μs | 28.9757μs | 34.5116 KOps/s | 35.0006 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 64.5910μs | 23.3219μs | 42.8782 KOps/s | 43.0911 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 92.2630μs | 30.2134μs | 33.0979 KOps/s | 33.7431 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 91.4110μs | 22.8565μs | 43.7512 KOps/s | 43.9227 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1136ms | 50.7104μs | 19.7198 KOps/s | 19.5846 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3884ms | 20.2524μs | 49.3769 KOps/s | 49.2554 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 96.8210μs | 43.8131μs | 22.8242 KOps/s | 22.7377 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 64.4810μs | 18.9792μs | 52.6891 KOps/s | 53.4690 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1333ms | 44.8176μs | 22.3127 KOps/s | 22.3032 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 81.3220μs | 18.4043μs | 54.3350 KOps/s | 52.7068 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1420ms | 52.4638μs | 19.0608 KOps/s | 19.1199 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1865ms | 20.5354μs | 48.6963 KOps/s | 48.9697 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 93.8760μs | 44.7127μs | 22.3650 KOps/s | 22.0618 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 84.9800μs | 18.5362μs | 53.9484 KOps/s | 53.9186 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1211ms | 44.9389μs | 22.2525 KOps/s | 22.4236 KOps/s | |
test_compile_indexing[int-pytree-eager] | 78.8080μs | 19.9540μs | 50.1153 KOps/s | 54.0156 KOps/s | |
test_mod_add[eager] | 99.7470μs | 36.2041μs | 27.6212 KOps/s | 29.7191 KOps/s | |
test_mod_add[compile] | 0.1239ms | 47.0391μs | 21.2589 KOps/s | 21.4165 KOps/s | |
test_mod_add[compile-overhead] | 0.1165ms | 46.6396μs | 21.4410 KOps/s | 21.3739 KOps/s | |
test_mod_wrap[eager] | 0.4957ms | 0.2217ms | 4.5105 KOps/s | 4.5306 KOps/s | |
test_mod_wrap[compile] | 0.3779ms | 0.2040ms | 4.9025 KOps/s | 4.7010 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3794ms | 0.2041ms | 4.8998 KOps/s | 4.8180 KOps/s | |
test_mod_wrap_and_backward[eager] | 19.7789ms | 13.4508ms | 74.3450 Ops/s | 91.2975 Ops/s | |
test_mod_wrap_and_backward[compile] | 17.9690ms | 13.7586ms | 72.6816 Ops/s | 91.3368 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 16.1809ms | 12.7986ms | 78.1336 Ops/s | 90.0115 Ops/s | |
test_seq_add[eager] | 0.2535ms | 0.1190ms | 8.4064 KOps/s | 9.0148 KOps/s | |
test_seq_add[compile] | 0.1436ms | 62.9323μs | 15.8901 KOps/s | 16.6019 KOps/s | |
test_seq_add[compile-overhead] | 0.1323ms | 60.3165μs | 16.5792 KOps/s | 17.0709 KOps/s | |
test_seq_wrap[eager] | 0.7574ms | 0.4453ms | 2.2455 KOps/s | 2.2775 KOps/s | |
test_seq_wrap[compile] | 0.4276ms | 0.2288ms | 4.3702 KOps/s | 4.3542 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3624ms | 0.2263ms | 4.4199 KOps/s | 4.3979 KOps/s | |
test_func_call_runtime[False-eager] | 0.9382ms | 0.5495ms | 1.8197 KOps/s | 1.8877 KOps/s | |
test_func_call_runtime[False-compile] | 0.8049ms | 0.4349ms | 2.2993 KOps/s | 2.3602 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8050ms | 0.4362ms | 2.2924 KOps/s | 2.3540 KOps/s | |
test_func_call_runtime[True-eager] | 0.9598ms | 0.7739ms | 1.2921 KOps/s | 1.3224 KOps/s | |
test_func_call_runtime[True-compile] | 0.8744ms | 0.4769ms | 2.0971 KOps/s | 2.1481 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5757ms | 0.4726ms | 2.1161 KOps/s | 2.1426 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9843ms | 0.5693ms | 1.7566 KOps/s | 1.8798 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7758ms | 0.4301ms | 2.3248 KOps/s | 2.3226 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5814ms | 0.4287ms | 2.3329 KOps/s | 2.3626 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3737ms | 0.9173ms | 1.0901 KOps/s | 1.1275 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6033ms | 0.4950ms | 2.0200 KOps/s | 2.0337 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.7711ms | 0.4961ms | 2.0158 KOps/s | 2.0465 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4427ms | 1.8987ms | 526.6645 Ops/s | 516.3873 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0666ms | 0.5196ms | 1.9246 KOps/s | 1.9060 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7428ms | 0.5161ms | 1.9375 KOps/s | 1.9095 KOps/s | |
test_distributed | 0.2587ms | 0.1255ms | 7.9692 KOps/s | 7.8066 KOps/s | |
test_tdmodule | 55.4840μs | 26.3921μs | 37.8901 KOps/s | 39.3800 KOps/s | |
test_tdmodule_dispatch | 75.8820μs | 48.1950μs | 20.7491 KOps/s | 19.7307 KOps/s | |
test_tdseq | 47.0590μs | 29.7543μs | 33.6086 KOps/s | 34.8220 KOps/s | |
test_tdseq_dispatch | 84.2180μs | 54.9245μs | 18.2068 KOps/s | 19.0674 KOps/s | |
test_instantiation_functorch | 2.8232ms | 1.5573ms | 642.1433 Ops/s | 646.1144 Ops/s | |
test_exec_functorch | 0.3311ms | 0.1813ms | 5.5154 KOps/s | 5.5769 KOps/s | |
test_exec_functional_call | 0.4311ms | 0.1753ms | 5.7037 KOps/s | 5.8307 KOps/s | |
test_exec_td_decorator | 0.4446ms | 0.2341ms | 4.2722 KOps/s | 4.2527 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7653ms | 0.6514ms | 1.5352 KOps/s | 1.5174 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8656ms | 0.6592ms | 1.5171 KOps/s | 1.5239 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8028ms | 0.5285ms | 1.8920 KOps/s | 1.8436 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8091ms | 0.5311ms | 1.8828 KOps/s | 1.8605 KOps/s | |
test_to_module_speed[True] | 1.6530ms | 1.3324ms | 750.5155 Ops/s | 731.5214 Ops/s | |
test_to_module_speed[False] | 1.8099ms | 1.2999ms | 769.2620 Ops/s | 757.6721 Ops/s | |
test_tc_init | 81.9640μs | 49.1734μs | 20.3362 KOps/s | 21.3715 KOps/s | |
test_tc_init_nested | 0.1911ms | 96.8871μs | 10.3213 KOps/s | 10.6348 KOps/s | |
test_tc_first_layer_tensor | 17.1220μs | 1.5608μs | 640.6984 KOps/s | 665.0398 KOps/s | |
test_tc_first_layer_nontensor | 24.4750μs | 4.8530μs | 206.0595 KOps/s | 213.0473 KOps/s | |
test_tc_second_layer_tensor | 34.5140μs | 2.9004μs | 344.7837 KOps/s | 349.1528 KOps/s | |
test_tc_second_layer_nontensor | 26.5700μs | 6.2665μs | 159.5797 KOps/s | 165.6032 KOps/s | |
test_unbind | 0.2159s | 13.2058ms | 75.7244 Ops/s | 77.8041 Ops/s | |
test_full_like | 19.2066ms | 10.9520ms | 91.3076 Ops/s | 132.1902 Ops/s | |
test_zeros_like | 12.3861ms | 7.2682ms | 137.5851 Ops/s | 315.3309 Ops/s | |
test_ones_like | 12.0888ms | 7.2139ms | 138.6220 Ops/s | 162.3271 Ops/s | |
test_clone | 16.1671ms | 8.7794ms | 113.9027 Ops/s | 125.3611 Ops/s | |
test_squeeze | 58.7910μs | 12.2639μs | 81.5400 KOps/s | 80.7259 KOps/s | |
test_unsqueeze | 0.2426ms | 93.0476μs | 10.7472 KOps/s | 11.0722 KOps/s | |
test_split | 0.4347ms | 0.2003ms | 4.9923 KOps/s | 5.1709 KOps/s | |
test_permute | 0.3053ms | 0.2044ms | 4.8916 KOps/s | 4.9106 KOps/s | |
test_stack | 26.5248ms | 24.2563ms | 41.2264 Ops/s | 39.9277 Ops/s | |
test_cat | 26.0377ms | 23.8779ms | 41.8797 Ops/s | 40.2992 Ops/s |
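The Change column came through empty in the table above, so the relative difference between the two throughput columns has to be worked out by hand. The following is a minimal sketch of that arithmetic, assuming that "Ops" is the run for this PR, that "Ops on Repo HEAD" is the baseline, and that the change is meant as `(ops - head) / head`. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
# Multipliers for the throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '46.4366 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, (ops - head) / head) for one table row.

    Assumes the column order Name | Max | Mean | Ops | Ops on Repo HEAD.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, (ops - head) / head


# One row copied from the table above.
row = "test_zeros_like | 12.3861ms | 7.2682ms | 137.5851 Ops/s | 315.3309 Ops/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.1%}")
```

A negative value means the PR run was slower than the baseline for that test. Under these assumptions, the sign and size of the value are what the missing Change cell would have conveyed.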

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 32.4810μs | 11.3650μs | 87.9893 KOps/s | 76.6110 KOps/s | |
test_plain_set_stack_nested | 38.4210μs | 11.6777μs | 85.6334 KOps/s | 75.0563 KOps/s | |
test_plain_set_nested_inplace | 37.7110μs | 12.6588μs | 78.9966 KOps/s | 69.6893 KOps/s | |
test_plain_set_stack_nested_inplace | 36.7400μs | 12.6816μs | 78.8547 KOps/s | 69.9451 KOps/s | |
test_items | 25.5710μs | 2.8688μs | 348.5822 KOps/s | 343.8163 KOps/s | |
test_items_nested | 0.4174ms | 0.3592ms | 2.7841 KOps/s | 2.8332 KOps/s | |
test_items_nested_locked | 0.4163ms | 0.3596ms | 2.7810 KOps/s | 2.8239 KOps/s | |
test_items_nested_leaf | 90.1220μs | 58.9224μs | 16.9715 KOps/s | 17.1641 KOps/s | |
test_items_stack_nested | 0.4321ms | 0.3643ms | 2.7451 KOps/s | 2.7892 KOps/s | |
test_items_stack_nested_leaf | 0.1090ms | 59.6081μs | 16.7763 KOps/s | 16.2109 KOps/s | |
test_items_stack_nested_locked | 0.4108ms | 0.3642ms | 2.7456 KOps/s | 2.8090 KOps/s | |
test_keys | 27.3210μs | 3.4383μs | 290.8438 KOps/s | 290.3650 KOps/s | |
test_keys_nested | 0.1095ms | 80.9174μs | 12.3583 KOps/s | 12.2747 KOps/s | |
test_keys_nested_locked | 0.7334ms | 86.6380μs | 11.5423 KOps/s | 11.4507 KOps/s | |
test_keys_nested_leaf | 0.1131ms | 71.6819μs | 13.9505 KOps/s | 13.8900 KOps/s | |
test_keys_stack_nested | 0.1239ms | 81.5864μs | 12.2569 KOps/s | 12.1315 KOps/s | |
test_keys_stack_nested_leaf | 0.1227ms | 73.1953μs | 13.6621 KOps/s | 13.4952 KOps/s | |
test_keys_stack_nested_locked | 0.1485ms | 86.9836μs | 11.4964 KOps/s | 11.2804 KOps/s | |
test_values | 6.0417μs | 0.8493μs | 1.1774 MOps/s | 1.1815 MOps/s | |
test_values_nested | 67.7810μs | 34.5725μs | 28.9247 KOps/s | 29.1175 KOps/s | |
test_values_nested_locked | 64.4110μs | 35.6190μs | 28.0749 KOps/s | 27.5802 KOps/s | |
test_values_nested_leaf | 80.1110μs | 39.2807μs | 25.4578 KOps/s | 25.7134 KOps/s | |
test_values_stack_nested | 82.3010μs | 34.8922μs | 28.6597 KOps/s | 28.6792 KOps/s | |
test_values_stack_nested_leaf | 80.2720μs | 39.6768μs | 25.2036 KOps/s | 25.2472 KOps/s | |
test_values_stack_nested_locked | 85.1120μs | 36.2985μs | 27.5493 KOps/s | 27.4595 KOps/s | |
test_membership | 2.5381μs | 0.5091μs | 1.9641 MOps/s | 1.9502 MOps/s | |
test_membership_nested | 27.5410μs | 2.0736μs | 482.2471 KOps/s | 496.4883 KOps/s | |
test_membership_nested_leaf | 19.6355μs | 1.9605μs | 510.0618 KOps/s | 491.1853 KOps/s | |
test_membership_stacked_nested | 29.8910μs | 2.0670μs | 483.7925 KOps/s | 466.5242 KOps/s | |
test_membership_stacked_nested_leaf | 27.0200μs | 2.0462μs | 488.7039 KOps/s | 478.7313 KOps/s | |
test_membership_nested_last | 31.7400μs | 3.0447μs | 328.4439 KOps/s | 317.9668 KOps/s | |
test_membership_nested_leaf_last | 32.8300μs | 3.0958μs | 323.0196 KOps/s | 315.7166 KOps/s | |
test_membership_stacked_nested_last | 22.3300μs | 3.6266μs | 275.7423 KOps/s | 274.1322 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.7210μs | 3.6066μs | 277.2694 KOps/s | 271.3986 KOps/s | |
test_nested_getleaf | 37.2710μs | 6.1585μs | 162.3778 KOps/s | 163.4539 KOps/s | |
test_nested_get | 52.5610μs | 5.6868μs | 175.8450 KOps/s | 170.8357 KOps/s | |
test_stacked_getleaf | 88.8720μs | 6.1240μs | 163.2919 KOps/s | 164.0851 KOps/s | |
test_stacked_get | 32.1810μs | 5.8771μs | 170.1507 KOps/s | 171.6579 KOps/s | |
test_nested_getitemleaf | 28.7500μs | 6.2077μs | 161.0910 KOps/s | 157.4166 KOps/s | |
test_nested_getitem | 26.5310μs | 6.0075μs | 166.4584 KOps/s | 164.8710 KOps/s | |
test_stacked_getitemleaf | 36.8710μs | 6.2210μs | 160.7448 KOps/s | 159.4680 KOps/s | |
test_stacked_getitem | 39.6000μs | 5.9352μs | 168.4876 KOps/s | 164.9751 KOps/s | |
test_lock_nested | 0.9056ms | 0.3746ms | 2.6698 KOps/s | 2.5553 KOps/s | |
test_lock_stack_nested | 0.3811ms | 0.3452ms | 2.8968 KOps/s | 2.8492 KOps/s | |
test_unlock_nested | 0.7712ms | 0.3159ms | 3.1656 KOps/s | 3.0832 KOps/s | |
test_unlock_stack_nested | 0.3274ms | 0.2825ms | 3.5393 KOps/s | 3.4419 KOps/s | |
test_flatten_speed | 0.1112ms | 75.5551μs | 13.2354 KOps/s | 13.3455 KOps/s | |
test_unflatten_speed | 0.3828ms | 0.3201ms | 3.1241 KOps/s | 3.0904 KOps/s | |
test_common_ops | 1.5448ms | 0.5900ms | 1.6948 KOps/s | 1.5169 KOps/s | |
test_creation | 0.1125ms | 1.7429μs | 573.7458 KOps/s | 573.1776 KOps/s | |
test_creation_empty | 32.3810μs | 6.9719μs | 143.4329 KOps/s | 97.9392 KOps/s | |
test_creation_nested_1 | 40.4410μs | 8.5656μs | 116.7463 KOps/s | 84.1334 KOps/s | |
test_creation_nested_2 | 45.9310μs | 11.3412μs | 88.1742 KOps/s | 68.8724 KOps/s | |
test_clone | 98.6320μs | 10.7496μs | 93.0266 KOps/s | 88.5847 KOps/s | |
test_getitem[int] | 1.3809ms | 10.8698μs | 91.9984 KOps/s | 88.3334 KOps/s | |
test_getitem[slice_int] | 0.1163ms | 21.1742μs | 47.2273 KOps/s | 45.8847 KOps/s | |
test_getitem[range] | 0.1662ms | 37.5908μs | 26.6023 KOps/s | 25.8476 KOps/s | |
test_getitem[tuple] | 0.1083ms | 18.2409μs | 54.8218 KOps/s | 53.3317 KOps/s | |
test_getitem[list] | 0.3338ms | 34.1131μs | 29.3142 KOps/s | 28.7900 KOps/s | |
test_setitem_dim[int] | 41.9510μs | 19.3234μs | 51.7508 KOps/s | 49.7707 KOps/s | |
test_setitem_dim[slice_int] | 63.0010μs | 39.3914μs | 25.3862 KOps/s | 26.3200 KOps/s | |
test_setitem_dim[range] | 80.7420μs | 53.9271μs | 18.5435 KOps/s | 18.6337 KOps/s | |
test_setitem_dim[tuple] | 55.7620μs | 32.8155μs | 30.4734 KOps/s | 30.3260 KOps/s | |
test_setitem | 0.1120ms | 14.3498μs | 69.6875 KOps/s | 59.5905 KOps/s | |
test_set | 0.1052ms | 13.9320μs | 71.7771 KOps/s | 60.2722 KOps/s | |
test_set_shared | 1.6281ms | 0.1537ms | 6.5061 KOps/s | 6.5235 KOps/s | |
test_update | 0.3200ms | 16.3776μs | 61.0592 KOps/s | 50.0020 KOps/s | |
test_update_nested | 0.1041ms | 21.4289μs | 46.6659 KOps/s | 37.9598 KOps/s | |
test_update__nested | 0.6004ms | 25.7824μs | 38.7861 KOps/s | 37.7256 KOps/s | |
test_set_nested | 0.1052ms | 15.2590μs | 65.5352 KOps/s | 56.7069 KOps/s | |
test_set_nested_new | 0.1094ms | 17.3387μs | 57.6746 KOps/s | 50.0871 KOps/s | |
test_select | 0.1145ms | 29.4270μs | 33.9824 KOps/s | 31.6969 KOps/s | |
test_select_nested | 0.1428ms | 43.1794μs | 23.1592 KOps/s | 22.9248 KOps/s | |
test_exclude_nested | 91.4520μs | 62.8634μs | 15.9075 KOps/s | 15.9699 KOps/s | |
test_empty[True] | 0.3667ms | 0.2895ms | 3.4545 KOps/s | 3.4600 KOps/s | |
test_empty[False] | 3.3531μs | 0.8330μs | 1.2005 MOps/s | 1.1985 MOps/s | |
test_to | 88.4820μs | 56.7011μs | 17.6363 KOps/s | 17.4906 KOps/s | |
test_to_nonblocking | 97.3020μs | 48.3255μs | 20.6930 KOps/s | 20.5311 KOps/s | |
test_unbind_speed | 0.2638ms | 0.2346ms | 4.2634 KOps/s | 4.1603 KOps/s | |
test_unbind_speed_stack0 | 0.3433ms | 0.2356ms | 4.2438 KOps/s | 4.1368 KOps/s | |
test_unbind_speed_stack1 | 92.8019ms | 0.6620ms | 1.5106 KOps/s | 1.4951 KOps/s | |
test_split | 93.4185ms | 1.6050ms | 623.0350 Ops/s | 620.4442 Ops/s | |
test_chunk | 93.5184ms | 1.6020ms | 624.2021 Ops/s | 615.4158 Ops/s | |
test_consolidate[False-None] | 96.2327ms | 2.9254ms | 341.8379 Ops/s | 337.4372 Ops/s | |
test_consolidate[default-None] | 1.8709ms | 1.7050ms | 586.4959 Ops/s | 575.1674 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8751ms | 1.6962ms | 589.5517 Ops/s | 565.7526 Ops/s | |
test_consolidate_njt[False-None] | 6.7364ms | 6.3987ms | 156.2816 Ops/s | 151.7089 Ops/s | |
test_to[False-False-None] | 2.1711ms | 1.7739ms | 563.7423 Ops/s | 568.2313 Ops/s | |
test_to[True-False-None] | 1.7241ms | 1.3233ms | 755.6777 Ops/s | 740.3055 Ops/s | |
test_to[within-False-None] | 4.4493ms | 4.0606ms | 246.2703 Ops/s | 240.6818 Ops/s | |
test_to[True-default-None] | 5.6240ms | 5.2266ms | 191.3295 Ops/s | 186.0363 Ops/s | |
test_to_njt[False-False-None] | 7.3805ms | 6.9233ms | 144.4397 Ops/s | 144.6801 Ops/s | |
test_to_njt[True-False-None] | 5.7382ms | 5.4026ms | 185.0977 Ops/s | 183.1095 Ops/s | |
test_to_njt[within-False-None] | 12.4462ms | 12.0345ms | 83.0944 Ops/s | 83.8068 Ops/s | |
test_creation[device0] | 0.6362ms | 81.5402μs | 12.2639 KOps/s | 12.2686 KOps/s | |
test_creation_from_tensor | 0.5068ms | 84.6081μs | 11.8192 KOps/s | 11.7170 KOps/s | |
test_add_one[memmap_tensor0] | 0.3273ms | 6.9208μs | 144.4910 KOps/s | 141.3887 KOps/s | |
test_contiguous[memmap_tensor0] | 2.4336μs | 0.4211μs | 2.3749 MOps/s | 2.3851 MOps/s | |
test_stack[memmap_tensor0] | 23.0000μs | 4.3580μs | 229.4615 KOps/s | 215.2638 KOps/s | |
test_memmaptd_index | 2.0759ms | 0.2540ms | 3.9370 KOps/s | 3.8290 KOps/s | |
test_memmaptd_index_astensor | 0.8913ms | 0.3170ms | 3.1543 KOps/s | 3.1137 KOps/s | |
test_memmaptd_index_op | 1.0150ms | 0.5731ms | 1.7448 KOps/s | 1.5716 KOps/s | |
test_serialize_model | 0.1313s | 0.1304s | 7.6711 Ops/s | 7.6455 Ops/s | |
test_serialize_model_pickle | 1.3504s | 1.2139s | 0.8238 Ops/s | 0.8247 Ops/s | |
test_serialize_weights | 0.1304s | 0.1294s | 7.7271 Ops/s | 7.6886 Ops/s | |
test_serialize_weights_returnearly | 0.3369s | 63.3585ms | 15.7832 Ops/s | 14.1798 Ops/s | |
test_serialize_weights_pickle | 1.3775s | 1.2182s | 0.8209 Ops/s | 0.8220 Ops/s | |
test_reshape_pytree | 72.1420μs | 21.9548μs | 45.5480 KOps/s | 43.4350 KOps/s | |
test_reshape_td | 83.7610μs | 26.4380μs | 37.8244 KOps/s | 34.7809 KOps/s | |
test_view_pytree | 86.9020μs | 21.3936μs | 46.7429 KOps/s | 42.8736 KOps/s | |
test_view_td | 65.4210μs | 29.6624μs | 33.7127 KOps/s | 29.2103 KOps/s | |
test_unbind_pytree | 59.6010μs | 28.0015μs | 35.7124 KOps/s | 33.7267 KOps/s | |
test_unbind_td | 0.7209ms | 35.7980μs | 27.9346 KOps/s | 25.6658 KOps/s | |
test_split_pytree | 61.1410μs | 29.0936μs | 34.3718 KOps/s | 31.0306 KOps/s | |
test_split_td | 0.9203ms | 37.7995μs | 26.4553 KOps/s | 24.1782 KOps/s | |
test_add_pytree | 77.6420μs | 34.4303μs | 29.0442 KOps/s | 27.3777 KOps/s | |
test_add_td | 98.9720μs | 47.1307μs | 21.2176 KOps/s | 17.5366 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1827ms | 0.1179ms | 8.4834 KOps/s | 8.0563 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2632ms | 0.1281ms | 7.8053 KOps/s | 7.5159 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1462ms | 94.3539μs | 10.5984 KOps/s | 10.0341 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3258ms | 0.1507ms | 6.6335 KOps/s | 6.4517 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 93.3120μs | 22.2595μs | 44.9246 KOps/s | 42.5976 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 68.6910μs | 29.3211μs | 34.1052 KOps/s | 33.4911 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4743ms | 64.7040μs | 15.4550 KOps/s | 15.2251 KOps/s | |
test_compile_copy_nested[pytree-eager] | 90.8820μs | 49.1614μs | 20.3412 KOps/s | 20.1244 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1893ms | 0.1382ms | 7.2374 KOps/s | 7.0009 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3311ms | 0.2158ms | 4.6339 KOps/s | 4.6333 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1492ms | 95.5356μs | 10.4673 KOps/s | 10.1196 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1676ms | 54.2442μs | 18.4351 KOps/s | 18.2232 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1714ms | 0.1314ms | 7.6122 KOps/s | 7.3157 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5398ms | 0.4963ms | 2.0148 KOps/s | 1.8766 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3731ms | 0.2568ms | 3.8945 KOps/s | 3.8445 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1798ms | 0.1388ms | 7.2062 KOps/s | 7.0580 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1680ms | 64.4012μs | 15.5277 KOps/s | 14.6257 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1366ms | 96.6066μs | 10.3513 KOps/s | 10.2451 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4722ms | 0.4164ms | 2.4016 KOps/s | 2.2472 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1701ms | 0.1327ms | 7.5360 KOps/s | 7.1332 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1310ms | 18.1222μs | 55.1808 KOps/s | 56.7885 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 64.1920μs | 31.7606μs | 31.4856 KOps/s | 32.0638 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1094ms | 69.8479μs | 14.3168 KOps/s | 14.1754 KOps/s | |
test_compile_copy_flat[pytree-eager] | 80.3320μs | 51.7027μs | 19.3413 KOps/s | 19.4081 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6595ms | 0.3976ms | 2.5154 KOps/s | 2.2264 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9115ms | 2.6535ms | 376.8631 Ops/s | 359.9594 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6040ms | 0.4333ms | 2.3077 KOps/s | 2.2004 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7937ms | 2.6971ms | 370.7701 Ops/s | 356.0138 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6223ms | 0.1156ms | 8.6484 KOps/s | 8.3225 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5764ms | 81.4616μs | 12.2757 KOps/s | 11.7429 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5217ms | 0.1119ms | 8.9346 KOps/s | 9.0681 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2003ms | 70.5298μs | 14.1784 KOps/s | 13.8163 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1652ms | 0.1158ms | 8.6365 KOps/s | 8.8227 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1135ms | 69.6358μs | 14.3604 KOps/s | 13.5475 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1594ms | 0.1053ms | 9.4991 KOps/s | 9.8420 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1432ms | 16.8935μs | 59.1944 KOps/s | 54.6191 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1714ms | 0.1011ms | 9.8911 KOps/s | 10.1704 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 66.3810μs | 15.4616μs | 64.6765 KOps/s | 60.3301 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1521ms | 0.1013ms | 9.8675 KOps/s | 10.1058 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 79.7520μs | 15.5561μs | 64.2836 KOps/s | 60.7713 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1550ms | 0.1069ms | 9.3529 KOps/s | 9.7444 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5754ms | 16.7895μs | 59.5610 KOps/s | 55.8740 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1647ms | 97.2282μs | 10.2851 KOps/s | 10.1195 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1163ms | 15.3653μs | 65.0819 KOps/s | 61.4494 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2238ms | 0.1014ms | 9.8619 KOps/s | 10.1071 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.4109ms | 15.6861μs | 63.7506 KOps/s | 61.1754 KOps/s | |
test_mod_add[eager] | 83.9910μs | 37.0493μs | 26.9911 KOps/s | 25.3344 KOps/s | |
test_mod_add[compile] | 0.2242ms | 78.1661μs | 12.7933 KOps/s | 12.3967 KOps/s | |
test_mod_add[compile-overhead] | 0.3241ms | 0.1658ms | 6.0331 KOps/s | 5.6811 KOps/s | |
test_mod_wrap[eager] | 0.3971ms | 0.2529ms | 3.9534 KOps/s | 3.6574 KOps/s | |
test_mod_wrap[compile] | 1.1139ms | 0.2839ms | 3.5221 KOps/s | 3.4423 KOps/s | |
test_mod_wrap[compile-overhead] | 6.5189ms | 3.6029ms | 277.5564 Ops/s | 272.9629 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6530ms | 1.3804ms | 724.4530 Ops/s | 671.6130 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3820ms | 1.2757ms | 783.8707 Ops/s | 713.1238 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3724ms | 0.9285ms | 1.0770 KOps/s | 963.1642 Ops/s | |
test_seq_add[eager] | 0.2070ms | 0.1127ms | 8.8698 KOps/s | 8.0184 KOps/s | |
test_seq_add[compile] | 0.1471ms | 90.2877μs | 11.0757 KOps/s | 10.5502 KOps/s | |
test_seq_add[compile-overhead] | 0.1730ms | 0.1287ms | 7.7685 KOps/s | 7.3468 KOps/s | |
test_seq_wrap[eager] | 0.4913ms | 0.4177ms | 2.3939 KOps/s | 2.1773 KOps/s | |
test_seq_wrap[compile] | 0.3707ms | 0.2995ms | 3.3385 KOps/s | 3.2447 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2709ms | 0.2226ms | 4.4916 KOps/s | 4.3729 KOps/s | |
test_func_call_runtime[False-eager] | 0.8818ms | 0.7991ms | 1.2515 KOps/s | 1.3166 KOps/s | |
test_func_call_runtime[False-compile] | 0.8570ms | 0.7512ms | 1.3313 KOps/s | 1.2662 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4131ms | 0.3622ms | 2.7608 KOps/s | 2.7226 KOps/s | |
test_func_call_runtime[True-eager] | 0.9862ms | 0.9164ms | 1.0913 KOps/s | 1.0060 KOps/s | |
test_func_call_runtime[True-compile] | 0.8698ms | 0.7765ms | 1.2878 KOps/s | 1.2714 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4420ms | 0.3859ms | 2.5911 KOps/s | 2.5843 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8417ms | 0.7775ms | 1.2862 KOps/s | 1.2753 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8289ms | 0.7483ms | 1.3364 KOps/s | 1.3070 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4114ms | 0.3654ms | 2.7369 KOps/s | 2.6994 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1112ms | 1.0153ms | 984.9118 Ops/s | 978.6642 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8816ms | 0.7969ms | 1.2549 KOps/s | 1.2429 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4637ms | 0.4112ms | 2.4320 KOps/s | 2.4019 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6004ms | 2.1072ms | 474.5730 Ops/s | 469.3354 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9176ms | 0.8238ms | 1.2139 KOps/s | 1.2058 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4872ms | 0.4136ms | 2.4180 KOps/s | 2.3990 KOps/s | |
test_distributed | 2.8214ms | 0.3278ms | 3.0510 KOps/s | 7.7865 KOps/s | |
test_tdmodule | 0.2273ms | 19.4313μs | 51.4633 KOps/s | 47.9775 KOps/s | |
test_tdmodule_dispatch | 68.8110μs | 34.8563μs | 28.6892 KOps/s | 26.6671 KOps/s | |
test_tdseq | 31.0110μs | 19.7285μs | 50.6881 KOps/s | 45.7608 KOps/s | |
test_tdseq_dispatch | 58.9820μs | 36.1273μs | 27.6799 KOps/s | 24.2722 KOps/s | |
test_instantiation_functorch | 1.6729ms | 1.5358ms | 651.1392 Ops/s | 633.8500 Ops/s | |
test_exec_functorch | 0.2086ms | 0.1464ms | 6.8299 KOps/s | 6.3499 KOps/s | |
test_exec_functional_call | 0.1867ms | 0.1401ms | 7.1364 KOps/s | 6.7198 KOps/s | |
test_exec_td_decorator | 0.3824ms | 0.1880ms | 5.3201 KOps/s | 5.2827 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7773ms | 0.6906ms | 1.4480 KOps/s | 1.4093 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8486ms | 0.6890ms | 1.4514 KOps/s | 1.4026 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7217ms | 0.6006ms | 1.6649 KOps/s | 1.5910 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7192ms | 0.6022ms | 1.6605 KOps/s | 1.6483 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.6092ms | 19.4771ms | 51.3424 Ops/s | 51.3120 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2600ms | 19.6257ms | 50.9535 Ops/s | 51.4737 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5435ms | 19.4431ms | 51.4322 Ops/s | 51.8253 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.4893ms | 19.4349ms | 51.4537 Ops/s | 51.6833 Ops/s | |
test_to_module_speed[True] | 1.1115ms | 0.9711ms | 1.0298 KOps/s | 1.0305 KOps/s | |
test_to_module_speed[False] | 1.0727ms | 0.9435ms | 1.0599 KOps/s | 1.0448 KOps/s | |
test_tc_init | 68.2320μs | 34.6385μs | 28.8696 KOps/s | 27.0585 KOps/s | |
test_tc_init_nested | 0.1141ms | 70.7381μs | 14.1367 KOps/s | 13.3625 KOps/s | |
test_tc_first_layer_tensor | 29.7000μs | 0.8187μs | 1.2214 MOps/s | 1.2051 MOps/s | |
test_tc_first_layer_nontensor | 28.4010μs | 2.2884μs | 436.9899 KOps/s | 437.4432 KOps/s | |
test_tc_second_layer_tensor | 8.2325μs | 1.4146μs | 706.8946 KOps/s | 689.7571 KOps/s | |
test_tc_second_layer_nontensor | 25.8500μs | 3.0245μs | 330.6293 KOps/s | 325.6972 KOps/s | |
test_unbind | 7.2682ms | 6.9924ms | 143.0129 Ops/s | 144.6816 Ops/s | |
test_full_like | 9.2198ms | 9.1117ms | 109.7485 Ops/s | 108.5052 Ops/s | |
test_zeros_like | 4.8379ms | 4.3228ms | 231.3333 Ops/s | 230.6655 Ops/s | |
test_ones_like | 5.3750ms | 4.3241ms | 231.2608 Ops/s | 231.1621 Ops/s | |
test_clone | 6.4720ms | 6.3586ms | 157.2664 Ops/s | 156.4724 Ops/s | |
test_squeeze | 82.1320μs | 9.7365μs | 102.7068 KOps/s | 106.1526 KOps/s | |
test_unsqueeze | 0.1234ms | 73.1868μs | 13.6637 KOps/s | 14.0405 KOps/s | |
test_split | 0.3311ms | 0.1552ms | 6.4449 KOps/s | 6.2482 KOps/s | |
test_permute | 0.3264ms | 0.1796ms | 5.5681 KOps/s | 5.4937 KOps/s | |
test_stack | 50.8467ms | 50.4588ms | 19.8182 Ops/s | 20.0407 Ops/s | |
test_cat | 50.7124ms | 50.3999ms | 19.8413 Ops/s | 19.9783 Ops/s |
vmoens added a commit that referenced this pull request on Jan 7, 2025
ghstack-source-id: 2d117645769890b72f5856f68acbe1b48015cfbb
Pull Request resolved: #1164
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):