[Feature] Compile - tensorclass compatibility #882
Merged
This was referenced Jul 12, 2024
facebook-github-bot added the CLA Signed label Jul 12, 2024
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 41.1170μs | 18.5890μs | 53.7952 KOps/s | 55.4261 KOps/s | |
test_plain_set_stack_nested | 47.0280μs | 18.9038μs | 52.8994 KOps/s | 50.4211 KOps/s | |
test_plain_set_nested_inplace | 67.2750μs | 20.6969μs | 48.3165 KOps/s | 49.1865 KOps/s | |
test_plain_set_stack_nested_inplace | 72.6140μs | 20.5495μs | 48.6629 KOps/s | 49.6198 KOps/s | |
test_items | 16.9610μs | 2.6396μs | 378.8460 KOps/s | 392.0118 KOps/s | |
test_items_nested | 2.2345ms | 0.3730ms | 2.6809 KOps/s | 2.6540 KOps/s | |
test_items_nested_locked | 0.5305ms | 0.3706ms | 2.6981 KOps/s | 2.7264 KOps/s | |
test_items_nested_leaf | 0.1603ms | 85.7794μs | 11.6578 KOps/s | 11.5462 KOps/s | |
test_items_stack_nested | 0.5487ms | 0.3754ms | 2.6640 KOps/s | 2.7268 KOps/s | |
test_items_stack_nested_leaf | 0.1605ms | 87.0092μs | 11.4930 KOps/s | 11.6208 KOps/s | |
test_items_stack_nested_locked | 0.8169ms | 0.3790ms | 2.6385 KOps/s | 2.7395 KOps/s | |
test_keys | 0.1105ms | 4.2346μs | 236.1494 KOps/s | 217.2154 KOps/s | |
test_keys_nested | 0.2506ms | 0.1457ms | 6.8618 KOps/s | 6.9305 KOps/s | |
test_keys_nested_locked | 0.6740ms | 0.1511ms | 6.6198 KOps/s | 6.5846 KOps/s | |
test_keys_nested_leaf | 0.2213ms | 0.1233ms | 8.1131 KOps/s | 8.1110 KOps/s | |
test_keys_stack_nested | 0.2339ms | 0.1448ms | 6.9062 KOps/s | 6.8977 KOps/s | |
test_keys_stack_nested_leaf | 0.2229ms | 0.1232ms | 8.1146 KOps/s | 8.1304 KOps/s | |
test_keys_stack_nested_locked | 0.4618ms | 0.1503ms | 6.6517 KOps/s | 6.6377 KOps/s | |
test_values | 10.6675μs | 1.1817μs | 846.2389 KOps/s | 888.8857 KOps/s | |
test_values_nested | 0.1070ms | 49.7063μs | 20.1182 KOps/s | 20.2426 KOps/s | |
test_values_nested_locked | 0.1154ms | 49.2208μs | 20.3166 KOps/s | 19.8622 KOps/s | |
test_values_nested_leaf | 0.1155ms | 44.7747μs | 22.3341 KOps/s | 22.4624 KOps/s | |
test_values_stack_nested | 0.1022ms | 51.2425μs | 19.5151 KOps/s | 20.2883 KOps/s | |
test_values_stack_nested_leaf | 85.9000μs | 44.3935μs | 22.5258 KOps/s | 22.4965 KOps/s | |
test_values_stack_nested_locked | 98.8240μs | 51.4953μs | 19.4193 KOps/s | 20.3545 KOps/s | |
test_membership | 26.9700μs | 0.8916μs | 1.1215 MOps/s | 1.3998 MOps/s | |
test_membership_nested | 30.8380μs | 2.6648μs | 375.2613 KOps/s | 363.4403 KOps/s | |
test_membership_nested_leaf | 21.0890μs | 2.7166μs | 368.1136 KOps/s | 366.9289 KOps/s | |
test_membership_stacked_nested | 23.7640μs | 2.6530μs | 376.9310 KOps/s | 363.5982 KOps/s | |
test_membership_stacked_nested_leaf | 18.0840μs | 2.6939μs | 371.2123 KOps/s | 368.7125 KOps/s | |
test_membership_nested_last | 36.2770μs | 3.9581μs | 252.6476 KOps/s | 249.5506 KOps/s | |
test_membership_nested_leaf_last | 19.7160μs | 3.9881μs | 250.7467 KOps/s | 249.7668 KOps/s | |
test_membership_stacked_nested_last | 31.2680μs | 4.5466μs | 219.9431 KOps/s | 251.7478 KOps/s | |
test_membership_stacked_nested_leaf_last | 52.2970μs | 4.5843μs | 218.1373 KOps/s | 249.1957 KOps/s | |
test_nested_getleaf | 42.0590μs | 10.9642μs | 91.2061 KOps/s | 92.3531 KOps/s | |
test_nested_get | 50.1330μs | 10.4249μs | 95.9239 KOps/s | 98.5080 KOps/s | |
test_stacked_getleaf | 36.5580μs | 10.8997μs | 91.7457 KOps/s | 93.9805 KOps/s | |
test_stacked_get | 35.0760μs | 10.2582μs | 97.4831 KOps/s | 98.9593 KOps/s | |
test_nested_getitemleaf | 30.4870μs | 11.3080μs | 88.4330 KOps/s | 88.7168 KOps/s | |
test_nested_getitem | 0.1193ms | 10.5966μs | 94.3696 KOps/s | 96.8241 KOps/s | |
test_stacked_getitemleaf | 32.7000μs | 11.3130μs | 88.3941 KOps/s | 88.8381 KOps/s | |
test_stacked_getitem | 40.2950μs | 10.3639μs | 96.4889 KOps/s | 96.7847 KOps/s | |
test_lock_nested | 0.8743ms | 0.4633ms | 2.1584 KOps/s | 2.1353 KOps/s | |
test_lock_stack_nested | 0.9225ms | 0.4314ms | 2.3183 KOps/s | 2.3052 KOps/s | |
test_unlock_nested | 0.8646ms | 0.3837ms | 2.6061 KOps/s | 2.2068 KOps/s | |
test_unlock_stack_nested | 0.6887ms | 0.3438ms | 2.9087 KOps/s | 2.8587 KOps/s | |
test_flatten_speed | 0.6324ms | 0.1051ms | 9.5174 KOps/s | 9.5568 KOps/s | |
test_unflatten_speed | 1.0108ms | 0.4433ms | 2.2556 KOps/s | 2.2891 KOps/s | |
test_common_ops | 4.8669ms | 0.8314ms | 1.2028 KOps/s | 1.2428 KOps/s | |
test_creation | 0.1032ms | 2.4176μs | 413.6366 KOps/s | 432.9047 KOps/s | |
test_creation_empty | 45.8260μs | 13.2240μs | 75.6202 KOps/s | 78.9336 KOps/s | |
test_creation_nested_1 | 58.1780μs | 16.4242μs | 60.8857 KOps/s | 62.5851 KOps/s | |
test_creation_nested_2 | 67.8560μs | 20.2307μs | 49.4298 KOps/s | 51.1970 KOps/s | |
test_clone | 0.1071ms | 13.0327μs | 76.7302 KOps/s | 74.7032 KOps/s | |
test_getitem[int] | 42.6290μs | 11.7089μs | 85.4050 KOps/s | 85.6745 KOps/s | |
test_getitem[slice_int] | 59.6210μs | 24.1382μs | 41.4280 KOps/s | 42.4525 KOps/s | |
test_getitem[range] | 0.2802ms | 47.2438μs | 21.1668 KOps/s | 22.3256 KOps/s | |
test_getitem[tuple] | 50.0130μs | 19.8379μs | 50.4086 KOps/s | 51.9597 KOps/s | |
test_getitem[list] | 0.3610ms | 41.1971μs | 24.2735 KOps/s | 24.8605 KOps/s | |
test_setitem_dim[int] | 86.5710μs | 36.6883μs | 27.2567 KOps/s | 28.6899 KOps/s | |
test_setitem_dim[slice_int] | 0.1262ms | 64.0246μs | 15.6190 KOps/s | 15.8952 KOps/s | |
test_setitem_dim[range] | 0.1300ms | 84.1439μs | 11.8844 KOps/s | 12.0544 KOps/s | |
test_setitem_dim[tuple] | 87.9340μs | 52.1819μs | 19.1637 KOps/s | 19.3380 KOps/s | |
test_setitem | 0.1175ms | 21.0265μs | 47.5591 KOps/s | 46.9194 KOps/s | |
test_set | 0.1265ms | 20.6264μs | 48.4815 KOps/s | 48.4159 KOps/s | |
test_set_shared | 1.6311ms | 0.1673ms | 5.9786 KOps/s | 5.9238 KOps/s | |
test_update | 0.1879ms | 24.4710μs | 40.8648 KOps/s | 41.2466 KOps/s | |
test_update_nested | 0.1455ms | 33.3237μs | 30.0087 KOps/s | 29.8121 KOps/s | |
test_update__nested | 0.1106ms | 25.1612μs | 39.7437 KOps/s | 39.5094 KOps/s | |
test_set_nested | 0.1531ms | 22.7857μs | 43.8873 KOps/s | 43.8483 KOps/s | |
test_set_nested_new | 0.1343ms | 27.5675μs | 36.2747 KOps/s | 36.6207 KOps/s | |
test_select | 0.1674ms | 43.4993μs | 22.9889 KOps/s | 23.4945 KOps/s | |
test_select_nested | 0.1213ms | 60.8845μs | 16.4246 KOps/s | 16.4609 KOps/s | |
test_exclude_nested | 0.1812ms | 80.6000μs | 12.4070 KOps/s | 12.4216 KOps/s | |
test_empty[True] | 0.4670ms | 0.3469ms | 2.8827 KOps/s | 2.9137 KOps/s | |
test_empty[False] | 7.4538μs | 1.2636μs | 791.4072 KOps/s | 797.4495 KOps/s | |
test_unbind_speed | 0.5084ms | 0.2823ms | 3.5425 KOps/s | 3.5615 KOps/s | |
test_unbind_speed_stack0 | 0.4246ms | 0.2717ms | 3.6807 KOps/s | 3.5543 KOps/s | |
test_unbind_speed_stack1 | 79.2189ms | 0.7640ms | 1.3090 KOps/s | 1.2839 KOps/s | |
test_split | 76.2580ms | 1.6504ms | 605.9219 Ops/s | 670.2860 Ops/s | |
test_chunk | 77.6195ms | 1.6599ms | 602.4580 Ops/s | 619.5921 Ops/s | |
test_creation[device0] | 0.2082ms | 92.7146μs | 10.7858 KOps/s | 10.4681 KOps/s | |
test_creation_from_tensor | 4.0686ms | 95.6449μs | 10.4553 KOps/s | 10.2115 KOps/s | |
test_add_one[memmap_tensor0] | 0.1795ms | 5.5595μs | 179.8716 KOps/s | 182.8270 KOps/s | |
test_contiguous[memmap_tensor0] | 21.7510μs | 0.6325μs | 1.5810 MOps/s | 1.5928 MOps/s | |
test_stack[memmap_tensor0] | 44.0820μs | 3.6928μs | 270.8004 KOps/s | 276.1779 KOps/s | |
test_memmaptd_index | 1.0625ms | 0.2645ms | 3.7806 KOps/s | 3.8825 KOps/s | |
test_memmaptd_index_astensor | 0.5856ms | 0.3350ms | 2.9847 KOps/s | 3.0108 KOps/s | |
test_memmaptd_index_op | 0.9146ms | 0.6499ms | 1.5386 KOps/s | 1.5548 KOps/s | |
test_serialize_model | 0.1275s | 0.1223s | 8.1759 Ops/s | 7.1484 Ops/s | |
test_serialize_model_pickle | 0.4453s | 0.3901s | 2.5633 Ops/s | 2.4871 Ops/s | |
test_serialize_weights | 0.1967s | 0.1341s | 7.4597 Ops/s | 8.0480 Ops/s | |
test_serialize_weights_returnearly | 0.1862s | 0.1699s | 5.8855 Ops/s | 5.6251 Ops/s | |
test_serialize_weights_pickle | 0.4779s | 0.4128s | 2.4225 Ops/s | 2.3751 Ops/s | |
test_serialize_weights_filesystem | 0.1475s | 0.1435s | 6.9699 Ops/s | 7.0671 Ops/s | |
test_serialize_model_filesystem | 0.1537s | 0.1507s | 6.6370 Ops/s | 6.5676 Ops/s | |
test_reshape_pytree | 95.1870μs | 25.4823μs | 39.2429 KOps/s | 38.5837 KOps/s | |
test_reshape_td | 0.1244ms | 34.3210μs | 29.1366 KOps/s | 28.7214 KOps/s | |
test_view_pytree | 76.5020μs | 25.6813μs | 38.9388 KOps/s | 38.3590 KOps/s | |
test_view_td | 91.1700μs | 39.1584μs | 25.5373 KOps/s | 24.6672 KOps/s | |
test_unbind_pytree | 0.1011ms | 29.6094μs | 33.7730 KOps/s | 33.8963 KOps/s | |
test_unbind_td | 0.3591ms | 41.2604μs | 24.2363 KOps/s | 24.1274 KOps/s | |
test_split_pytree | 80.4100μs | 29.7283μs | 33.6380 KOps/s | 33.8553 KOps/s | |
test_split_td | 0.5098ms | 42.1016μs | 23.7521 KOps/s | 24.1775 KOps/s | |
test_add_pytree | 78.5760μs | 35.3420μs | 28.2950 KOps/s | 28.4040 KOps/s | |
test_add_td | 0.1335ms | 60.2769μs | 16.5901 KOps/s | 16.7226 KOps/s | |
test_distributed | 0.2660ms | 0.1295ms | 7.7208 KOps/s | 7.4305 KOps/s | |
test_tdmodule | 85.8800μs | 17.8089μs | 56.1516 KOps/s | 57.2739 KOps/s | |
test_tdmodule_dispatch | 69.1790μs | 38.2876μs | 26.1181 KOps/s | 27.8117 KOps/s | |
test_tdseq | 49.0720μs | 19.9053μs | 50.2378 KOps/s | 51.2352 KOps/s | |
test_tdseq_dispatch | 71.5230μs | 42.3573μs | 23.6087 KOps/s | 24.6007 KOps/s | |
test_instantiation_functorch | 2.0638ms | 1.3264ms | 753.9160 Ops/s | 737.8022 Ops/s | |
test_instantiation_td | 2.4848ms | 1.0271ms | 973.6399 Ops/s | 887.9638 Ops/s | |
test_exec_functorch | 0.3898ms | 0.1718ms | 5.8203 KOps/s | 6.1583 KOps/s | |
test_exec_functional_call | 0.2894ms | 0.1479ms | 6.7600 KOps/s | 6.4803 KOps/s | |
test_exec_td | 0.2791ms | 0.1531ms | 6.5320 KOps/s | 6.7685 KOps/s | |
test_exec_td_decorator | 0.6850ms | 0.2356ms | 4.2454 KOps/s | 4.2881 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7499ms | 0.5006ms | 1.9977 KOps/s | 2.0141 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7471ms | 0.4962ms | 2.0155 KOps/s | 2.0369 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6883ms | 0.3987ms | 2.5082 KOps/s | 2.4655 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6996ms | 0.4007ms | 2.4956 KOps/s | 2.4697 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2531ms | 0.5851ms | 1.7092 KOps/s | 1.7126 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7661ms | 0.5817ms | 1.7191 KOps/s | 1.7084 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7909ms | 0.4775ms | 2.0941 KOps/s | 2.0883 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7379ms | 0.4742ms | 2.1090 KOps/s | 2.0743 KOps/s | |
test_to_module_speed[True] | 1.9603ms | 1.8133ms | 551.4772 Ops/s | 546.3937 Ops/s | |
test_to_module_speed[False] | 2.3258ms | 1.7822ms | 561.1191 Ops/s | 550.6449 Ops/s | |
test_tc_init | 0.1134ms | 45.7457μs | 21.8600 KOps/s | 25.8977 KOps/s | |
test_tc_init_nested | 0.1699ms | 91.1193μs | 10.9746 KOps/s | 12.8109 KOps/s | |
test_tc_first_layer_tensor | 58.7290μs | 9.2043μs | 108.6445 KOps/s | 120.8621 KOps/s | |
test_tc_first_layer_nontensor | 32.8110μs | 9.1637μs | 109.1260 KOps/s | 120.4470 KOps/s | |
test_tc_second_layer_tensor | 41.1960μs | 2.8368μs | 352.5154 KOps/s | 391.6604 KOps/s | |
test_tc_second_layer_nontensor | 51.9470μs | 10.2739μs | 97.3342 KOps/s | 106.2061 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 78.4920μs | 12.5118μs | 79.9245 KOps/s | 75.7568 KOps/s | |
test_plain_set_stack_nested | 0.1035ms | 12.6947μs | 78.7730 KOps/s | 76.0204 KOps/s | |
test_plain_set_nested_inplace | 42.1010μs | 13.6621μs | 73.1953 KOps/s | 70.3745 KOps/s | |
test_plain_set_stack_nested_inplace | 31.1900μs | 13.6561μs | 73.2273 KOps/s | 70.5577 KOps/s | |
test_items | 21.7810μs | 4.7603μs | 210.0729 KOps/s | 209.3041 KOps/s | |
test_items_nested | 0.5054ms | 0.4024ms | 2.4848 KOps/s | 2.5246 KOps/s | |
test_items_nested_locked | 0.5935ms | 0.4034ms | 2.4787 KOps/s | 2.4946 KOps/s | |
test_items_nested_leaf | 0.2345ms | 85.8378μs | 11.6499 KOps/s | 11.5173 KOps/s | |
test_items_stack_nested | 0.4519ms | 0.4013ms | 2.4917 KOps/s | 2.5135 KOps/s | |
test_items_stack_nested_leaf | 0.2714ms | 86.2013μs | 11.6008 KOps/s | 11.4087 KOps/s | |
test_items_stack_nested_locked | 0.4715ms | 0.4030ms | 2.4813 KOps/s | 2.4902 KOps/s | |
test_keys | 17.4410μs | 4.3696μs | 228.8539 KOps/s | 227.8169 KOps/s | |
test_keys_nested | 0.1047ms | 67.3765μs | 14.8420 KOps/s | 14.4676 KOps/s | |
test_keys_nested_locked | 0.7625ms | 73.1832μs | 13.6643 KOps/s | 13.4555 KOps/s | |
test_keys_nested_leaf | 87.1720μs | 57.5700μs | 17.3702 KOps/s | 17.3422 KOps/s | |
test_keys_stack_nested | 0.1009ms | 67.8143μs | 14.7462 KOps/s | 14.8907 KOps/s | |
test_keys_stack_nested_leaf | 95.5110μs | 56.5338μs | 17.6885 KOps/s | 17.1243 KOps/s | |
test_keys_stack_nested_locked | 0.1269ms | 71.6087μs | 13.9648 KOps/s | 13.5747 KOps/s | |
test_values | 7.4733μs | 1.7767μs | 562.8457 KOps/s | 560.1742 KOps/s | |
test_values_nested | 54.3910μs | 33.8827μs | 29.5136 KOps/s | 28.7828 KOps/s | |
test_values_nested_locked | 70.5320μs | 35.7615μs | 27.9630 KOps/s | 27.1477 KOps/s | |
test_values_nested_leaf | 0.1370ms | 30.0789μs | 33.2459 KOps/s | 32.1363 KOps/s | |
test_values_stack_nested | 0.1301ms | 34.0329μs | 29.3833 KOps/s | 28.1075 KOps/s | |
test_values_stack_nested_leaf | 51.9510μs | 30.4465μs | 32.8445 KOps/s | 31.6257 KOps/s | |
test_values_stack_nested_locked | 0.1626ms | 36.0327μs | 27.7526 KOps/s | 26.8645 KOps/s | |
test_membership | 1.8350μs | 0.5373μs | 1.8611 MOps/s | 1.8816 MOps/s | |
test_membership_nested | 17.0200μs | 2.0761μs | 481.6732 KOps/s | 484.0788 KOps/s | |
test_membership_nested_leaf | 9.5950μs | 2.0149μs | 496.3148 KOps/s | 498.0365 KOps/s | |
test_membership_stacked_nested | 17.9500μs | 2.0907μs | 478.2996 KOps/s | 477.3000 KOps/s | |
test_membership_stacked_nested_leaf | 16.0300μs | 2.0880μs | 478.9353 KOps/s | 483.8790 KOps/s | |
test_membership_nested_last | 17.1000μs | 3.0155μs | 331.6151 KOps/s | 334.0297 KOps/s | |
test_membership_nested_leaf_last | 0.2009ms | 2.9681μs | 336.9135 KOps/s | 331.8967 KOps/s | |
test_membership_stacked_nested_last | 72.0410μs | 3.4183μs | 292.5401 KOps/s | 291.0557 KOps/s | |
test_membership_stacked_nested_leaf_last | 98.9220μs | 3.4236μs | 292.0919 KOps/s | 292.5499 KOps/s | |
test_nested_getleaf | 33.2410μs | 8.0239μs | 124.6283 KOps/s | 123.9154 KOps/s | |
test_nested_get | 19.5210μs | 7.5773μs | 131.9738 KOps/s | 131.9285 KOps/s | |
test_stacked_getleaf | 33.8000μs | 8.0396μs | 124.3845 KOps/s | 123.8000 KOps/s | |
test_stacked_get | 22.1310μs | 7.5415μs | 132.5992 KOps/s | 132.5021 KOps/s | |
test_nested_getitemleaf | 22.8100μs | 8.1945μs | 122.0336 KOps/s | 122.1271 KOps/s | |
test_nested_getitem | 21.6100μs | 7.6999μs | 129.8716 KOps/s | 130.0097 KOps/s | |
test_stacked_getitemleaf | 32.6010μs | 8.2287μs | 121.5251 KOps/s | 122.3144 KOps/s | |
test_stacked_getitem | 23.0610μs | 7.6868μs | 130.0930 KOps/s | 130.3581 KOps/s | |
test_lock_nested | 9.6803ms | 0.4293ms | 2.3294 KOps/s | 2.3519 KOps/s | |
test_lock_stack_nested | 0.4621ms | 0.3890ms | 2.5707 KOps/s | 2.5536 KOps/s | |
test_unlock_nested | 0.8300ms | 0.3428ms | 2.9173 KOps/s | 2.8960 KOps/s | |
test_unlock_stack_nested | 0.3727ms | 0.3105ms | 3.2208 KOps/s | 3.2075 KOps/s | |
test_flatten_speed | 0.3933ms | 0.1057ms | 9.4618 KOps/s | 9.4876 KOps/s | |
test_unflatten_speed | 0.5006ms | 0.2921ms | 3.4229 KOps/s | 3.4340 KOps/s | |
test_common_ops | 0.9497ms | 0.5658ms | 1.7674 KOps/s | 1.7029 KOps/s | |
test_creation | 35.6610μs | 1.9506μs | 512.6646 KOps/s | 538.7516 KOps/s | |
test_creation_empty | 29.2600μs | 8.7541μs | 114.2317 KOps/s | 102.0831 KOps/s | |
test_creation_nested_1 | 30.9100μs | 10.6353μs | 94.0269 KOps/s | 84.8531 KOps/s | |
test_creation_nested_2 | 28.4600μs | 13.0172μs | 76.8217 KOps/s | 70.9614 KOps/s | |
test_clone | 91.4620μs | 11.0417μs | 90.5658 KOps/s | 88.6636 KOps/s | |
test_getitem[int] | 25.5300μs | 10.2578μs | 97.4864 KOps/s | 97.3175 KOps/s | |
test_getitem[slice_int] | 0.1213ms | 20.2695μs | 49.3351 KOps/s | 49.4720 KOps/s | |
test_getitem[range] | 0.2435ms | 37.8344μs | 26.4310 KOps/s | 26.5985 KOps/s | |
test_getitem[tuple] | 36.4500μs | 17.7440μs | 56.3571 KOps/s | 57.0150 KOps/s | |
test_getitem[list] | 0.2655ms | 32.3529μs | 30.9091 KOps/s | 30.7658 KOps/s | |
test_setitem_dim[int] | 42.1310μs | 23.6833μs | 42.2239 KOps/s | 38.2694 KOps/s | |
test_setitem_dim[slice_int] | 73.6410μs | 45.1344μs | 22.1560 KOps/s | 20.8670 KOps/s | |
test_setitem_dim[range] | 0.1061ms | 62.0848μs | 16.1070 KOps/s | 15.5197 KOps/s | |
test_setitem_dim[tuple] | 0.1467ms | 39.2671μs | 25.4666 KOps/s | 24.5486 KOps/s | |
test_setitem | 90.1010μs | 15.7266μs | 63.5866 KOps/s | 61.2210 KOps/s | |
test_set | 99.3320μs | 15.1360μs | 66.0676 KOps/s | 63.8574 KOps/s | |
test_set_shared | 3.0625ms | 0.1014ms | 9.8581 KOps/s | 10.2343 KOps/s | |
test_update | 98.2620μs | 17.4419μs | 57.3331 KOps/s | 52.0450 KOps/s | |
test_update_nested | 0.1090ms | 23.1195μs | 43.2535 KOps/s | 41.7386 KOps/s | |
test_update__nested | 97.7520μs | 20.4971μs | 48.7874 KOps/s | 46.7755 KOps/s | |
test_set_nested | 0.1086ms | 15.9571μs | 62.6682 KOps/s | 59.8245 KOps/s | |
test_set_nested_new | 99.1820μs | 18.8219μs | 53.1297 KOps/s | 51.7153 KOps/s | |
test_select | 0.1159ms | 31.4979μs | 31.7481 KOps/s | 30.1584 KOps/s | |
test_select_nested | 92.2120μs | 52.7568μs | 18.9549 KOps/s | 19.0010 KOps/s | |
test_exclude_nested | 0.1429ms | 72.4181μs | 13.8087 KOps/s | 13.7605 KOps/s | |
test_empty[True] | 0.4063ms | 0.2985ms | 3.3505 KOps/s | 3.3351 KOps/s | |
test_empty[False] | 16.9113μs | 0.9355μs | 1.0690 MOps/s | 1.0822 MOps/s | |
test_to | 89.1230μs | 59.1288μs | 16.9122 KOps/s | 17.0539 KOps/s | |
test_to_nonblocking | 0.1840ms | 36.8469μs | 27.1393 KOps/s | 26.4599 KOps/s | |
test_unbind_speed | 0.3283ms | 0.2638ms | 3.7906 KOps/s | 3.7153 KOps/s | |
test_unbind_speed_stack0 | 0.3610ms | 0.2634ms | 3.7966 KOps/s | 3.7670 KOps/s | |
test_unbind_speed_stack1 | 94.3900ms | 0.8021ms | 1.2467 KOps/s | 1.2368 KOps/s | |
test_split | 93.1856ms | 1.5563ms | 642.5634 Ops/s | 627.1538 Ops/s | |
test_chunk | 1.5399ms | 1.4148ms | 706.8164 Ops/s | 691.2008 Ops/s | |
test_creation[device0] | 0.1896ms | 55.4836μs | 18.0233 KOps/s | 17.2059 KOps/s | |
test_creation_from_tensor | 0.1948ms | 55.2455μs | 18.1010 KOps/s | 18.2142 KOps/s | |
test_add_one[memmap_tensor0] | 97.6820μs | 6.8050μs | 146.9501 KOps/s | 145.9794 KOps/s | |
test_contiguous[memmap_tensor0] | 11.7800μs | 0.6052μs | 1.6524 MOps/s | 1.7039 MOps/s | |
test_stack[memmap_tensor0] | 32.2910μs | 4.3144μs | 231.7793 KOps/s | 233.3979 KOps/s | |
test_memmaptd_index | 1.0943ms | 0.2548ms | 3.9250 KOps/s | 3.7821 KOps/s | |
test_memmaptd_index_astensor | 0.6243ms | 0.3188ms | 3.1366 KOps/s | 3.0705 KOps/s | |
test_memmaptd_index_op | 94.1474ms | 0.6546ms | 1.5277 KOps/s | 1.6012 KOps/s | |
test_serialize_model | 94.2990ms | 90.2210ms | 11.0839 Ops/s | 10.3109 Ops/s | |
test_serialize_model_pickle | 1.3480s | 1.2351s | 0.8096 Ops/s | 0.7186 Ops/s | |
test_serialize_weights | 0.1864s | 99.6104ms | 10.0391 Ops/s | 10.7579 Ops/s | |
test_serialize_weights_returnearly | 0.2958s | 79.5005ms | 12.5785 Ops/s | 13.8962 Ops/s | |
test_serialize_weights_pickle | 1.3523s | 1.2487s | 0.8008 Ops/s | 0.8010 Ops/s | |
test_reshape_pytree | 0.2389ms | 25.3708μs | 39.4154 KOps/s | 39.2822 KOps/s | |
test_reshape_td | 99.1620μs | 30.7905μs | 32.4775 KOps/s | 32.8166 KOps/s | |
test_view_pytree | 0.1437ms | 25.0124μs | 39.9802 KOps/s | 39.3006 KOps/s | |
test_view_td | 0.2455ms | 37.9981μs | 26.3171 KOps/s | 27.3142 KOps/s | |
test_unbind_pytree | 0.1770ms | 32.2356μs | 31.0216 KOps/s | 32.3281 KOps/s | |
test_unbind_td | 0.6119ms | 41.9748μs | 23.8238 KOps/s | 25.1636 KOps/s | |
test_split_pytree | 63.5910μs | 35.4142μs | 28.2373 KOps/s | 29.3353 KOps/s | |
test_split_td | 0.2516ms | 39.8755μs | 25.0781 KOps/s | 26.6280 KOps/s | |
test_add_pytree | 0.1660ms | 37.1620μs | 26.9092 KOps/s | 26.4577 KOps/s | |
test_add_td | 0.2472ms | 46.9774μs | 21.2868 KOps/s | 20.1175 KOps/s | |
test_distributed | 0.2681ms | 74.0026μs | 13.5130 KOps/s | 14.6156 KOps/s | |
test_tdmodule | 59.6210μs | 14.2998μs | 69.9310 KOps/s | 66.7691 KOps/s | |
test_tdmodule_dispatch | 45.5110μs | 29.6399μs | 33.7383 KOps/s | 35.3126 KOps/s | |
test_tdseq | 0.1008ms | 15.3914μs | 64.9715 KOps/s | 65.2193 KOps/s | |
test_tdseq_dispatch | 56.8810μs | 32.2853μs | 30.9739 KOps/s | 31.6974 KOps/s | |
test_instantiation_functorch | 1.6262ms | 1.3770ms | 726.2266 Ops/s | 721.3382 Ops/s | |
test_instantiation_td | 1.4452ms | 0.9739ms | 1.0268 KOps/s | 914.9997 Ops/s | |
test_exec_functorch | 0.2546ms | 0.1444ms | 6.9269 KOps/s | 6.7972 KOps/s | |
test_exec_functional_call | 0.3243ms | 0.1291ms | 7.7443 KOps/s | 7.5294 KOps/s | |
test_exec_td | 0.1567ms | 0.1258ms | 7.9484 KOps/s | 7.5727 KOps/s | |
test_exec_td_decorator | 0.6016ms | 0.1965ms | 5.0883 KOps/s | 4.9476 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7708ms | 0.5672ms | 1.7631 KOps/s | 1.7371 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8013ms | 0.5661ms | 1.7664 KOps/s | 1.7653 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7103ms | 0.5149ms | 1.9422 KOps/s | 1.9887 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7012ms | 0.4990ms | 2.0039 KOps/s | 1.9940 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1047ms | 0.6408ms | 1.5605 KOps/s | 1.5450 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8552ms | 0.6410ms | 1.5602 KOps/s | 1.5576 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7536ms | 0.5583ms | 1.7912 KOps/s | 1.6886 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7910ms | 0.5615ms | 1.7808 KOps/s | 1.7233 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.0670ms | 7.5948ms | 131.6694 Ops/s | 129.3257 Ops/s | |
test_vmap_transformer_speed[True-False] | 7.7453ms | 7.4968ms | 133.3897 Ops/s | 129.5591 Ops/s | |
test_vmap_transformer_speed[False-True] | 7.6446ms | 7.4405ms | 134.4004 Ops/s | 130.3973 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.0195ms | 7.5994ms | 131.5900 Ops/s | 130.5478 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.9022ms | 18.5444ms | 53.9245 Ops/s | 52.3234 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.0490ms | 18.6123ms | 53.7280 Ops/s | 52.4073 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.6452ms | 18.3531ms | 54.4867 Ops/s | 52.9137 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.0367ms | 18.3893ms | 54.3796 Ops/s | 52.9993 Ops/s | |
test_to_module_speed[True] | 1.6022ms | 1.4842ms | 673.7843 Ops/s | 655.7122 Ops/s | |
test_to_module_speed[False] | 1.5901ms | 1.4612ms | 684.3782 Ops/s | 662.0518 Ops/s | |
test_tc_init | 54.0410μs | 35.9928μs | 27.7833 KOps/s | 28.1650 KOps/s | |
test_tc_init_nested | 0.1028ms | 71.4880μs | 13.9884 KOps/s | 14.2762 KOps/s | |
test_tc_first_layer_tensor | 18.8800μs | 3.9748μs | 251.5840 KOps/s | 277.3163 KOps/s | |
test_tc_first_layer_nontensor | 0.1664ms | 4.0099μs | 249.3857 KOps/s | 273.6621 KOps/s | |
test_tc_second_layer_tensor | 45.2413μs | 1.2895μs | 775.5228 KOps/s | 811.1156 KOps/s | |
test_tc_second_layer_nontensor | 21.0410μs | 4.5790μs | 218.3893 KOps/s | 240.6797 KOps/s |
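
The Change column is empty in both tables above. A minimal sketch of how it could be filled in by hand, assuming that "Ops" is the throughput measured on this PR and "Ops on Repo HEAD" is the baseline (the tables do not say so explicitly). The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the PR or of the benchmark tooling:

```python
# Hypothetical helpers for reading the tables above; not part of the PR.

# Throughput cells mix units (Ops/s, KOps/s, MOps/s), so normalise first.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '53.7952 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the 'Ops' cell relative to the 'Ops on Repo HEAD' cell.

    Assumes 'Ops' is the PR measurement and 'Ops on Repo HEAD' the baseline.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head * 100.0


# Values copied from the test_tc_init row of the first table.
change = relative_change("21.8600 KOps/s", "25.8977 KOps/s")
print(f"test_tc_init: {change:+.1f}%")  # prints "test_tc_init: -15.6%"
```

Single-run differences of a few percent in these tables are probably within measurement noise, so a figure computed this way is only a rough indicator.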
vmoens added a commit that referenced this pull request Jul 15, 2024
ghstack-source-id: ddc0fac60371b3514f4e2e912afabab3c3720bd7 Pull Request resolved: #882
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):