[Feature] Compile - nn compatibility #881
Merged
facebook-github-bot added the CLA Signed label Jul 12, 2024
This was referenced Jul 12, 2024
vmoens added a commit that referenced this pull request Jul 12, 2024
ghstack-source-id: 6fb01c93298d57ad12f4799488ac52691f722d5f
Pull Request resolved: #881
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 42.2990μs | 16.5714μs | 60.3448 KOps/s | 60.5313 KOps/s | |
test_plain_set_stack_nested | 38.4520μs | 16.9089μs | 59.1403 KOps/s | 59.3080 KOps/s | |
test_plain_set_nested_inplace | 65.1110μs | 18.5298μs | 53.9671 KOps/s | 54.0026 KOps/s | |
test_plain_set_stack_nested_inplace | 88.4720μs | 18.3459μs | 54.5080 KOps/s | 54.9210 KOps/s | |
test_items | 37.4000μs | 2.7396μs | 365.0127 KOps/s | 377.4656 KOps/s | |
test_items_nested | 0.5244ms | 0.3640ms | 2.7470 KOps/s | 2.5528 KOps/s | |
test_items_nested_locked | 1.1757ms | 0.3619ms | 2.7629 KOps/s | 2.5637 KOps/s | |
test_items_nested_leaf | 0.1457ms | 83.8413μs | 11.9273 KOps/s | 11.3952 KOps/s | |
test_items_stack_nested | 0.7713ms | 0.3701ms | 2.7018 KOps/s | 2.5589 KOps/s | |
test_items_stack_nested_leaf | 0.3561ms | 88.1216μs | 11.3480 KOps/s | 11.5009 KOps/s | |
test_items_stack_nested_locked | 0.5250ms | 0.3618ms | 2.7642 KOps/s | 2.5372 KOps/s | |
test_keys | 32.9210μs | 3.8507μs | 259.6918 KOps/s | 239.2727 KOps/s | |
test_keys_nested | 0.2844ms | 0.1436ms | 6.9636 KOps/s | 6.9455 KOps/s | |
test_keys_nested_locked | 0.7427ms | 0.1494ms | 6.6919 KOps/s | 6.6365 KOps/s | |
test_keys_nested_leaf | 0.2335ms | 0.1236ms | 8.0927 KOps/s | 8.1201 KOps/s | |
test_keys_stack_nested | 0.2233ms | 0.1436ms | 6.9653 KOps/s | 6.8876 KOps/s | |
test_keys_stack_nested_leaf | 0.4269ms | 0.1246ms | 8.0231 KOps/s | 8.1295 KOps/s | |
test_keys_stack_nested_locked | 0.1989ms | 0.1495ms | 6.6886 KOps/s | 6.6408 KOps/s | |
test_values | 9.7533μs | 1.1576μs | 863.8761 KOps/s | 833.4934 KOps/s | |
test_values_nested | 0.1005ms | 48.6817μs | 20.5416 KOps/s | 20.2637 KOps/s | |
test_values_nested_locked | 95.6280μs | 48.5398μs | 20.6017 KOps/s | 20.4752 KOps/s | |
test_values_nested_leaf | 0.2261ms | 43.5590μs | 22.9574 KOps/s | 22.3650 KOps/s | |
test_values_stack_nested | 0.2074ms | 48.4044μs | 20.6593 KOps/s | 20.3661 KOps/s | |
test_values_stack_nested_leaf | 94.8070μs | 42.8340μs | 23.3459 KOps/s | 22.6150 KOps/s | |
test_values_stack_nested_locked | 97.1210μs | 48.7471μs | 20.5140 KOps/s | 20.1752 KOps/s | |
test_membership | 2.5303μs | 0.7254μs | 1.3785 MOps/s | 1.1056 MOps/s | |
test_membership_nested | 29.4650μs | 2.6303μs | 380.1832 KOps/s | 372.5918 KOps/s | |
test_membership_nested_leaf | 33.8940μs | 2.7118μs | 368.7559 KOps/s | 369.0634 KOps/s | |
test_membership_stacked_nested | 21.2490μs | 2.6126μs | 382.7570 KOps/s | 370.1731 KOps/s | |
test_membership_stacked_nested_leaf | 21.2500μs | 2.6385μs | 379.0081 KOps/s | 367.6561 KOps/s | |
test_membership_nested_last | 36.1670μs | 3.9230μs | 254.9051 KOps/s | 254.1105 KOps/s | |
test_membership_nested_leaf_last | 53.1690μs | 3.8853μs | 257.3834 KOps/s | 254.0577 KOps/s | |
test_membership_stacked_nested_last | 30.7970μs | 3.8880μs | 257.2033 KOps/s | 254.7226 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.2390μs | 3.8827μs | 257.5542 KOps/s | 250.0929 KOps/s | |
test_nested_getleaf | 48.9620μs | 10.9124μs | 91.6386 KOps/s | 89.9109 KOps/s | |
test_nested_get | 51.6070μs | 10.2736μs | 97.3365 KOps/s | 95.0108 KOps/s | |
test_stacked_getleaf | 48.9410μs | 10.8490μs | 92.1745 KOps/s | 89.4645 KOps/s | |
test_stacked_get | 46.5270μs | 10.5087μs | 95.1594 KOps/s | 95.0807 KOps/s | |
test_nested_getitemleaf | 50.9350μs | 11.4212μs | 87.5564 KOps/s | 86.2396 KOps/s | |
test_nested_getitem | 66.8820μs | 10.3518μs | 96.6018 KOps/s | 94.1454 KOps/s | |
test_stacked_getitemleaf | 57.5780μs | 11.4285μs | 87.5008 KOps/s | 87.2832 KOps/s | |
test_stacked_getitem | 47.0180μs | 10.5699μs | 94.6085 KOps/s | 94.4909 KOps/s | |
test_lock_nested | 6.9568ms | 0.4440ms | 2.2523 KOps/s | 2.1836 KOps/s | |
test_lock_stack_nested | 0.6300ms | 0.4083ms | 2.4489 KOps/s | 2.3906 KOps/s | |
test_unlock_nested | 0.7852ms | 0.3558ms | 2.8106 KOps/s | 2.3101 KOps/s | |
test_unlock_stack_nested | 0.4912ms | 0.3220ms | 3.1060 KOps/s | 2.9815 KOps/s | |
test_flatten_speed | 0.5386ms | 0.1051ms | 9.5178 KOps/s | 9.2456 KOps/s | |
test_unflatten_speed | 0.9928ms | 0.4368ms | 2.2892 KOps/s | 2.2230 KOps/s | |
test_common_ops | 6.7217ms | 0.7375ms | 1.3560 KOps/s | 1.3516 KOps/s | |
test_creation | 79.0080μs | 2.2889μs | 436.8912 KOps/s | 426.4835 KOps/s | |
test_creation_empty | 40.9170μs | 10.0421μs | 99.5811 KOps/s | 106.6076 KOps/s | |
test_creation_nested_1 | 48.7910μs | 12.8013μs | 78.1168 KOps/s | 80.3587 KOps/s | |
test_creation_nested_2 | 47.9200μs | 16.7258μs | 59.7879 KOps/s | 61.7806 KOps/s | |
test_clone | 76.7440μs | 12.7451μs | 78.4615 KOps/s | 75.9648 KOps/s | |
test_getitem[int] | 41.5980μs | 11.8097μs | 84.6765 KOps/s | 85.1796 KOps/s | |
test_getitem[slice_int] | 66.3440μs | 23.7867μs | 42.0404 KOps/s | 41.5710 KOps/s | |
test_getitem[range] | 91.0400μs | 44.8589μs | 22.2921 KOps/s | 21.3222 KOps/s | |
test_getitem[tuple] | 58.1390μs | 19.5289μs | 51.2062 KOps/s | 51.4421 KOps/s | |
test_getitem[list] | 0.2229ms | 40.1500μs | 24.9066 KOps/s | 24.6465 KOps/s | |
test_setitem_dim[int] | 77.9050μs | 33.8269μs | 29.5623 KOps/s | 32.4531 KOps/s | |
test_setitem_dim[slice_int] | 0.1188ms | 60.5507μs | 16.5151 KOps/s | 17.0095 KOps/s | |
test_setitem_dim[range] | 0.1398ms | 81.0878μs | 12.3323 KOps/s | 12.9311 KOps/s | |
test_setitem_dim[tuple] | 90.7790μs | 48.5974μs | 20.5772 KOps/s | 21.1346 KOps/s | |
test_setitem | 69.0290μs | 18.7285μs | 53.3944 KOps/s | 52.8399 KOps/s | |
test_set | 86.1310μs | 18.4923μs | 54.0764 KOps/s | 54.3498 KOps/s | |
test_set_shared | 2.1703ms | 0.1664ms | 6.0096 KOps/s | 5.9381 KOps/s | |
test_update | 0.1492ms | 20.8648μs | 47.9276 KOps/s | 49.0355 KOps/s | |
test_update_nested | 0.1347ms | 28.9799μs | 34.5067 KOps/s | 34.3876 KOps/s | |
test_update__nested | 96.4910μs | 25.5225μs | 39.1810 KOps/s | 39.7739 KOps/s | |
test_set_nested | 71.1030μs | 20.1163μs | 49.7109 KOps/s | 49.7462 KOps/s | |
test_set_nested_new | 98.6040μs | 24.6271μs | 40.6057 KOps/s | 40.0569 KOps/s | |
test_select | 0.1560ms | 40.6712μs | 24.5875 KOps/s | 24.3617 KOps/s | |
test_select_nested | 0.1271ms | 60.5657μs | 16.5110 KOps/s | 16.3878 KOps/s | |
test_exclude_nested | 0.1873ms | 81.0471μs | 12.3385 KOps/s | 12.0832 KOps/s | |
test_empty[True] | 0.6374ms | 0.3459ms | 2.8911 KOps/s | 2.8427 KOps/s | |
test_empty[False] | 19.1910μs | 1.2664μs | 789.6235 KOps/s | 752.3049 KOps/s | |
test_unbind_speed | 0.4403ms | 0.2535ms | 3.9446 KOps/s | 3.7876 KOps/s | |
test_unbind_speed_stack0 | 0.5768ms | 0.2646ms | 3.7789 KOps/s | 3.8377 KOps/s | |
test_unbind_speed_stack1 | 79.7280ms | 0.7512ms | 1.3312 KOps/s | 1.3394 KOps/s | |
test_split | 83.0425ms | 1.6402ms | 609.6794 Ops/s | 609.6858 Ops/s | |
test_chunk | 80.1615ms | 1.6386ms | 610.2758 Ops/s | 609.9515 Ops/s | |
test_creation[device0] | 0.1967ms | 94.4395μs | 10.5888 KOps/s | 10.5210 KOps/s | |
test_creation_from_tensor | 3.9791ms | 99.1527μs | 10.0855 KOps/s | 10.1783 KOps/s | |
test_add_one[memmap_tensor0] | 0.2304ms | 5.2869μs | 189.1470 KOps/s | 183.6093 KOps/s | |
test_contiguous[memmap_tensor0] | 15.9200μs | 0.6297μs | 1.5882 MOps/s | 1.5526 MOps/s | |
test_stack[memmap_tensor0] | 41.2970μs | 3.4577μs | 289.2067 KOps/s | 264.5609 KOps/s | |
test_memmaptd_index | 1.0633ms | 0.2601ms | 3.8451 KOps/s | 3.8333 KOps/s | |
test_memmaptd_index_astensor | 0.8057ms | 0.3342ms | 2.9922 KOps/s | 3.0060 KOps/s | |
test_memmaptd_index_op | 0.9433ms | 0.5958ms | 1.6784 KOps/s | 1.7017 KOps/s | |
test_serialize_model | 0.1388s | 0.1240s | 8.0631 Ops/s | 7.4904 Ops/s | |
test_serialize_model_pickle | 0.4488s | 0.3947s | 2.5338 Ops/s | 2.5124 Ops/s | |
test_serialize_weights | 0.1298s | 0.1228s | 8.1427 Ops/s | 7.7653 Ops/s | |
test_serialize_weights_returnearly | 0.1805s | 0.1639s | 6.1030 Ops/s | 6.1632 Ops/s | |
test_serialize_weights_pickle | 0.4505s | 0.3957s | 2.5274 Ops/s | 2.3388 Ops/s | |
test_serialize_weights_filesystem | 0.1503s | 0.1438s | 6.9557 Ops/s | 6.4196 Ops/s | |
test_serialize_model_filesystem | 0.1590s | 0.1485s | 6.7327 Ops/s | 6.6238 Ops/s | |
test_reshape_pytree | 59.8510μs | 25.5246μs | 39.1779 KOps/s | 38.7104 KOps/s | |
test_reshape_td | 0.1080ms | 33.9512μs | 29.4540 KOps/s | 28.6374 KOps/s | |
test_view_pytree | 74.7390μs | 25.3967μs | 39.3751 KOps/s | 39.2349 KOps/s | |
test_view_td | 81.4430μs | 38.5290μs | 25.9545 KOps/s | 25.0444 KOps/s | |
test_unbind_pytree | 80.6610μs | 29.2488μs | 34.1894 KOps/s | 33.6483 KOps/s | |
test_unbind_td | 0.4086ms | 37.7449μs | 26.4937 KOps/s | 25.8399 KOps/s | |
test_split_pytree | 64.3400μs | 29.3070μs | 34.1216 KOps/s | 33.7134 KOps/s | |
test_split_td | 0.1209ms | 39.7172μs | 25.1780 KOps/s | 24.5433 KOps/s | |
test_add_pytree | 90.8200μs | 33.9928μs | 29.4180 KOps/s | 28.4323 KOps/s | |
test_add_td | 0.1569ms | 55.4627μs | 18.0301 KOps/s | 18.1182 KOps/s | |
test_distributed | 0.2706ms | 0.1296ms | 7.7170 KOps/s | 7.6171 KOps/s | |
test_tdmodule | 28.0220μs | 15.7324μs | 63.5632 KOps/s | 58.0843 KOps/s | |
test_tdmodule_dispatch | 54.0310μs | 32.9476μs | 30.3512 KOps/s | 28.4751 KOps/s | |
test_tdseq | 31.2590μs | 17.6600μs | 56.6250 KOps/s | 50.8431 KOps/s | |
test_tdseq_dispatch | 64.3600μs | 37.6192μs | 26.5821 KOps/s | 25.6210 KOps/s | |
test_instantiation_functorch | 2.0790ms | 1.3494ms | 741.0887 Ops/s | 745.8275 Ops/s | |
test_instantiation_td | 1.5099ms | 1.0199ms | 980.5261 Ops/s | 951.0144 Ops/s | |
test_exec_functorch | 0.2633ms | 0.1592ms | 6.2829 KOps/s | 6.1555 KOps/s | |
test_exec_functional_call | 0.3752ms | 0.1469ms | 6.8065 KOps/s | 6.5741 KOps/s | |
test_exec_td | 0.2852ms | 0.1455ms | 6.8751 KOps/s | 6.5419 KOps/s | |
test_exec_td_decorator | 0.2961ms | 0.2294ms | 4.3587 KOps/s | 4.4099 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.6505ms | 0.4794ms | 2.0860 KOps/s | 2.0469 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7866ms | 0.4771ms | 2.0959 KOps/s | 2.0644 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5079ms | 0.3929ms | 2.5451 KOps/s | 2.4979 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6842ms | 0.3942ms | 2.5369 KOps/s | 2.4760 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1630ms | 0.5828ms | 1.7158 KOps/s | 1.7422 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8084ms | 0.5708ms | 1.7518 KOps/s | 1.7228 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7722ms | 0.4729ms | 2.1145 KOps/s | 2.0846 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7375ms | 0.4715ms | 2.1208 KOps/s | 2.0968 KOps/s | |
test_to_module_speed[True] | 2.3177ms | 1.7796ms | 561.9163 Ops/s | 565.2773 Ops/s | |
test_to_module_speed[False] | 85.0951ms | 1.9254ms | 519.3605 Ops/s | 562.8469 Ops/s | |
test_tc_init | 0.1557ms | 53.9141μs | 18.5480 KOps/s | 20.3480 KOps/s | |
test_tc_init_nested | 0.3461ms | 0.1093ms | 9.1475 KOps/s | 10.0064 KOps/s | |
test_tc_first_layer_tensor | 31.0880μs | 8.3624μs | 119.5829 KOps/s | 117.6354 KOps/s | |
test_tc_first_layer_nontensor | 55.9150μs | 8.2286μs | 121.5275 KOps/s | 117.6042 KOps/s | |
test_tc_second_layer_tensor | 22.9740μs | 2.5078μs | 398.7492 KOps/s | 373.4646 KOps/s | |
test_tc_second_layer_nontensor | 52.0870μs | 9.5345μs | 104.8822 KOps/s | 104.1380 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 30.2810μs | 12.9955μs | 76.9497 KOps/s | 77.0928 KOps/s | |
test_plain_set_stack_nested | 0.1949ms | 13.1047μs | 76.3085 KOps/s | 76.4912 KOps/s | |
test_plain_set_nested_inplace | 39.2110μs | 14.1256μs | 70.7936 KOps/s | 71.6351 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2069ms | 14.0333μs | 71.2593 KOps/s | 71.1847 KOps/s | |
test_items | 0.1867ms | 4.7675μs | 209.7552 KOps/s | 209.8342 KOps/s | |
test_items_nested | 0.4441ms | 0.3906ms | 2.5603 KOps/s | 2.5437 KOps/s | |
test_items_nested_locked | 0.5826ms | 0.3992ms | 2.5049 KOps/s | 2.5376 KOps/s | |
test_items_nested_leaf | 0.2665ms | 90.0719μs | 11.1022 KOps/s | 11.4730 KOps/s | |
test_items_stack_nested | 0.5805ms | 0.3944ms | 2.5355 KOps/s | 2.5012 KOps/s | |
test_items_stack_nested_leaf | 0.2639ms | 84.7326μs | 11.8018 KOps/s | 11.4350 KOps/s | |
test_items_stack_nested_locked | 0.5809ms | 0.3923ms | 2.5488 KOps/s | 2.5147 KOps/s | |
test_keys | 0.1856ms | 4.3738μs | 228.6320 KOps/s | 229.0570 KOps/s | |
test_keys_nested | 96.4410μs | 68.0563μs | 14.6937 KOps/s | 14.6960 KOps/s | |
test_keys_nested_locked | 0.6820ms | 74.5692μs | 13.4104 KOps/s | 13.2059 KOps/s | |
test_keys_nested_leaf | 0.2390ms | 57.9546μs | 17.2549 KOps/s | 16.8269 KOps/s | |
test_keys_stack_nested | 0.2491ms | 67.6459μs | 14.7829 KOps/s | 14.9517 KOps/s | |
test_keys_stack_nested_leaf | 0.2406ms | 59.2998μs | 16.8635 KOps/s | 17.1533 KOps/s | |
test_keys_stack_nested_locked | 0.2544ms | 73.5154μs | 13.6026 KOps/s | 13.4569 KOps/s | |
test_values | 61.6510μs | 1.7648μs | 566.6367 KOps/s | 497.3293 KOps/s | |
test_values_nested | 0.2154ms | 34.6594μs | 28.8522 KOps/s | 29.0397 KOps/s | |
test_values_nested_locked | 0.2155ms | 36.7248μs | 27.2296 KOps/s | 27.3572 KOps/s | |
test_values_nested_leaf | 53.3910μs | 30.6999μs | 32.5734 KOps/s | 32.4935 KOps/s | |
test_values_stack_nested | 0.2199ms | 35.3505μs | 28.2881 KOps/s | 28.2801 KOps/s | |
test_values_stack_nested_leaf | 0.2109ms | 31.5341μs | 31.7117 KOps/s | 31.8114 KOps/s | |
test_values_stack_nested_locked | 0.2254ms | 37.3723μs | 26.7578 KOps/s | 26.6371 KOps/s | |
test_membership | 9.5047μs | 0.5420μs | 1.8449 MOps/s | 1.8585 MOps/s | |
test_membership_nested | 22.9800μs | 2.0610μs | 485.2097 KOps/s | 480.8359 KOps/s | |
test_membership_nested_leaf | 95.6765μs | 2.0338μs | 491.6931 KOps/s | 488.6423 KOps/s | |
test_membership_stacked_nested | 0.1901ms | 2.0947μs | 477.4066 KOps/s | 474.9091 KOps/s | |
test_membership_stacked_nested_leaf | 17.0290μs | 2.0927μs | 477.8410 KOps/s | 486.3279 KOps/s | |
test_membership_nested_last | 18.4700μs | 2.9777μs | 335.8281 KOps/s | 330.0847 KOps/s | |
test_membership_nested_leaf_last | 0.1906ms | 2.9850μs | 335.0115 KOps/s | 330.4788 KOps/s | |
test_membership_stacked_nested_last | 24.1300μs | 9.1703μs | 109.0480 KOps/s | 160.6687 KOps/s | |
test_membership_stacked_nested_leaf_last | 0.1981ms | 9.1624μs | 109.1423 KOps/s | 160.9778 KOps/s | |
test_nested_getleaf | 0.1888ms | 7.9489μs | 125.8029 KOps/s | 124.6604 KOps/s | |
test_nested_get | 0.1879ms | 7.5325μs | 132.7589 KOps/s | 132.1862 KOps/s | |
test_stacked_getleaf | 23.9010μs | 7.9854μs | 125.2280 KOps/s | 123.7568 KOps/s | |
test_stacked_get | 0.1923ms | 7.5052μs | 133.2410 KOps/s | 132.2353 KOps/s | |
test_nested_getitemleaf | 50.4910μs | 8.1602μs | 122.5455 KOps/s | 122.4741 KOps/s | |
test_nested_getitem | 0.1912ms | 7.6724μs | 130.3381 KOps/s | 129.5516 KOps/s | |
test_stacked_getitemleaf | 0.1939ms | 8.1582μs | 122.5763 KOps/s | 122.0586 KOps/s | |
test_stacked_getitem | 24.0510μs | 7.6998μs | 129.8731 KOps/s | 129.5338 KOps/s | |
test_lock_nested | 9.3627ms | 0.4205ms | 2.3783 KOps/s | 2.4361 KOps/s | |
test_lock_stack_nested | 0.4089ms | 0.3721ms | 2.6872 KOps/s | 2.6522 KOps/s | |
test_unlock_nested | 0.7355ms | 0.3289ms | 3.0408 KOps/s | 3.0365 KOps/s | |
test_unlock_stack_nested | 0.3271ms | 0.2913ms | 3.4332 KOps/s | 3.4044 KOps/s | |
test_flatten_speed | 0.4028ms | 0.1068ms | 9.3623 KOps/s | 9.2561 KOps/s | |
test_unflatten_speed | 0.4797ms | 0.2966ms | 3.3713 KOps/s | 3.3839 KOps/s | |
test_common_ops | 1.0028ms | 0.5931ms | 1.6862 KOps/s | 1.5145 KOps/s | |
test_creation | 38.4100μs | 1.8813μs | 531.5476 KOps/s | 539.2590 KOps/s | |
test_creation_empty | 25.2300μs | 9.6727μs | 103.3841 KOps/s | 105.2331 KOps/s | |
test_creation_nested_1 | 0.1998ms | 11.5864μs | 86.3077 KOps/s | 88.4715 KOps/s | |
test_creation_nested_2 | 30.1900μs | 14.0464μs | 71.1924 KOps/s | 72.2935 KOps/s | |
test_clone | 0.1966ms | 11.0120μs | 90.8097 KOps/s | 92.1964 KOps/s | |
test_getitem[int] | 24.1700μs | 10.0299μs | 99.7015 KOps/s | 100.3568 KOps/s | |
test_getitem[slice_int] | 1.7600ms | 19.7815μs | 50.5522 KOps/s | 52.0367 KOps/s | |
test_getitem[range] | 0.1563ms | 36.3065μs | 27.5433 KOps/s | 28.0905 KOps/s | |
test_getitem[tuple] | 0.2082ms | 17.2946μs | 57.8215 KOps/s | 57.6930 KOps/s | |
test_getitem[list] | 0.1658ms | 31.8808μs | 31.3669 KOps/s | 32.3588 KOps/s | |
test_setitem_dim[int] | 45.2310μs | 26.1867μs | 38.1873 KOps/s | 40.3954 KOps/s | |
test_setitem_dim[slice_int] | 63.4010μs | 46.1712μs | 21.6585 KOps/s | 21.5769 KOps/s | |
test_setitem_dim[range] | 81.9910μs | 62.8779μs | 15.9038 KOps/s | 16.0628 KOps/s | |
test_setitem_dim[tuple] | 57.4810μs | 40.5692μs | 24.6493 KOps/s | 24.9634 KOps/s | |
test_setitem | 70.9710μs | 16.2490μs | 61.5421 KOps/s | 59.9863 KOps/s | |
test_set | 63.2520μs | 15.5648μs | 64.2474 KOps/s | 62.3666 KOps/s | |
test_set_shared | 2.7623ms | 95.5984μs | 10.4604 KOps/s | 10.4026 KOps/s | |
test_update | 0.2071ms | 19.1226μs | 52.2941 KOps/s | 50.6924 KOps/s | |
test_update_nested | 62.8010μs | 24.5426μs | 40.7455 KOps/s | 38.6573 KOps/s | |
test_update__nested | 0.2070ms | 21.3920μs | 46.7464 KOps/s | 44.7752 KOps/s | |
test_set_nested | 63.0910μs | 16.8324μs | 59.4093 KOps/s | 56.2662 KOps/s | |
test_set_nested_new | 85.2400μs | 19.7182μs | 50.7146 KOps/s | 48.8617 KOps/s | |
test_select | 1.0005ms | 33.4006μs | 29.9396 KOps/s | 29.9149 KOps/s | |
test_select_nested | 0.2337ms | 52.7633μs | 18.9526 KOps/s | 18.7007 KOps/s | |
test_exclude_nested | 90.1120μs | 72.4295μs | 13.8065 KOps/s | 13.9001 KOps/s | |
test_empty[True] | 0.4842ms | 0.3019ms | 3.3121 KOps/s | 3.3600 KOps/s | |
test_empty[False] | 18.8634μs | 0.9149μs | 1.0930 MOps/s | 1.0898 MOps/s | |
test_to | 88.5420μs | 58.0461μs | 17.2277 KOps/s | 17.3089 KOps/s | |
test_to_nonblocking | 0.2356ms | 35.0986μs | 28.4912 KOps/s | 26.9676 KOps/s | |
test_unbind_speed | 0.3732ms | 0.2492ms | 4.0128 KOps/s | 4.0913 KOps/s | |
test_unbind_speed_stack0 | 0.4355ms | 0.2455ms | 4.0738 KOps/s | 4.0128 KOps/s | |
test_unbind_speed_stack1 | 92.0416ms | 0.7715ms | 1.2962 KOps/s | 1.4169 KOps/s | |
test_split | 90.2440ms | 1.5583ms | 641.7168 Ops/s | 646.0443 Ops/s | |
test_chunk | 1.6230ms | 1.4324ms | 698.1349 Ops/s | 702.4465 Ops/s | |
test_creation[device0] | 0.1275ms | 54.4476μs | 18.3663 KOps/s | 18.6270 KOps/s | |
test_creation_from_tensor | 0.2695ms | 51.7818μs | 19.3118 KOps/s | 19.5748 KOps/s | |
test_add_one[memmap_tensor0] | 0.1396ms | 6.9901μs | 143.0586 KOps/s | 152.8512 KOps/s | |
test_contiguous[memmap_tensor0] | 27.1010μs | 0.5672μs | 1.7629 MOps/s | 1.7385 MOps/s | |
test_stack[memmap_tensor0] | 18.9990μs | 4.4877μs | 222.8335 KOps/s | 222.8703 KOps/s | |
test_memmaptd_index | 1.0852ms | 0.2542ms | 3.9337 KOps/s | 3.4016 KOps/s | |
test_memmaptd_index_astensor | 0.6178ms | 0.3155ms | 3.1700 KOps/s | 3.1907 KOps/s | |
test_memmaptd_index_op | 0.8957ms | 0.6279ms | 1.5925 KOps/s | 1.6622 KOps/s | |
test_serialize_model | 93.4288ms | 90.4592ms | 11.0547 Ops/s | 10.4769 Ops/s | |
test_serialize_model_pickle | 1.3472s | 1.2348s | 0.8098 Ops/s | 0.8063 Ops/s | |
test_serialize_weights | 91.7796ms | 87.8048ms | 11.3889 Ops/s | 9.5971 Ops/s | |
test_serialize_weights_returnearly | 0.1952s | 71.5253ms | 13.9811 Ops/s | 13.1204 Ops/s | |
test_serialize_weights_pickle | 1.4165s | 1.2562s | 0.7961 Ops/s | 0.8007 Ops/s | |
test_reshape_pytree | 82.9420μs | 24.7197μs | 40.4536 KOps/s | 39.1549 KOps/s | |
test_reshape_td | 57.3510μs | 29.6617μs | 33.7135 KOps/s | 34.2681 KOps/s | |
test_view_pytree | 0.1468ms | 24.8171μs | 40.2948 KOps/s | 40.8847 KOps/s | |
test_view_td | 0.1793ms | 37.5534μs | 26.6287 KOps/s | 27.6693 KOps/s | |
test_unbind_pytree | 0.1104ms | 30.3802μs | 32.9162 KOps/s | 33.0676 KOps/s | |
test_unbind_td | 0.4587ms | 38.0688μs | 26.2682 KOps/s | 26.9021 KOps/s | |
test_split_pytree | 59.0110μs | 33.4222μs | 29.9202 KOps/s | 30.3958 KOps/s | |
test_split_td | 0.1016ms | 36.4783μs | 27.4136 KOps/s | 27.7165 KOps/s | |
test_add_pytree | 72.1010μs | 37.5560μs | 26.6269 KOps/s | 27.0692 KOps/s | |
test_add_td | 84.1720μs | 51.6076μs | 19.3770 KOps/s | 20.5606 KOps/s | |
test_distributed | 0.1831ms | 67.5376μs | 14.8066 KOps/s | 14.0259 KOps/s | |
test_tdmodule | 29.8910μs | 14.2207μs | 70.3198 KOps/s | 64.0265 KOps/s | |
test_tdmodule_dispatch | 44.5510μs | 29.2572μs | 34.1796 KOps/s | 32.3532 KOps/s | |
test_tdseq | 43.8910μs | 15.3442μs | 65.1712 KOps/s | 57.8969 KOps/s | |
test_tdseq_dispatch | 60.4610μs | 31.6409μs | 31.6047 KOps/s | 30.1952 KOps/s | |
test_instantiation_functorch | 1.4223ms | 1.3730ms | 728.3323 Ops/s | 727.9760 Ops/s | |
test_instantiation_td | 92.2051ms | 1.0859ms | 920.8749 Ops/s | 1.0257 KOps/s | |
test_exec_functorch | 0.1767ms | 0.1477ms | 6.7711 KOps/s | 7.0172 KOps/s | |
test_exec_functional_call | 0.1780ms | 0.1342ms | 7.4533 KOps/s | 7.6033 KOps/s | |
test_exec_td | 0.1745ms | 0.1336ms | 7.4825 KOps/s | 7.6604 KOps/s | |
test_exec_td_decorator | 0.6946ms | 0.2084ms | 4.7980 KOps/s | 5.0159 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7521ms | 0.5844ms | 1.7112 KOps/s | 1.6689 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6240ms | 0.5820ms | 1.7181 KOps/s | 1.7064 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5966ms | 0.5156ms | 1.9396 KOps/s | 1.8812 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5726ms | 0.5169ms | 1.9348 KOps/s | 1.8822 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0078ms | 0.6609ms | 1.5131 KOps/s | 1.5308 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7890ms | 0.6585ms | 1.5187 KOps/s | 1.5207 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6678ms | 0.5795ms | 1.7255 KOps/s | 1.7422 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7278ms | 0.5792ms | 1.7264 KOps/s | 1.6452 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.4463ms | 7.9718ms | 125.4424 Ops/s | 130.3081 Ops/s | |
test_vmap_transformer_speed[True-False] | 9.0218ms | 7.8267ms | 127.7673 Ops/s | 130.8116 Ops/s | |
test_vmap_transformer_speed[False-True] | 7.6654ms | 7.6175ms | 131.2767 Ops/s | 131.7592 Ops/s | |
test_vmap_transformer_speed[False-False] | 7.6772ms | 7.6149ms | 131.3220 Ops/s | 132.3523 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.0743ms | 19.0137ms | 52.5936 Ops/s | 53.0149 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.0437ms | 18.9819ms | 52.6819 Ops/s | 52.9195 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.9118ms | 18.8577ms | 53.0288 Ops/s | 53.4867 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.9242ms | 18.8354ms | 53.0914 Ops/s | 53.4910 Ops/s | |
test_to_module_speed[True] | 2.8696ms | 1.5682ms | 637.6643 Ops/s | 672.0408 Ops/s | |
test_to_module_speed[False] | 2.0094ms | 1.5477ms | 646.1331 Ops/s | 679.6448 Ops/s | |
test_tc_init | 75.4920μs | 53.5653μs | 18.6688 KOps/s | 18.3770 KOps/s | |
test_tc_init_nested | 0.1389ms | 0.1052ms | 9.5069 KOps/s | 9.1469 KOps/s | |
test_tc_first_layer_tensor | 19.4800μs | 3.5279μs | 283.4542 KOps/s | 284.5060 KOps/s | |
test_tc_first_layer_nontensor | 17.0900μs | 3.5719μs | 279.9642 KOps/s | 283.0654 KOps/s | |
test_tc_second_layer_tensor | 16.9200μs | 1.2123μs | 824.8468 KOps/s | 905.6879 KOps/s | |
test_tc_second_layer_nontensor | 19.8900μs | 4.2118μs | 237.4301 KOps/s | 248.0658 KOps/s |
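
The Change column is empty in both tables above, so the relative difference between "Ops" and "Ops on Repo HEAD" has to be worked out by hand. The following is a minimal sketch of that calculation, not part of the PR: the helper names `parse_ops` and `percent_change` are hypothetical, and the only assumption about the data is the unit suffixes that appear in the tables (`Ops/s`, `KOps/s`, `MOps/s`). Small differences may well be benchmark noise; the script only does the arithmetic.

```python
import re

# Multipliers for the unit suffixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a cell such as '60.3448 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent.

    Negative values mean the PR run was slower than HEAD.
    """
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return 100.0 * (pr - head) / head


if __name__ == "__main__":
    # Rows copied from the tables above: (name, Ops, Ops on Repo HEAD).
    rows = [
        ("test_plain_set_nested", "60.3448 KOps/s", "60.5313 KOps/s"),
        ("test_membership_stacked_nested_last", "109.0480 KOps/s", "160.6687 KOps/s"),
        ("test_instantiation_td", "920.8749 Ops/s", "1.0257 KOps/s"),
    ]
    for name, ops_pr, ops_head in rows:
        print(f"{name}: {percent_change(ops_pr, ops_head):+.2f}%")
```

Normalising to operations per second first matters for rows such as `test_instantiation_td`, where the two columns use different units.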
_lock_parents_weakrefs = []
self._lock_parents_weakrefs += _lock_parents_weakrefs
_lock_parents_weakrefs.append(weakref.ref(self))
if not not is_compiling:
Extra `not`?
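
To make the reviewer's question concrete: in Python, `not not x` evaluates to `bool(x)`, so the two spellings take opposite branches. The sketch below treats the flag as a plain boolean, which is an assumption (the excerpt does not show how `is_compiling` is defined), and the function names are hypothetical. Which behaviour the PR intends is not stated in this conversation.

```python
def branch_taken_double_not(flag: bool) -> bool:
    # `not not flag` is equivalent to bool(flag): the branch runs when flag is truthy.
    return not not flag


def branch_taken_single_not(flag: bool) -> bool:
    # A single `not` inverts the condition: the branch runs when flag is falsy.
    return not flag


if __name__ == "__main__":
    for flag in (True, False):
        print(
            f"flag={flag}: double-not branch taken={branch_taken_double_not(flag)}, "
            f"single-not branch taken={branch_taken_single_not(flag)}"
        )
```

One further Python fact is relevant here: if the name refers to a function object rather than the result of calling it, `not not name` is always `True`, because function objects are truthy.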
def _check_inbuild():
    if not strtobool(os.environ.get("TORCHDYNAMO_INLINE_INBUILT_NN_MODULES", "0")):
        raise RuntimeError(
            "to_module requires TORCHDYNAMO_INLINE_INBUILT_NN_MODULES to be set."
Could you share any pointers on why this is needed?
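
For readers following the thread, here is a self-contained sketch of what the guard in the hunk above does at the Python level. It is an illustration under assumptions, not the PR's code: `strtobool` below is a local stand-in for whichever helper the diff imports, and `check_inline_inbuilt_nn_modules` is a hypothetical name. The sketch only mirrors the environment lookup and the error message shown in the diff; it does not answer why the flag is required, nor when PyTorch itself reads the variable.

```python
import os

ENV_VAR = "TORCHDYNAMO_INLINE_INBUILT_NN_MODULES"


def strtobool(value: str) -> bool:
    """Local stand-in (assumption) for the strtobool helper used in the diff."""
    normalised = value.strip().lower()
    if normalised in {"y", "yes", "t", "true", "on", "1"}:
        return True
    if normalised in {"n", "no", "f", "false", "off", "0"}:
        return False
    raise ValueError(f"invalid truth value {value!r}")


def check_inline_inbuilt_nn_modules() -> None:
    """Raise if the environment flag is unset or falsy, as in the reviewed hunk."""
    # An unset variable falls back to "0", which counts as disabled.
    if not strtobool(os.environ.get(ENV_VAR, "0")):
        raise RuntimeError(f"to_module requires {ENV_VAR} to be set.")


if __name__ == "__main__":
    # With the flag enabled, the check passes silently.
    os.environ[ENV_VAR] = "1"
    check_inline_inbuilt_nn_modules()
    print("check passed with", ENV_VAR, "=", os.environ[ENV_VAR])
```

Because the lookup defaults to "0", the check fails both when the variable is missing and when it is set to a falsy string.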
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):