[Refactor] Remove _run_checks from __init__ #843

Merged
merged 1 commit into from
Jun 30, 2024

@vmoens vmoens commented Jun 30, 2024

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 30, 2024
@vmoens vmoens added the Refactor Refactoring code - not a new feature label Jun 30, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}24$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 35.5760μs 16.2917μs 61.3808 KOps/s 63.9503 KOps/s $\color{#d91a1a}-4.02\%$
test_plain_set_stack_nested 59.0810μs 16.7796μs 59.5963 KOps/s 63.5154 KOps/s $\textbf{\color{#d91a1a}-6.17\%}$
test_plain_set_nested_inplace 93.6460μs 18.7067μs 53.4567 KOps/s 56.1229 KOps/s $\color{#d91a1a}-4.75\%$
test_plain_set_stack_nested_inplace 0.3306ms 19.2878μs 51.8461 KOps/s 56.4593 KOps/s $\textbf{\color{#d91a1a}-8.17\%}$
test_items 0.1649ms 2.7109μs 368.8748 KOps/s 373.4862 KOps/s $\color{#d91a1a}-1.23\%$
test_items_nested 0.4403ms 0.2750ms 3.6363 KOps/s 3.6405 KOps/s $\color{#d91a1a}-0.11\%$
test_items_nested_locked 1.0292ms 0.2778ms 3.5996 KOps/s 3.6124 KOps/s $\color{#d91a1a}-0.35\%$
test_items_nested_leaf 0.1509ms 80.0285μs 12.4956 KOps/s 12.6756 KOps/s $\color{#d91a1a}-1.42\%$
test_items_stack_nested 1.4119ms 0.2824ms 3.5414 KOps/s 3.5765 KOps/s $\color{#d91a1a}-0.98\%$
test_items_stack_nested_leaf 0.1638ms 81.8633μs 12.2155 KOps/s 12.6613 KOps/s $\color{#d91a1a}-3.52\%$
test_items_stack_nested_locked 0.4597ms 0.2782ms 3.5951 KOps/s 3.5815 KOps/s $\color{#35bf28}+0.38\%$
test_keys 28.7240μs 3.8321μs 260.9502 KOps/s 258.7869 KOps/s $\color{#35bf28}+0.84\%$
test_keys_nested 0.2771ms 0.1400ms 7.1404 KOps/s 7.1877 KOps/s $\color{#d91a1a}-0.66\%$
test_keys_nested_locked 0.7709ms 0.1443ms 6.9294 KOps/s 6.8731 KOps/s $\color{#35bf28}+0.82\%$
test_keys_nested_leaf 0.2440ms 0.1200ms 8.3326 KOps/s 8.4673 KOps/s $\color{#d91a1a}-1.59\%$
test_keys_stack_nested 0.2667ms 0.1389ms 7.1988 KOps/s 7.2350 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_stack_nested_leaf 0.2346ms 0.1168ms 8.5624 KOps/s 8.5591 KOps/s $\color{#35bf28}+0.04\%$
test_keys_stack_nested_locked 0.2759ms 0.1430ms 6.9941 KOps/s 7.0006 KOps/s $\color{#d91a1a}-0.09\%$
test_values 6.0592μs 1.2190μs 820.3720 KOps/s 878.0777 KOps/s $\textbf{\color{#d91a1a}-6.57\%}$
test_values_nested 91.1310μs 50.3862μs 19.8467 KOps/s 19.6291 KOps/s $\color{#35bf28}+1.11\%$
test_values_nested_locked 0.1043ms 50.0045μs 19.9982 KOps/s 19.6537 KOps/s $\color{#35bf28}+1.75\%$
test_values_nested_leaf 1.7468ms 45.9350μs 21.7699 KOps/s 21.9416 KOps/s $\color{#d91a1a}-0.78\%$
test_values_stack_nested 0.1025ms 50.9564μs 19.6246 KOps/s 19.2771 KOps/s $\color{#35bf28}+1.80\%$
test_values_stack_nested_leaf 95.3390μs 45.5471μs 21.9553 KOps/s 21.8465 KOps/s $\color{#35bf28}+0.50\%$
test_values_stack_nested_locked 97.6340μs 50.7818μs 19.6921 KOps/s 19.2052 KOps/s $\color{#35bf28}+2.54\%$
test_membership 33.4430μs 1.3813μs 723.9653 KOps/s 740.5928 KOps/s $\color{#d91a1a}-2.25\%$
test_membership_nested 29.5050μs 3.5286μs 283.3983 KOps/s 291.7891 KOps/s $\color{#d91a1a}-2.88\%$
test_membership_nested_leaf 28.9140μs 3.5447μs 282.1112 KOps/s 286.1673 KOps/s $\color{#d91a1a}-1.42\%$
test_membership_stacked_nested 27.7320μs 3.5246μs 283.7168 KOps/s 293.2549 KOps/s $\color{#d91a1a}-3.25\%$
test_membership_stacked_nested_leaf 22.6130μs 3.5330μs 283.0435 KOps/s 286.5922 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_nested_last 29.3550μs 4.3000μs 232.5555 KOps/s 235.0121 KOps/s $\color{#d91a1a}-1.05\%$
test_membership_nested_leaf_last 29.9560μs 4.3222μs 231.3618 KOps/s 237.0276 KOps/s $\color{#d91a1a}-2.39\%$
test_membership_stacked_nested_last 40.7170μs 4.3334μs 230.7647 KOps/s 237.8379 KOps/s $\color{#d91a1a}-2.97\%$
test_membership_stacked_nested_leaf_last 26.9710μs 4.2883μs 233.1944 KOps/s 236.7567 KOps/s $\color{#d91a1a}-1.50\%$
test_nested_getleaf 47.1080μs 10.5740μs 94.5715 KOps/s 92.3728 KOps/s $\color{#35bf28}+2.38\%$
test_nested_get 49.1520μs 10.0336μs 99.6655 KOps/s 97.6651 KOps/s $\color{#35bf28}+2.05\%$
test_stacked_getleaf 39.3740μs 10.6065μs 94.2815 KOps/s 93.3179 KOps/s $\color{#35bf28}+1.03\%$
test_stacked_get 34.5750μs 9.9907μs 100.0927 KOps/s 99.8927 KOps/s $\color{#35bf28}+0.20\%$
test_nested_getitemleaf 57.1870μs 11.1590μs 89.6136 KOps/s 88.2549 KOps/s $\color{#35bf28}+1.54\%$
test_nested_getitem 52.6590μs 10.1768μs 98.2625 KOps/s 96.1874 KOps/s $\color{#35bf28}+2.16\%$
test_stacked_getitemleaf 65.0920μs 11.1400μs 89.7663 KOps/s 89.0873 KOps/s $\color{#35bf28}+0.76\%$
test_stacked_getitem 36.9590μs 10.2409μs 97.6478 KOps/s 97.5005 KOps/s $\color{#35bf28}+0.15\%$
test_lock_nested 0.8744ms 0.3360ms 2.9761 KOps/s 2.9233 KOps/s $\color{#35bf28}+1.80\%$
test_lock_stack_nested 0.5632ms 0.3038ms 3.2921 KOps/s 3.2679 KOps/s $\color{#35bf28}+0.74\%$
test_unlock_nested 0.7513ms 0.3366ms 2.9708 KOps/s 2.8479 KOps/s $\color{#35bf28}+4.32\%$
test_unlock_stack_nested 0.4705ms 0.3092ms 3.2346 KOps/s 3.1761 KOps/s $\color{#35bf28}+1.84\%$
test_flatten_speed 0.5965ms 99.5208μs 10.0482 KOps/s 10.0088 KOps/s $\color{#35bf28}+0.39\%$
test_unflatten_speed 0.8993ms 0.4178ms 2.3933 KOps/s 2.3634 KOps/s $\color{#35bf28}+1.27\%$
test_common_ops 3.0222ms 0.7256ms 1.3781 KOps/s 1.4845 KOps/s $\textbf{\color{#d91a1a}-7.17\%}$
test_creation 32.9520μs 1.9522μs 512.2505 KOps/s 509.6016 KOps/s $\color{#35bf28}+0.52\%$
test_creation_empty 31.4790μs 10.0522μs 99.4811 KOps/s 122.3387 KOps/s $\textbf{\color{#d91a1a}-18.68\%}$
test_creation_nested_1 39.0840μs 12.9283μs 77.3499 KOps/s 92.3064 KOps/s $\textbf{\color{#d91a1a}-16.20\%}$
test_creation_nested_2 51.9570μs 16.3381μs 61.2065 KOps/s 70.5053 KOps/s $\textbf{\color{#d91a1a}-13.19\%}$
test_clone 0.1042ms 13.2032μs 75.7391 KOps/s 74.2141 KOps/s $\color{#35bf28}+2.05\%$
test_getitem[int] 34.8450μs 11.1903μs 89.3630 KOps/s 86.8984 KOps/s $\color{#35bf28}+2.84\%$
test_getitem[slice_int] 70.0010μs 22.5618μs 44.3227 KOps/s 44.0908 KOps/s $\color{#35bf28}+0.53\%$
test_getitem[range] 76.3730μs 57.0086μs 17.5412 KOps/s 16.8471 KOps/s $\color{#35bf28}+4.12\%$
test_getitem[tuple] 55.3340μs 18.5531μs 53.8994 KOps/s 52.8615 KOps/s $\color{#35bf28}+1.96\%$
test_getitem[list] 0.1028ms 40.2801μs 24.8261 KOps/s 25.2007 KOps/s $\color{#d91a1a}-1.49\%$
test_setitem_dim[int] 55.1930μs 33.8692μs 29.5253 KOps/s 31.7981 KOps/s $\textbf{\color{#d91a1a}-7.15\%}$
test_setitem_dim[slice_int] 0.1043ms 61.1811μs 16.3449 KOps/s 17.2482 KOps/s $\textbf{\color{#d91a1a}-5.24\%}$
test_setitem_dim[range] 0.1281ms 84.8939μs 11.7794 KOps/s 12.5689 KOps/s $\textbf{\color{#d91a1a}-6.28\%}$
test_setitem_dim[tuple] 83.2160μs 50.3551μs 19.8590 KOps/s 21.5231 KOps/s $\textbf{\color{#d91a1a}-7.73\%}$
test_setitem 50.7060μs 19.9359μs 50.1607 KOps/s 52.1624 KOps/s $\color{#d91a1a}-3.84\%$
test_set 70.9210μs 19.1040μs 52.3451 KOps/s 55.1652 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_set_shared 1.4511ms 0.1443ms 6.9305 KOps/s 6.8912 KOps/s $\color{#35bf28}+0.57\%$
test_update 0.1289ms 22.0608μs 45.3293 KOps/s 51.0009 KOps/s $\textbf{\color{#d91a1a}-11.12\%}$
test_update_nested 78.5780μs 30.8919μs 32.3709 KOps/s 35.9814 KOps/s $\textbf{\color{#d91a1a}-10.03\%}$
test_update__nested 68.7890μs 24.7133μs 40.4640 KOps/s 40.1429 KOps/s $\color{#35bf28}+0.80\%$
test_set_nested 0.1008ms 21.4961μs 46.5201 KOps/s 50.3723 KOps/s $\textbf{\color{#d91a1a}-7.65\%}$
test_set_nested_new 67.1760μs 25.1895μs 39.6990 KOps/s 41.2566 KOps/s $\color{#d91a1a}-3.78\%$
test_select 93.1950μs 40.6558μs 24.5967 KOps/s 24.7427 KOps/s $\color{#d91a1a}-0.59\%$
test_select_nested 0.1167ms 57.5285μs 17.3827 KOps/s 16.3817 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_exclude_nested 0.2210ms 0.1178ms 8.4872 KOps/s 8.2005 KOps/s $\color{#35bf28}+3.50\%$
test_empty[True] 0.5513ms 0.3951ms 2.5313 KOps/s 2.4592 KOps/s $\color{#35bf28}+2.93\%$
test_empty[False] 7.4360μs 1.0849μs 921.7156 KOps/s 858.2779 KOps/s $\textbf{\color{#35bf28}+7.39\%}$
test_unbind_speed 1.6962ms 0.2431ms 4.1143 KOps/s 3.8515 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_unbind_speed_stack0 0.4567ms 0.2484ms 4.0262 KOps/s 4.0142 KOps/s $\color{#35bf28}+0.30\%$
test_unbind_speed_stack1 64.8664ms 0.7116ms 1.4053 KOps/s 1.3939 KOps/s $\color{#35bf28}+0.82\%$
test_split 66.3857ms 1.5878ms 629.7952 Ops/s 610.1197 Ops/s $\color{#35bf28}+3.22\%$
test_chunk 70.5214ms 1.6292ms 613.7904 Ops/s 596.5007 Ops/s $\color{#35bf28}+2.90\%$
test_creation[device0] 0.1812ms 85.4147μs 11.7076 KOps/s 11.4864 KOps/s $\color{#35bf28}+1.93\%$
test_creation_from_tensor 2.9725ms 88.1753μs 11.3410 KOps/s 11.4962 KOps/s $\color{#d91a1a}-1.35\%$
test_add_one[memmap_tensor0] 68.3180μs 5.4127μs 184.7519 KOps/s 180.4886 KOps/s $\color{#35bf28}+2.36\%$
test_contiguous[memmap_tensor0] 14.1470μs 0.6406μs 1.5610 MOps/s 1.5504 MOps/s $\color{#35bf28}+0.68\%$
test_stack[memmap_tensor0] 24.6460μs 3.5230μs 283.8459 KOps/s 260.5583 KOps/s $\textbf{\color{#35bf28}+8.94\%}$
test_memmaptd_index 0.5962ms 0.2625ms 3.8090 KOps/s 3.8360 KOps/s $\color{#d91a1a}-0.70\%$
test_memmaptd_index_astensor 0.5682ms 0.3296ms 3.0342 KOps/s 2.9661 KOps/s $\color{#35bf28}+2.30\%$
test_memmaptd_index_op 2.5382ms 0.6081ms 1.6444 KOps/s 1.7261 KOps/s $\color{#d91a1a}-4.74\%$
test_serialize_model 0.1670s 0.1041s 9.6071 Ops/s 9.0795 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_serialize_model_pickle 0.4498s 0.3763s 2.6573 Ops/s 2.5873 Ops/s $\color{#35bf28}+2.71\%$
test_serialize_weights 0.1638s 0.1044s 9.5764 Ops/s 9.4960 Ops/s $\color{#35bf28}+0.85\%$
test_serialize_weights_returnearly 0.1342s 0.1191s 8.3943 Ops/s 8.4030 Ops/s $\color{#d91a1a}-0.10\%$
test_serialize_weights_pickle 0.8568s 0.5085s 1.9665 Ops/s 2.4241 Ops/s $\textbf{\color{#d91a1a}-18.88\%}$
test_serialize_weights_filesystem 0.1564s 0.1015s 9.8531 Ops/s 9.7564 Ops/s $\color{#35bf28}+0.99\%$
test_serialize_model_filesystem 98.2677ms 93.2379ms 10.7253 Ops/s 10.1973 Ops/s $\textbf{\color{#35bf28}+5.18\%}$
test_reshape_pytree 57.5080μs 25.6521μs 38.9831 KOps/s 38.9806 KOps/s $+0.01\%$
test_reshape_td 96.7820μs 34.0252μs 29.3900 KOps/s 29.3485 KOps/s $\color{#35bf28}+0.14\%$
test_view_pytree 61.8260μs 25.6221μs 39.0288 KOps/s 39.1964 KOps/s $\color{#d91a1a}-0.43\%$
test_view_td 82.3240μs 38.5033μs 25.9718 KOps/s 26.2880 KOps/s $\color{#d91a1a}-1.20\%$
test_unbind_pytree 70.0120μs 29.5474μs 33.8439 KOps/s 33.5525 KOps/s $\color{#35bf28}+0.87\%$
test_unbind_td 0.3834ms 36.8136μs 27.1639 KOps/s 26.1077 KOps/s $\color{#35bf28}+4.05\%$
test_split_pytree 65.0820μs 29.5559μs 33.8342 KOps/s 34.0460 KOps/s $\color{#d91a1a}-0.62\%$
test_split_td 0.1216ms 40.3557μs 24.7797 KOps/s 24.3652 KOps/s $\color{#35bf28}+1.70\%$
test_add_pytree 83.2960μs 35.2641μs 28.3574 KOps/s 28.4349 KOps/s $\color{#d91a1a}-0.27\%$
test_add_td 0.1171ms 54.0301μs 18.5082 KOps/s 20.4571 KOps/s $\textbf{\color{#d91a1a}-9.53\%}$
test_distributed 0.2382ms 0.1022ms 9.7812 KOps/s 9.7361 KOps/s $\color{#35bf28}+0.46\%$
test_tdmodule 41.0870μs 18.2013μs 54.9411 KOps/s 59.1108 KOps/s $\textbf{\color{#d91a1a}-7.05\%}$
test_tdmodule_dispatch 60.7040μs 35.3090μs 28.3214 KOps/s 29.4347 KOps/s $\color{#d91a1a}-3.78\%$
test_tdseq 45.2850μs 20.5470μs 48.6689 KOps/s 51.2473 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_tdseq_dispatch 69.4000μs 40.2404μs 24.8507 KOps/s 26.4869 KOps/s $\textbf{\color{#d91a1a}-6.18\%}$
test_instantiation_functorch 1.5390ms 1.3339ms 749.6884 Ops/s 743.0078 Ops/s $\color{#35bf28}+0.90\%$
test_instantiation_td 68.1281ms 1.1283ms 886.3158 Ops/s 908.4473 Ops/s $\color{#d91a1a}-2.44\%$
test_exec_functorch 0.3010ms 0.1597ms 6.2623 KOps/s 6.2356 KOps/s $\color{#35bf28}+0.43\%$
test_exec_functional_call 0.2787ms 0.1475ms 6.7812 KOps/s 6.8602 KOps/s $\color{#d91a1a}-1.15\%$
test_exec_td 0.2860ms 0.1448ms 6.9078 KOps/s 7.0944 KOps/s $\color{#d91a1a}-2.63\%$
test_exec_td_decorator 0.7266ms 0.2207ms 4.5300 KOps/s 4.5784 KOps/s $\color{#d91a1a}-1.06\%$
test_vmap_mlp_speed[True-True] 0.7320ms 0.4908ms 2.0375 KOps/s 2.0425 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_mlp_speed[True-False] 0.6730ms 0.4863ms 2.0564 KOps/s 2.0740 KOps/s $\color{#d91a1a}-0.85\%$
test_vmap_mlp_speed[False-True] 0.6884ms 0.3973ms 2.5170 KOps/s 2.5135 KOps/s $\color{#35bf28}+0.14\%$
test_vmap_mlp_speed[False-False] 0.7819ms 0.4005ms 2.4968 KOps/s 2.4881 KOps/s $\color{#35bf28}+0.35\%$
test_vmap_mlp_speed_decorator[True-True] 1.2159ms 0.5629ms 1.7764 KOps/s 1.7876 KOps/s $\color{#d91a1a}-0.62\%$
test_vmap_mlp_speed_decorator[True-False] 2.4098ms 0.5728ms 1.7459 KOps/s 1.7998 KOps/s $\color{#d91a1a}-2.99\%$
test_vmap_mlp_speed_decorator[False-True] 0.7151ms 0.4616ms 2.1666 KOps/s 2.1712 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed_decorator[False-False] 0.6919ms 0.4604ms 2.1718 KOps/s 2.1649 KOps/s $\color{#35bf28}+0.32\%$
test_to_module_speed[True] 2.5182ms 1.6523ms 605.2281 Ops/s 588.8556 Ops/s $\color{#35bf28}+2.78\%$
test_to_module_speed[False] 3.1248ms 1.6307ms 613.2331 Ops/s 599.3089 Ops/s $\color{#35bf28}+2.32\%$
test_tc_init 61.5850μs 28.2471μs 35.4019 KOps/s 42.8330 KOps/s $\textbf{\color{#d91a1a}-17.35\%}$
test_tc_init_nested 0.1370ms 56.0339μs 17.8463 KOps/s 19.7570 KOps/s $\textbf{\color{#d91a1a}-9.67\%}$
test_tc_first_layer_tensor 2.0083μs 0.6796μs 1.4714 MOps/s 1.3826 MOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_tc_first_layer_nontensor 2.3050μs 0.6790μs 1.4727 MOps/s 1.4485 MOps/s $\color{#35bf28}+1.67\%$
test_tc_second_layer_tensor 24.0150μs 1.8164μs 550.5351 KOps/s 527.5405 KOps/s $\color{#35bf28}+4.36\%$
test_tc_second_layer_nontensor 14.8277μs 1.5248μs 655.8179 KOps/s 641.2365 KOps/s $\color{#35bf28}+2.27\%$
test_unbind 81.8774ms 7.3272ms 136.4781 Ops/s 142.3885 Ops/s $\color{#d91a1a}-4.15\%$
test_full_like 17.3456ms 10.1812ms 98.2200 Ops/s 135.6269 Ops/s $\textbf{\color{#d91a1a}-27.58\%}$
test_zeros_like 11.3164ms 5.8201ms 171.8188 Ops/s 185.6708 Ops/s $\textbf{\color{#d91a1a}-7.46\%}$
test_ones_like 13.7723ms 6.2463ms 160.0959 Ops/s 161.2244 Ops/s $\color{#d91a1a}-0.70\%$
test_clone 12.0721ms 7.7127ms 129.6561 Ops/s 129.7747 Ops/s $\color{#d91a1a}-0.09\%$
test_squeeze 65.2820μs 14.1348μs 70.7473 KOps/s 71.1446 KOps/s $\color{#d91a1a}-0.56\%$
test_unsqueeze 0.2221ms 60.5870μs 16.5052 KOps/s 16.6542 KOps/s $\color{#d91a1a}-0.89\%$
test_split 0.2076ms 0.1125ms 8.8916 KOps/s 8.8995 KOps/s $\color{#d91a1a}-0.09\%$
test_permute 0.2177ms 0.1291ms 7.7465 KOps/s 7.7891 KOps/s $\color{#d91a1a}-0.55\%$
test_stack 27.0478ms 21.9701ms 45.5163 Ops/s 44.9185 Ops/s $\color{#35bf28}+1.33\%$
test_cat 29.6998ms 21.9525ms 45.5529 Ops/s 44.5830 Ops/s $\color{#35bf28}+2.18\%$
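
For readers checking the numbers, the `Change` column appears to be the relative difference between `Ops` (this PR) and `Ops on Repo HEAD`. The sketch below reproduces a few rows under that assumption. The helper names `change_pct` and `classify` are hypothetical, and the 5% cutoff is inferred from which rows are bolded in the tables (the bolded rows match the "Improved" and "Worsened" counts in the summary line); it is not taken from the benchmark tooling itself.

```python
# Minimal sketch, not the benchmark bot's actual code.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD.
# Assumption: a row counts as improved/worsened only beyond +/-5%.

def change_pct(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(pct: float, threshold: float = 5.0) -> str:
    """Label a change using the (inferred) 5% significance threshold."""
    if pct > threshold:
        return "improved"
    if pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (61.3808, 63.9503),  # KOps/s
    "test_select_nested": (17.3827, 16.3817),     # KOps/s
    "test_full_like": (98.2200, 135.6269),        # Ops/s
}

for name, (ops, head) in rows.items():
    pct = change_pct(ops, head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the table values are rounded to four decimals, recomputed percentages can differ from the reported ones in the last digit for other rows.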

@vmoens vmoens merged commit ceb15f1 into main Jun 30, 2024
37 of 42 checks passed
@vmoens vmoens deleted the remove_run_checks branch June 30, 2024 20:09

github-actions bot commented Jul 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}25$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1054ms 12.3205μs 81.1654 KOps/s 84.4475 KOps/s $\color{#d91a1a}-3.89\%$
test_plain_set_stack_nested 32.5310μs 12.4415μs 80.3764 KOps/s 83.4680 KOps/s $\color{#d91a1a}-3.70\%$
test_plain_set_nested_inplace 0.1192ms 13.8102μs 72.4105 KOps/s 75.4174 KOps/s $\color{#d91a1a}-3.99\%$
test_plain_set_stack_nested_inplace 31.6410μs 13.8545μs 72.1788 KOps/s 75.8325 KOps/s $\color{#d91a1a}-4.82\%$
test_items 24.9010μs 4.5824μs 218.2274 KOps/s 210.2807 KOps/s $\color{#35bf28}+3.78\%$
test_items_nested 0.3754ms 0.3423ms 2.9210 KOps/s 2.9504 KOps/s $\color{#d91a1a}-1.00\%$
test_items_nested_locked 0.3927ms 0.3521ms 2.8398 KOps/s 2.9250 KOps/s $\color{#d91a1a}-2.91\%$
test_items_nested_leaf 95.8120μs 82.1363μs 12.1749 KOps/s 12.1853 KOps/s $\color{#d91a1a}-0.09\%$
test_items_stack_nested 0.4363ms 0.3470ms 2.8822 KOps/s 2.9404 KOps/s $\color{#d91a1a}-1.98\%$
test_items_stack_nested_leaf 98.9310μs 83.0537μs 12.0404 KOps/s 12.1986 KOps/s $\color{#d91a1a}-1.30\%$
test_items_stack_nested_locked 0.4445ms 0.3492ms 2.8639 KOps/s 2.9153 KOps/s $\color{#d91a1a}-1.76\%$
test_keys 20.3510μs 4.3356μs 230.6472 KOps/s 230.8098 KOps/s $\color{#d91a1a}-0.07\%$
test_keys_nested 84.7510μs 69.1943μs 14.4521 KOps/s 14.8144 KOps/s $\color{#d91a1a}-2.45\%$
test_keys_nested_locked 0.8360ms 74.7819μs 13.3722 KOps/s 13.7540 KOps/s $\color{#d91a1a}-2.78\%$
test_keys_nested_leaf 74.1310μs 60.1818μs 16.6163 KOps/s 16.8741 KOps/s $\color{#d91a1a}-1.53\%$
test_keys_stack_nested 0.1879ms 68.9785μs 14.4973 KOps/s 15.2547 KOps/s $\color{#d91a1a}-4.97\%$
test_keys_stack_nested_leaf 80.2720μs 57.6020μs 17.3605 KOps/s 17.5925 KOps/s $\color{#d91a1a}-1.32\%$
test_keys_stack_nested_locked 95.2120μs 74.1113μs 13.4932 KOps/s 13.8143 KOps/s $\color{#d91a1a}-2.32\%$
test_values 6.3600μs 1.7990μs 555.8573 KOps/s 556.8301 KOps/s $\color{#d91a1a}-0.17\%$
test_values_nested 0.1426ms 35.7366μs 27.9825 KOps/s 28.6165 KOps/s $\color{#d91a1a}-2.22\%$
test_values_nested_locked 58.8110μs 37.0613μs 26.9823 KOps/s 27.2928 KOps/s $\color{#d91a1a}-1.14\%$
test_values_nested_leaf 48.3410μs 31.5166μs 31.7293 KOps/s 32.2533 KOps/s $\color{#d91a1a}-1.62\%$
test_values_stack_nested 0.2065ms 35.5779μs 28.1073 KOps/s 28.1334 KOps/s $\color{#d91a1a}-0.09\%$
test_values_stack_nested_leaf 0.2263ms 32.1321μs 31.1215 KOps/s 32.0012 KOps/s $\color{#d91a1a}-2.75\%$
test_values_stack_nested_locked 0.1180ms 37.3646μs 26.7633 KOps/s 26.7385 KOps/s $\color{#35bf28}+0.09\%$
test_membership 3.0343μs 0.7381μs 1.3548 MOps/s 1.4097 MOps/s $\color{#d91a1a}-3.90\%$
test_membership_nested 17.6200μs 2.5537μs 391.5899 KOps/s 385.3331 KOps/s $\color{#35bf28}+1.62\%$
test_membership_nested_leaf 19.2320μs 2.5980μs 384.9045 KOps/s 385.5847 KOps/s $\color{#d91a1a}-0.18\%$
test_membership_stacked_nested 25.3210μs 2.5604μs 390.5703 KOps/s 382.1911 KOps/s $\color{#35bf28}+2.19\%$
test_membership_stacked_nested_leaf 19.3400μs 2.5558μs 391.2672 KOps/s 389.4248 KOps/s $\color{#35bf28}+0.47\%$
test_membership_nested_last 95.4110μs 3.1159μs 320.9299 KOps/s 321.4702 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_nested_leaf_last 20.4110μs 3.1005μs 322.5235 KOps/s 318.1129 KOps/s $\color{#35bf28}+1.39\%$
test_membership_stacked_nested_last 30.3520μs 3.1118μs 321.3576 KOps/s 102.2696 KOps/s $\textbf{\color{#35bf28}+214.23\%}$
test_membership_stacked_nested_leaf_last 0.1427ms 3.1397μs 318.5023 KOps/s 101.4019 KOps/s $\textbf{\color{#35bf28}+214.10\%}$
test_nested_getleaf 25.0500μs 8.4351μs 118.5518 KOps/s 120.1229 KOps/s $\color{#d91a1a}-1.31\%$
test_nested_get 0.1634ms 7.8723μs 127.0272 KOps/s 128.1837 KOps/s $\color{#d91a1a}-0.90\%$
test_stacked_getleaf 0.1876ms 8.4484μs 118.3657 KOps/s 119.3448 KOps/s $\color{#d91a1a}-0.82\%$
test_stacked_get 0.1870ms 7.9020μs 126.5498 KOps/s 127.5033 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_getitemleaf 0.1943ms 8.6712μs 115.3248 KOps/s 116.7531 KOps/s $\color{#d91a1a}-1.22\%$
test_nested_getitem 0.1841ms 8.0739μs 123.8556 KOps/s 124.7826 KOps/s $\color{#d91a1a}-0.74\%$
test_stacked_getitemleaf 0.1919ms 8.6370μs 115.7814 KOps/s 117.5585 KOps/s $\color{#d91a1a}-1.51\%$
test_stacked_getitem 22.6810μs 8.0391μs 124.3925 KOps/s 125.0298 KOps/s $\color{#d91a1a}-0.51\%$
test_lock_nested 61.6632ms 0.4051ms 2.4684 KOps/s 2.4201 KOps/s $\color{#35bf28}+2.00\%$
test_lock_stack_nested 0.4181ms 0.2915ms 3.4305 KOps/s 3.3062 KOps/s $\color{#35bf28}+3.76\%$
test_unlock_nested 63.3505ms 0.3999ms 2.5006 KOps/s 2.3870 KOps/s $\color{#35bf28}+4.76\%$
test_unlock_stack_nested 0.4164ms 0.2994ms 3.3396 KOps/s 3.2149 KOps/s $\color{#35bf28}+3.88\%$
test_flatten_speed 0.5132ms 0.1029ms 9.7139 KOps/s 9.8853 KOps/s $\color{#d91a1a}-1.73\%$
test_unflatten_speed 0.3180ms 0.2967ms 3.3699 KOps/s 3.4329 KOps/s $\color{#d91a1a}-1.83\%$
test_common_ops 1.0274ms 0.5525ms 1.8099 KOps/s 1.7538 KOps/s $\color{#35bf28}+3.20\%$
test_creation 20.0800μs 1.6174μs 618.2773 KOps/s 626.1634 KOps/s $\color{#d91a1a}-1.26\%$
test_creation_empty 22.8300μs 7.1454μs 139.9502 KOps/s 147.7475 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_creation_nested_1 27.5800μs 8.8680μs 112.7645 KOps/s 116.1541 KOps/s $\color{#d91a1a}-2.92\%$
test_creation_nested_2 32.4210μs 10.9540μs 91.2910 KOps/s 93.1308 KOps/s $\color{#d91a1a}-1.98\%$
test_clone 97.9220μs 11.8826μs 84.1570 KOps/s 76.7796 KOps/s $\textbf{\color{#35bf28}+9.61\%}$
test_getitem[int] 25.5810μs 10.8455μs 92.2043 KOps/s 87.5244 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_getitem[slice_int] 0.1523ms 21.1017μs 47.3896 KOps/s 45.5933 KOps/s $\color{#35bf28}+3.94\%$
test_getitem[range] 67.7110μs 49.4730μs 20.2130 KOps/s 20.0856 KOps/s $\color{#35bf28}+0.63\%$
test_getitem[tuple] 39.5910μs 18.3851μs 54.3920 KOps/s 51.3167 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_getitem[list] 0.1691ms 33.9860μs 29.4239 KOps/s 28.1049 KOps/s $\color{#35bf28}+4.69\%$
test_setitem_dim[int] 53.4610μs 29.6050μs 33.7781 KOps/s 36.7714 KOps/s $\textbf{\color{#d91a1a}-8.14\%}$
test_setitem_dim[slice_int] 0.1723ms 50.9635μs 19.6219 KOps/s 20.7680 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$
test_setitem_dim[range] 86.9410μs 67.0836μs 14.9068 KOps/s 15.2242 KOps/s $\color{#d91a1a}-2.08\%$
test_setitem_dim[tuple] 0.1465ms 42.7212μs 23.4076 KOps/s 24.2776 KOps/s $\color{#d91a1a}-3.58\%$
test_setitem 41.3710μs 15.5260μs 64.4080 KOps/s 59.4785 KOps/s $\textbf{\color{#35bf28}+8.29\%}$
test_set 0.1652ms 14.9818μs 66.7477 KOps/s 61.6522 KOps/s $\textbf{\color{#35bf28}+8.26\%}$
test_set_shared 1.8044ms 99.0431μs 10.0966 KOps/s 9.8727 KOps/s $\color{#35bf28}+2.27\%$
test_update 0.1388ms 17.3731μs 57.5604 KOps/s 55.0756 KOps/s $\color{#35bf28}+4.51\%$
test_update_nested 70.2610μs 23.0550μs 43.3744 KOps/s 42.5917 KOps/s $\color{#35bf28}+1.84\%$
test_update__nested 0.1155ms 23.1032μs 43.2841 KOps/s 40.5003 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_set_nested 58.9810μs 16.1902μs 61.7656 KOps/s 57.8917 KOps/s $\textbf{\color{#35bf28}+6.69\%}$
test_set_nested_new 58.0810μs 18.6095μs 53.7360 KOps/s 50.0982 KOps/s $\textbf{\color{#35bf28}+7.26\%}$
test_select 68.6410μs 31.0666μs 32.1889 KOps/s 30.1536 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_select_nested 0.6731ms 51.0178μs 19.6010 KOps/s 18.5592 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_exclude_nested 0.1274ms 0.1060ms 9.4304 KOps/s 9.0864 KOps/s $\color{#35bf28}+3.79\%$
test_empty[True] 0.3626ms 0.3439ms 2.9077 KOps/s 2.8970 KOps/s $\color{#35bf28}+0.37\%$
test_empty[False] 2.3681μs 0.7949μs 1.2581 MOps/s 1.0425 MOps/s $\textbf{\color{#35bf28}+20.68\%}$
test_to 89.8320μs 60.0960μs 16.6400 KOps/s 12.6320 KOps/s $\textbf{\color{#35bf28}+31.73\%}$
test_to_nonblocking 0.1882ms 36.8023μs 27.1722 KOps/s 15.7843 KOps/s $\textbf{\color{#35bf28}+72.15\%}$
test_unbind_speed 0.3848ms 0.2522ms 3.9644 KOps/s 3.6694 KOps/s $\textbf{\color{#35bf28}+8.04\%}$
test_unbind_speed_stack0 0.4365ms 0.2536ms 3.9426 KOps/s 3.7259 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_unbind_speed_stack1 80.3425ms 0.7816ms 1.2795 KOps/s 1.2257 KOps/s $\color{#35bf28}+4.39\%$
test_split 80.9181ms 1.7195ms 581.5723 Ops/s 553.3714 Ops/s $\textbf{\color{#35bf28}+5.10\%}$
test_chunk 1.7900ms 1.5850ms 630.9189 Ops/s 601.1827 Ops/s $\color{#35bf28}+4.95\%$
test_creation[device0] 0.2061ms 57.4640μs 17.4022 KOps/s 17.1146 KOps/s $\color{#35bf28}+1.68\%$
test_creation_from_tensor 0.2269ms 55.2570μs 18.0973 KOps/s 18.3337 KOps/s $\color{#d91a1a}-1.29\%$
test_add_one[memmap_tensor0] 95.9210μs 7.3471μs 136.1077 KOps/s 126.4104 KOps/s $\textbf{\color{#35bf28}+7.67\%}$
test_contiguous[memmap_tensor0] 18.8800μs 0.6699μs 1.4928 MOps/s 1.4951 MOps/s $\color{#d91a1a}-0.15\%$
test_stack[memmap_tensor0] 30.1810μs 5.0470μs 198.1387 KOps/s 177.6881 KOps/s $\textbf{\color{#35bf28}+11.51\%}$
test_memmaptd_index 80.3292ms 0.3836ms 2.6069 KOps/s 3.1752 KOps/s $\textbf{\color{#d91a1a}-17.90\%}$
test_memmaptd_index_astensor 0.6285ms 0.3446ms 2.9018 KOps/s 2.5870 KOps/s $\textbf{\color{#35bf28}+12.17\%}$
test_memmaptd_index_op 0.9169ms 0.6275ms 1.5936 KOps/s 1.4400 KOps/s $\textbf{\color{#35bf28}+10.66\%}$
test_serialize_model 99.3285ms 94.8956ms 10.5379 Ops/s 10.0873 Ops/s $\color{#35bf28}+4.47\%$
test_serialize_model_pickle 1.3471s 1.2358s 0.8092 Ops/s 0.8086 Ops/s $\color{#35bf28}+0.07\%$
test_serialize_weights 0.1792s 0.1024s 9.7631 Ops/s 9.1499 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_serialize_weights_returnearly 0.2853s 88.4362ms 11.3076 Ops/s 14.5608 Ops/s $\textbf{\color{#d91a1a}-22.34\%}$
test_serialize_weights_pickle 1.3495s 1.2353s 0.8095 Ops/s 0.8088 Ops/s $\color{#35bf28}+0.09\%$
test_reshape_pytree 56.4610μs 26.5299μs 37.6934 KOps/s 37.8630 KOps/s $\color{#d91a1a}-0.45\%$
test_reshape_td 0.1555ms 31.2715μs 31.9780 KOps/s 31.5759 KOps/s $\color{#35bf28}+1.27\%$
test_view_pytree 0.1735ms 26.3138μs 38.0029 KOps/s 38.4678 KOps/s $\color{#d91a1a}-1.21\%$
test_view_td 67.1910μs 35.5268μs 28.1478 KOps/s 26.8138 KOps/s $\color{#35bf28}+4.98\%$
test_unbind_pytree 73.7220μs 31.5785μs 31.6671 KOps/s 30.7892 KOps/s $\color{#35bf28}+2.85\%$
test_unbind_td 0.4406ms 39.4998μs 25.3166 KOps/s 23.7083 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_split_pytree 0.1556ms 35.9342μs 27.8286 KOps/s 26.7679 KOps/s $\color{#35bf28}+3.96\%$
test_split_td 0.1804ms 40.0559μs 24.9651 KOps/s 23.8012 KOps/s $\color{#35bf28}+4.89\%$
test_add_pytree 0.1736ms 40.3366μs 24.7914 KOps/s 24.8877 KOps/s $\color{#d91a1a}-0.39\%$
test_add_td 0.1876ms 52.0289μs 19.2201 KOps/s 20.1399 KOps/s $\color{#d91a1a}-4.57\%$
test_distributed 2.3497ms 75.9076μs 13.1739 KOps/s 10.5989 KOps/s $\textbf{\color{#35bf28}+24.29\%}$
test_tdmodule 28.3310μs 14.0226μs 71.3134 KOps/s 73.5420 KOps/s $\color{#d91a1a}-3.03\%$
test_tdmodule_dispatch 0.1611ms 27.7106μs 36.0873 KOps/s 36.8450 KOps/s $\color{#d91a1a}-2.06\%$
test_tdseq 25.5610μs 15.5234μs 64.4188 KOps/s 63.7447 KOps/s $\color{#35bf28}+1.06\%$
test_tdseq_dispatch 46.3110μs 30.2733μs 33.0324 KOps/s 33.0776 KOps/s $\color{#d91a1a}-0.14\%$
test_instantiation_functorch 1.5993ms 1.4296ms 699.5156 Ops/s 687.7319 Ops/s $\color{#35bf28}+1.71\%$
test_instantiation_td 1.4540ms 0.9827ms 1.0176 KOps/s 998.7197 Ops/s $\color{#35bf28}+1.89\%$
test_exec_functorch 0.2896ms 0.1454ms 6.8789 KOps/s 6.5852 KOps/s $\color{#35bf28}+4.46\%$
test_exec_functional_call 0.3389ms 0.1377ms 7.2618 KOps/s 7.1251 KOps/s $\color{#35bf28}+1.92\%$
test_exec_td 0.2962ms 0.1353ms 7.3884 KOps/s 7.1945 KOps/s $\color{#35bf28}+2.69\%$
test_exec_td_decorator 0.5822ms 0.2043ms 4.8949 KOps/s 4.7032 KOps/s $\color{#35bf28}+4.07\%$
test_vmap_mlp_speed[True-True] 0.7658ms 0.5746ms 1.7405 KOps/s 1.7254 KOps/s $\color{#35bf28}+0.87\%$
test_vmap_mlp_speed[True-False] 0.7437ms 0.5781ms 1.7298 KOps/s 1.7233 KOps/s $\color{#35bf28}+0.38\%$
test_vmap_mlp_speed[False-True] 0.6992ms 0.5139ms 1.9459 KOps/s 1.9396 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_mlp_speed[False-False] 0.6983ms 0.5156ms 1.9393 KOps/s 1.9446 KOps/s $\color{#d91a1a}-0.27\%$
test_vmap_mlp_speed_decorator[True-True] 1.5814ms 0.6369ms 1.5701 KOps/s 1.5511 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed_decorator[True-False] 0.8117ms 0.6378ms 1.5678 KOps/s 1.5646 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed_decorator[False-True] 0.7312ms 0.5732ms 1.7447 KOps/s 1.7547 KOps/s $\color{#d91a1a}-0.57\%$
test_vmap_mlp_speed_decorator[False-False] 0.7840ms 0.5630ms 1.7762 KOps/s 1.7564 KOps/s $\color{#35bf28}+1.12\%$
test_vmap_transformer_speed[True-True] 8.0988ms 7.6740ms 130.3098 Ops/s 128.4828 Ops/s $\color{#35bf28}+1.42\%$
test_vmap_transformer_speed[True-False] 7.8927ms 7.6349ms 130.9772 Ops/s 128.6842 Ops/s $\color{#35bf28}+1.78\%$
test_vmap_transformer_speed[False-True] 7.9596ms 7.5923ms 131.7122 Ops/s 128.8564 Ops/s $\color{#35bf28}+2.22\%$
test_vmap_transformer_speed[False-False] 8.0866ms 7.6460ms 130.7877 Ops/s 129.1978 Ops/s $\color{#35bf28}+1.23\%$
test_vmap_transformer_speed_decorator[True-True] 19.4575ms 18.7361ms 53.3729 Ops/s 52.6646 Ops/s $\color{#35bf28}+1.34\%$
test_vmap_transformer_speed_decorator[True-False] 19.5924ms 18.7688ms 53.2800 Ops/s 52.6122 Ops/s $\color{#35bf28}+1.27\%$
test_vmap_transformer_speed_decorator[False-True] 19.4503ms 18.6653ms 53.5755 Ops/s 52.8831 Ops/s $\color{#35bf28}+1.31\%$
test_vmap_transformer_speed_decorator[False-False] 19.5610ms 18.6410ms 53.6451 Ops/s 52.8930 Ops/s $\color{#35bf28}+1.42\%$
test_to_module_speed[True] 1.7952ms 1.4798ms 675.7497 Ops/s 659.9122 Ops/s $\color{#35bf28}+2.40\%$
test_to_module_speed[False] 1.5738ms 1.4727ms 679.0476 Ops/s 670.8579 Ops/s $\color{#35bf28}+1.22\%$
test_tc_init 96.8930μs 22.4361μs 44.5710 KOps/s 47.6597 KOps/s $\textbf{\color{#d91a1a}-6.48\%}$
test_tc_init_nested 71.1620μs 43.5710μs 22.9511 KOps/s 22.5499 KOps/s $\color{#35bf28}+1.78\%$
test_tc_first_layer_tensor 0.7630μs 0.3606μs 2.7731 MOps/s 2.8104 MOps/s $\color{#d91a1a}-1.33\%$
test_tc_first_layer_nontensor 4.9516μs 0.3953μs 2.5300 MOps/s 2.5959 MOps/s $\color{#d91a1a}-2.54\%$
test_tc_second_layer_tensor 15.6500μs 1.0815μs 924.6219 KOps/s 1.0106 MOps/s $\textbf{\color{#d91a1a}-8.51\%}$
test_tc_second_layer_nontensor 4.2400μs 0.8374μs 1.1942 MOps/s 1.2103 MOps/s $\color{#d91a1a}-1.33\%$
test_unbind 0.1124s 8.3827ms 119.2934 Ops/s 164.3177 Ops/s $\textbf{\color{#d91a1a}-27.40\%}$
test_full_like 14.5557ms 13.8155ms 72.3824 Ops/s 71.2242 Ops/s $\color{#35bf28}+1.63\%$
test_zeros_like 8.6442ms 7.9935ms 125.1009 Ops/s 124.2787 Ops/s $\color{#35bf28}+0.66\%$
test_ones_like 8.5396ms 8.0704ms 123.9090 Ops/s 123.3552 Ops/s $\color{#35bf28}+0.45\%$
test_clone 10.7874ms 9.9587ms 100.4149 Ops/s 100.6387 Ops/s $\color{#d91a1a}-0.22\%$
test_squeeze 0.1436ms 11.3146μs 88.3815 KOps/s 89.6299 KOps/s $\color{#d91a1a}-1.39\%$
test_unsqueeze 0.1856ms 51.7905μs 19.3085 KOps/s 18.9434 KOps/s $\color{#35bf28}+1.93\%$
test_split 0.2583ms 99.6921μs 10.0309 KOps/s 9.8180 KOps/s $\color{#35bf28}+2.17\%$
test_permute 0.1936ms 0.1099ms 9.1019 KOps/s 8.8721 KOps/s $\color{#35bf28}+2.59\%$
test_stack 29.2820ms 28.4766ms 35.1165 Ops/s 35.3405 Ops/s $\color{#d91a1a}-0.63\%$
test_cat 28.9358ms 28.3193ms 35.3116 Ops/s 35.3486 Ops/s $\color{#d91a1a}-0.10\%$
