[Feature] Change default interaction types to DETERMINISTIC #825

Merged (1 commit) on Jun 21, 2024

@vmoens (Contributor) commented Jun 21, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jun 21, 2024
@vmoens added the enhancement label and removed the CLA Signed label Jun 21, 2024
@vmoens merged commit ab1abac into main Jun 21, 2024
25 of 30 checks passed
@vmoens deleted the default-interaction-mode branch June 21, 2024 08:11

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}8$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 30.3970μs 17.0511μs 58.6473 KOps/s 57.5688 KOps/s $\color{#35bf28}+1.87\%$
test_plain_set_stack_nested 63.1570μs 17.3149μs 57.7538 KOps/s 56.3666 KOps/s $\color{#35bf28}+2.46\%$
test_plain_set_nested_inplace 61.1940μs 19.1002μs 52.3555 KOps/s 50.7869 KOps/s $\color{#35bf28}+3.09\%$
test_plain_set_stack_nested_inplace 67.5860μs 19.1649μs 52.1786 KOps/s 50.8353 KOps/s $\color{#35bf28}+2.64\%$
test_items 13.2650μs 2.8346μs 352.7841 KOps/s 376.8298 KOps/s $\textbf{\color{#d91a1a}-6.38\%}$
test_items_nested 0.4471ms 0.2662ms 3.7571 KOps/s 3.7372 KOps/s $\color{#35bf28}+0.53\%$
test_items_nested_locked 2.2593ms 0.2717ms 3.6810 KOps/s 3.7796 KOps/s $\color{#d91a1a}-2.61\%$
test_items_nested_leaf 0.1324ms 75.6307μs 13.2221 KOps/s 12.6381 KOps/s $\color{#35bf28}+4.62\%$
test_items_stack_nested 0.5698ms 0.2691ms 3.7166 KOps/s 3.7600 KOps/s $\color{#d91a1a}-1.16\%$
test_items_stack_nested_leaf 0.1541ms 75.9287μs 13.1703 KOps/s 13.0427 KOps/s $\color{#35bf28}+0.98\%$
test_items_stack_nested_locked 0.9212ms 0.2687ms 3.7209 KOps/s 3.7852 KOps/s $\color{#d91a1a}-1.70\%$
test_keys 22.0710μs 3.8601μs 259.0575 KOps/s 240.7324 KOps/s $\textbf{\color{#35bf28}+7.61\%}$
test_keys_nested 0.2721ms 0.1360ms 7.3518 KOps/s 7.2977 KOps/s $\color{#35bf28}+0.74\%$
test_keys_nested_locked 0.6960ms 0.1412ms 7.0832 KOps/s 7.0365 KOps/s $\color{#35bf28}+0.66\%$
test_keys_nested_leaf 0.2356ms 0.1171ms 8.5376 KOps/s 8.5102 KOps/s $\color{#35bf28}+0.32\%$
test_keys_stack_nested 0.2762ms 0.1359ms 7.3560 KOps/s 7.2714 KOps/s $\color{#35bf28}+1.16\%$
test_keys_stack_nested_leaf 0.2000ms 0.1158ms 8.6345 KOps/s 8.5430 KOps/s $\color{#35bf28}+1.07\%$
test_keys_stack_nested_locked 0.2824ms 0.1403ms 7.1285 KOps/s 7.0644 KOps/s $\color{#35bf28}+0.91\%$
test_values 8.1172μs 1.1461μs 872.5176 KOps/s 830.8407 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_values_nested 0.1029ms 50.7505μs 19.7042 KOps/s 19.6187 KOps/s $\color{#35bf28}+0.44\%$
test_values_nested_locked 91.2900μs 50.2955μs 19.8825 KOps/s 19.7855 KOps/s $\color{#35bf28}+0.49\%$
test_values_nested_leaf 92.3920μs 45.9784μs 21.7493 KOps/s 21.6910 KOps/s $\color{#35bf28}+0.27\%$
test_values_stack_nested 0.1308ms 51.2057μs 19.5291 KOps/s 19.4763 KOps/s $\color{#35bf28}+0.27\%$
test_values_stack_nested_leaf 91.9910μs 45.0029μs 22.2208 KOps/s 21.5930 KOps/s $\color{#35bf28}+2.91\%$
test_values_stack_nested_locked 0.1578ms 51.4493μs 19.4366 KOps/s 19.5708 KOps/s $\color{#d91a1a}-0.69\%$
test_membership 93.2140μs 1.3328μs 750.2900 KOps/s 752.4965 KOps/s $\color{#d91a1a}-0.29\%$
test_membership_nested 41.2170μs 3.5227μs 283.8758 KOps/s 287.3479 KOps/s $\color{#d91a1a}-1.21\%$
test_membership_nested_leaf 25.2670μs 3.5213μs 283.9851 KOps/s 290.6875 KOps/s $\color{#d91a1a}-2.31\%$
test_membership_stacked_nested 44.3730μs 3.4786μs 287.4743 KOps/s 295.1823 KOps/s $\color{#d91a1a}-2.61\%$
test_membership_stacked_nested_leaf 22.0010μs 3.4621μs 288.8441 KOps/s 288.2037 KOps/s $\color{#35bf28}+0.22\%$
test_membership_nested_last 38.0410μs 4.3250μs 231.2143 KOps/s 242.2479 KOps/s $\color{#d91a1a}-4.55\%$
test_membership_nested_leaf_last 31.3480μs 4.3341μs 230.7275 KOps/s 239.5328 KOps/s $\color{#d91a1a}-3.68\%$
test_membership_stacked_nested_last 45.4150μs 4.3294μs 230.9813 KOps/s 238.1456 KOps/s $\color{#d91a1a}-3.01\%$
test_membership_stacked_nested_leaf_last 25.2880μs 4.3185μs 231.5596 KOps/s 238.1788 KOps/s $\color{#d91a1a}-2.78\%$
test_nested_getleaf 97.5920μs 10.7907μs 92.6724 KOps/s 95.1184 KOps/s $\color{#d91a1a}-2.57\%$
test_nested_get 0.1427ms 10.1655μs 98.3723 KOps/s 100.1211 KOps/s $\color{#d91a1a}-1.75\%$
test_stacked_getleaf 36.1680μs 10.5292μs 94.9738 KOps/s 94.9928 KOps/s $\color{#d91a1a}-0.02\%$
test_stacked_get 46.9280μs 9.8649μs 101.3690 KOps/s 100.1780 KOps/s $\color{#35bf28}+1.19\%$
test_nested_getitemleaf 52.1670μs 11.2614μs 88.7991 KOps/s 88.8608 KOps/s $\color{#d91a1a}-0.07\%$
test_nested_getitem 76.4420μs 10.2625μs 97.4421 KOps/s 95.7733 KOps/s $\color{#35bf28}+1.74\%$
test_stacked_getitemleaf 42.7400μs 11.1620μs 89.5897 KOps/s 88.7796 KOps/s $\color{#35bf28}+0.91\%$
test_stacked_getitem 51.8770μs 10.1855μs 98.1787 KOps/s 96.0697 KOps/s $\color{#35bf28}+2.20\%$
test_lock_nested 50.9963ms 0.3887ms 2.5729 KOps/s 2.9981 KOps/s $\textbf{\color{#d91a1a}-14.18\%}$
test_lock_stack_nested 0.4275ms 0.3002ms 3.3308 KOps/s 3.2377 KOps/s $\color{#35bf28}+2.87\%$
test_unlock_nested 0.6783ms 0.3443ms 2.9041 KOps/s 2.9179 KOps/s $\color{#d91a1a}-0.47\%$
test_unlock_stack_nested 0.6558ms 0.3107ms 3.2187 KOps/s 3.1366 KOps/s $\color{#35bf28}+2.62\%$
test_flatten_speed 0.1792ms 96.6575μs 10.3458 KOps/s 10.5485 KOps/s $\color{#d91a1a}-1.92\%$
test_unflatten_speed 0.6996ms 0.4189ms 2.3869 KOps/s 2.4465 KOps/s $\color{#d91a1a}-2.43\%$
test_common_ops 1.5518ms 0.7375ms 1.3559 KOps/s 1.3756 KOps/s $\color{#d91a1a}-1.43\%$
test_creation 21.5000μs 1.9142μs 522.4181 KOps/s 519.9199 KOps/s $\color{#35bf28}+0.48\%$
test_creation_empty 35.5260μs 10.8835μs 91.8822 KOps/s 89.6908 KOps/s $\color{#35bf28}+2.44\%$
test_creation_nested_1 33.7230μs 13.5958μs 73.5521 KOps/s 71.9484 KOps/s $\color{#35bf28}+2.23\%$
test_creation_nested_2 55.0330μs 17.1381μs 58.3494 KOps/s 57.6918 KOps/s $\color{#35bf28}+1.14\%$
test_clone 0.1391ms 13.3397μs 74.9644 KOps/s 75.2847 KOps/s $\color{#d91a1a}-0.43\%$
test_getitem[int] 38.3110μs 11.7188μs 85.3326 KOps/s 89.2648 KOps/s $\color{#d91a1a}-4.41\%$
test_getitem[slice_int] 73.9480μs 23.0426μs 43.3979 KOps/s 45.8516 KOps/s $\textbf{\color{#d91a1a}-5.35\%}$
test_getitem[range] 77.0640μs 57.0244μs 17.5364 KOps/s 14.7867 KOps/s $\textbf{\color{#35bf28}+18.60\%}$
test_getitem[tuple] 43.6810μs 19.4944μs 51.2969 KOps/s 54.5857 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_getitem[list] 0.1579ms 40.6120μs 24.6233 KOps/s 24.7993 KOps/s $\color{#d91a1a}-0.71\%$
test_setitem_dim[int] 59.4410μs 35.4649μs 28.1969 KOps/s 29.1502 KOps/s $\color{#d91a1a}-3.27\%$
test_setitem_dim[slice_int] 0.1397ms 61.8696μs 16.1630 KOps/s 16.1065 KOps/s $\color{#35bf28}+0.35\%$
test_setitem_dim[range] 0.1603ms 83.4173μs 11.9879 KOps/s 11.8076 KOps/s $\color{#35bf28}+1.53\%$
test_setitem_dim[tuple] 89.3370μs 49.6863μs 20.1263 KOps/s 19.9622 KOps/s $\color{#35bf28}+0.82\%$
test_setitem 53.4890μs 20.1133μs 49.7182 KOps/s 48.4864 KOps/s $\color{#35bf28}+2.54\%$
test_set 69.6900μs 19.8829μs 50.2944 KOps/s 49.2510 KOps/s $\color{#35bf28}+2.12\%$
test_set_shared 3.6468ms 0.1417ms 7.0556 KOps/s 7.1756 KOps/s $\color{#d91a1a}-1.67\%$
test_update 95.9980μs 22.8100μs 43.8404 KOps/s 43.4505 KOps/s $\color{#35bf28}+0.90\%$
test_update_nested 85.3490μs 31.6305μs 31.6150 KOps/s 31.8858 KOps/s $\color{#d91a1a}-0.85\%$
test_update__nested 93.9050μs 25.2320μs 39.6322 KOps/s 39.7980 KOps/s $\color{#d91a1a}-0.42\%$
test_set_nested 59.4210μs 21.5739μs 46.3522 KOps/s 45.8638 KOps/s $\color{#35bf28}+1.06\%$
test_set_nested_new 70.3910μs 25.9400μs 38.5505 KOps/s 38.3291 KOps/s $\color{#35bf28}+0.58\%$
test_select 0.1161ms 41.0725μs 24.3472 KOps/s 24.4869 KOps/s $\color{#d91a1a}-0.57\%$
test_select_nested 0.1627ms 60.4248μs 16.5495 KOps/s 16.3500 KOps/s $\color{#35bf28}+1.22\%$
test_exclude_nested 0.2199ms 0.1191ms 8.3954 KOps/s 8.3542 KOps/s $\color{#35bf28}+0.49\%$
test_empty[True] 0.7033ms 0.3901ms 2.5631 KOps/s 2.5349 KOps/s $\color{#35bf28}+1.11\%$
test_empty[False] 11.0455μs 1.1667μs 857.1142 KOps/s 866.5278 KOps/s $\color{#d91a1a}-1.09\%$
test_unbind_speed 0.3187ms 0.2542ms 3.9340 KOps/s 3.9255 KOps/s $\color{#35bf28}+0.22\%$
test_unbind_speed_stack0 0.3299ms 0.2435ms 4.1064 KOps/s 3.9308 KOps/s $\color{#35bf28}+4.47\%$
test_unbind_speed_stack1 1.1950ms 0.6246ms 1.6010 KOps/s 1.3168 KOps/s $\textbf{\color{#35bf28}+21.58\%}$
test_split 67.1306ms 1.5945ms 627.1655 Ops/s 625.2104 Ops/s $\color{#35bf28}+0.31\%$
test_chunk 65.7969ms 1.5996ms 625.1369 Ops/s 626.3175 Ops/s $\color{#d91a1a}-0.19\%$
test_creation[device0] 4.6398ms 84.5706μs 11.8244 KOps/s 12.0568 KOps/s $\color{#d91a1a}-1.93\%$
test_creation_from_tensor 0.1912ms 83.5236μs 11.9727 KOps/s 11.7420 KOps/s $\color{#35bf28}+1.96\%$
test_add_one[memmap_tensor0] 53.2090μs 5.3879μs 185.5994 KOps/s 190.3449 KOps/s $\color{#d91a1a}-2.49\%$
test_contiguous[memmap_tensor0] 29.4210μs 0.6322μs 1.5818 MOps/s 1.5596 MOps/s $\color{#35bf28}+1.42\%$
test_stack[memmap_tensor0] 37.1090μs 3.7074μs 269.7291 KOps/s 284.1295 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_memmaptd_index 0.9078ms 0.2573ms 3.8873 KOps/s 3.9758 KOps/s $\color{#d91a1a}-2.23\%$
test_memmaptd_index_astensor 0.7188ms 0.3292ms 3.0378 KOps/s 3.0763 KOps/s $\color{#d91a1a}-1.25\%$
test_memmaptd_index_op 0.9753ms 0.6134ms 1.6302 KOps/s 1.6387 KOps/s $\color{#d91a1a}-0.52\%$
test_serialize_model 0.1119s 0.1040s 9.6157 Ops/s 8.8787 Ops/s $\textbf{\color{#35bf28}+8.30\%}$
test_serialize_model_pickle 0.4484s 0.3754s 2.6641 Ops/s 2.6414 Ops/s $\color{#35bf28}+0.86\%$
test_serialize_weights 0.1678s 0.1138s 8.7864 Ops/s 8.9122 Ops/s $\color{#d91a1a}-1.41\%$
test_serialize_weights_returnearly 0.1880s 0.1311s 7.6289 Ops/s 8.0803 Ops/s $\textbf{\color{#d91a1a}-5.59\%}$
test_serialize_weights_pickle 0.7521s 0.5095s 1.9626 Ops/s 2.3430 Ops/s $\textbf{\color{#d91a1a}-16.24\%}$
test_serialize_weights_filesystem 97.6410ms 92.5245ms 10.8079 Ops/s 10.4176 Ops/s $\color{#35bf28}+3.75\%$
test_serialize_model_filesystem 98.6178ms 92.6198ms 10.7968 Ops/s 10.6114 Ops/s $\color{#35bf28}+1.75\%$
test_reshape_pytree 62.1050μs 25.6393μs 39.0026 KOps/s 39.0973 KOps/s $\color{#d91a1a}-0.24\%$
test_reshape_td 78.7870μs 35.0956μs 28.4936 KOps/s 29.3923 KOps/s $\color{#d91a1a}-3.06\%$
test_view_pytree 62.3760μs 25.2405μs 39.6189 KOps/s 38.7714 KOps/s $\color{#35bf28}+2.19\%$
test_view_td 90.1980μs 38.7601μs 25.7997 KOps/s 25.6774 KOps/s $\color{#35bf28}+0.48\%$
test_unbind_pytree 78.9070μs 29.5966μs 33.7877 KOps/s 34.0313 KOps/s $\color{#d91a1a}-0.72\%$
test_unbind_td 0.3709ms 37.8822μs 26.3976 KOps/s 26.1741 KOps/s $\color{#35bf28}+0.85\%$
test_split_pytree 70.3810μs 29.3721μs 34.0460 KOps/s 33.8826 KOps/s $\color{#35bf28}+0.48\%$
test_split_td 0.1190ms 41.0382μs 24.3676 KOps/s 24.5853 KOps/s $\color{#d91a1a}-0.89\%$
test_add_pytree 96.0390μs 34.6705μs 28.8430 KOps/s 28.7699 KOps/s $\color{#35bf28}+0.25\%$
test_add_td 0.1104ms 54.1784μs 18.4575 KOps/s 17.9175 KOps/s $\color{#35bf28}+3.01\%$
test_distributed 0.1994ms 99.9925μs 10.0007 KOps/s 9.7699 KOps/s $\color{#35bf28}+2.36\%$
test_tdmodule 59.9810μs 18.4235μs 54.2786 KOps/s 53.2766 KOps/s $\color{#35bf28}+1.88\%$
test_tdmodule_dispatch 63.9290μs 34.9672μs 28.5982 KOps/s 28.2037 KOps/s $\color{#35bf28}+1.40\%$
test_tdseq 43.3900μs 20.6496μs 48.4271 KOps/s 49.7092 KOps/s $\color{#d91a1a}-2.58\%$
test_tdseq_dispatch 79.1180μs 40.5120μs 24.6841 KOps/s 24.9794 KOps/s $\color{#d91a1a}-1.18\%$
test_instantiation_functorch 1.7359ms 1.3082ms 764.4043 Ops/s 764.1546 Ops/s $\color{#35bf28}+0.03\%$
test_instantiation_td 1.9452ms 1.0337ms 967.3934 Ops/s 998.1335 Ops/s $\color{#d91a1a}-3.08\%$
test_exec_functorch 0.3655ms 0.1617ms 6.1829 KOps/s 6.2692 KOps/s $\color{#d91a1a}-1.38\%$
test_exec_functional_call 0.2968ms 0.1505ms 6.6425 KOps/s 6.8748 KOps/s $\color{#d91a1a}-3.38\%$
test_exec_td 0.2429ms 0.1446ms 6.9178 KOps/s 7.0017 KOps/s $\color{#d91a1a}-1.20\%$
test_exec_td_decorator 1.0128ms 0.2211ms 4.5223 KOps/s 4.5659 KOps/s $\color{#d91a1a}-0.96\%$
test_vmap_mlp_speed[True-True] 0.7957ms 0.4890ms 2.0449 KOps/s 2.1119 KOps/s $\color{#d91a1a}-3.17\%$
test_vmap_mlp_speed[True-False] 0.6109ms 0.4805ms 2.0812 KOps/s 2.1154 KOps/s $\color{#d91a1a}-1.62\%$
test_vmap_mlp_speed[False-True] 0.6544ms 0.3942ms 2.5368 KOps/s 2.6055 KOps/s $\color{#d91a1a}-2.64\%$
test_vmap_mlp_speed[False-False] 0.7364ms 0.3946ms 2.5342 KOps/s 2.6063 KOps/s $\color{#d91a1a}-2.77\%$
test_vmap_mlp_speed_decorator[True-True] 1.0571ms 0.5505ms 1.8166 KOps/s 1.8292 KOps/s $\color{#d91a1a}-0.69\%$
test_vmap_mlp_speed_decorator[True-False] 0.8667ms 0.5498ms 1.8190 KOps/s 1.8324 KOps/s $\color{#d91a1a}-0.74\%$
test_vmap_mlp_speed_decorator[False-True] 0.8395ms 0.4543ms 2.2010 KOps/s 2.2129 KOps/s $\color{#d91a1a}-0.54\%$
test_vmap_mlp_speed_decorator[False-False] 0.5468ms 0.4491ms 2.2265 KOps/s 2.0842 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_to_module_speed[True] 1.8147ms 1.6544ms 604.4615 Ops/s 589.1111 Ops/s $\color{#35bf28}+2.61\%$
test_to_module_speed[False] 2.6352ms 1.6509ms 605.7475 Ops/s 602.7772 Ops/s $\color{#35bf28}+0.49\%$
test_tc_init 79.0270μs 29.2569μs 34.1800 KOps/s 33.0124 KOps/s $\color{#35bf28}+3.54\%$
test_tc_init_nested 0.1358ms 61.7045μs 16.2063 KOps/s 15.9075 KOps/s $\color{#35bf28}+1.88\%$
test_tc_first_layer_tensor 5.3013μs 0.7288μs 1.3722 MOps/s 1.4152 MOps/s $\color{#d91a1a}-3.04\%$
test_tc_first_layer_nontensor 2.8048μs 0.7041μs 1.4202 MOps/s 1.4585 MOps/s $\color{#d91a1a}-2.63\%$
test_tc_second_layer_tensor 41.1560μs 1.9156μs 522.0262 KOps/s 548.2794 KOps/s $\color{#d91a1a}-4.79\%$
test_tc_second_layer_nontensor 14.3467μs 1.5780μs 633.7135 KOps/s 652.3074 KOps/s $\color{#d91a1a}-2.85\%$
test_unbind 77.9047ms 6.9218ms 144.4721 Ops/s 155.0452 Ops/s $\textbf{\color{#d91a1a}-6.82\%}$
test_full_like 17.1857ms 10.5727ms 94.5832 Ops/s 89.2780 Ops/s $\textbf{\color{#35bf28}+5.94\%}$
test_zeros_like 12.8766ms 5.8397ms 171.2429 Ops/s 155.2988 Ops/s $\textbf{\color{#35bf28}+10.27\%}$
test_ones_like 13.6867ms 5.9834ms 167.1287 Ops/s 164.3212 Ops/s $\color{#35bf28}+1.71\%$
test_clone 14.2848ms 7.8669ms 127.1150 Ops/s 129.8949 Ops/s $\color{#d91a1a}-2.14\%$
test_squeeze 80.5460μs 14.0284μs 71.2839 KOps/s 69.4391 KOps/s $\color{#35bf28}+2.66\%$
test_unsqueeze 0.1268ms 59.4007μs 16.8348 KOps/s 16.4072 KOps/s $\color{#35bf28}+2.61\%$
test_split 0.2024ms 0.1110ms 9.0130 KOps/s 8.9330 KOps/s $\color{#35bf28}+0.90\%$
test_permute 0.2467ms 0.1291ms 7.7465 KOps/s 7.8261 KOps/s $\color{#d91a1a}-1.02\%$
test_stack 26.5159ms 21.9803ms 45.4954 Ops/s 45.3817 Ops/s $\color{#35bf28}+0.25\%$
test_cat 29.3083ms 21.9382ms 45.5826 Ops/s 45.7830 Ops/s $\color{#d91a1a}-0.44\%$
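
The Change column in the table above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for three rows copied from the CPU table. The function names and the 5% significance threshold are assumptions made for illustration: the threshold is inferred from which rows are rendered in bold and from the Improved/Worsened counts, not taken from the benchmark tooling itself.

```python
# Illustrative sketch only: helper names and the 5% threshold are assumptions
# inferred from the table, not part of the benchmark bot's actual code.

def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a benchmark; rows past the (assumed) threshold are the bold ones."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the CPU table.
rows = {
    "test_plain_set_nested": (58.6473, 57.5688),
    "test_lock_nested": (2.5729, 2.9981),
    "test_getitem[range]": (17.5364, 14.7867),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
# test_plain_set_nested: +1.87% (unchanged)
# test_lock_nested: -14.18% (worsened)
# test_getitem[range]: +18.60% (improved)
```

Under this reading, only changes of at least 5% in either direction count toward the Improved and Worsened totals, which matches the 8 and 8 reported for the CPU run.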


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}11$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 29.4810μs 12.5405μs 79.7416 KOps/s 81.5303 KOps/s $\color{#d91a1a}-2.19\%$
test_plain_set_stack_nested 29.3620μs 12.6424μs 79.0987 KOps/s 80.4862 KOps/s $\color{#d91a1a}-1.72\%$
test_plain_set_nested_inplace 37.3620μs 13.9723μs 71.5700 KOps/s 73.1413 KOps/s $\color{#d91a1a}-2.15\%$
test_plain_set_stack_nested_inplace 38.0520μs 13.9998μs 71.4298 KOps/s 73.4070 KOps/s $\color{#d91a1a}-2.69\%$
test_items 21.3810μs 4.7844μs 209.0138 KOps/s 212.2155 KOps/s $\color{#d91a1a}-1.51\%$
test_items_nested 0.3774ms 0.3408ms 2.9344 KOps/s 2.9350 KOps/s $\color{#d91a1a}-0.02\%$
test_items_nested_locked 0.3711ms 0.3412ms 2.9312 KOps/s 2.9362 KOps/s $\color{#d91a1a}-0.17\%$
test_items_nested_leaf 0.1019ms 83.8194μs 11.9304 KOps/s 12.0114 KOps/s $\color{#d91a1a}-0.67\%$
test_items_stack_nested 0.3720ms 0.3427ms 2.9181 KOps/s 2.8981 KOps/s $\color{#35bf28}+0.69\%$
test_items_stack_nested_leaf 0.1068ms 83.0138μs 12.0462 KOps/s 11.8414 KOps/s $\color{#35bf28}+1.73\%$
test_items_stack_nested_locked 0.3682ms 0.3421ms 2.9229 KOps/s 2.9215 KOps/s $\color{#35bf28}+0.05\%$
test_keys 21.5310μs 4.3619μs 229.2593 KOps/s 230.4610 KOps/s $\color{#d91a1a}-0.52\%$
test_keys_nested 94.8050μs 66.7887μs 14.9726 KOps/s 14.8257 KOps/s $\color{#35bf28}+0.99\%$
test_keys_nested_locked 2.2985ms 71.3952μs 14.0065 KOps/s 13.6095 KOps/s $\color{#35bf28}+2.92\%$
test_keys_nested_leaf 78.2240μs 57.0279μs 17.5353 KOps/s 17.2201 KOps/s $\color{#35bf28}+1.83\%$
test_keys_stack_nested 82.7650μs 65.7934μs 15.1991 KOps/s 14.7346 KOps/s $\color{#35bf28}+3.15\%$
test_keys_stack_nested_leaf 80.0050μs 56.6508μs 17.6520 KOps/s 17.1560 KOps/s $\color{#35bf28}+2.89\%$
test_keys_stack_nested_locked 95.0760μs 71.0325μs 14.0781 KOps/s 13.8516 KOps/s $\color{#35bf28}+1.64\%$
test_values 8.5970μs 1.8185μs 549.8889 KOps/s 550.0303 KOps/s $\color{#d91a1a}-0.03\%$
test_values_nested 77.8750μs 35.5330μs 28.1429 KOps/s 28.0228 KOps/s $\color{#35bf28}+0.43\%$
test_values_nested_locked 58.0730μs 37.3110μs 26.8018 KOps/s 26.4840 KOps/s $\color{#35bf28}+1.20\%$
test_values_nested_leaf 54.2730μs 31.9597μs 31.2894 KOps/s 31.6171 KOps/s $\color{#d91a1a}-1.04\%$
test_values_stack_nested 57.9630μs 36.4451μs 27.4385 KOps/s 27.5727 KOps/s $\color{#d91a1a}-0.49\%$
test_values_stack_nested_leaf 66.1940μs 31.9893μs 31.2605 KOps/s 30.9496 KOps/s $\color{#35bf28}+1.00\%$
test_values_stack_nested_locked 58.2640μs 38.1734μs 26.1962 KOps/s 26.2072 KOps/s $\color{#d91a1a}-0.04\%$
test_membership 3.9016μs 0.7326μs 1.3650 MOps/s 1.3679 MOps/s $\color{#d91a1a}-0.21\%$
test_membership_nested 27.0720μs 2.6459μs 377.9499 KOps/s 389.1557 KOps/s $\color{#d91a1a}-2.88\%$
test_membership_nested_leaf 16.8210μs 2.6277μs 380.5618 KOps/s 389.3595 KOps/s $\color{#d91a1a}-2.26\%$
test_membership_stacked_nested 24.5220μs 2.6822μs 372.8317 KOps/s 388.8428 KOps/s $\color{#d91a1a}-4.12\%$
test_membership_stacked_nested_leaf 36.9830μs 2.6544μs 376.7392 KOps/s 393.5230 KOps/s $\color{#d91a1a}-4.27\%$
test_membership_nested_last 18.2010μs 3.1882μs 313.6520 KOps/s 319.9859 KOps/s $\color{#d91a1a}-1.98\%$
test_membership_nested_leaf_last 21.8220μs 3.1811μs 314.3596 KOps/s 318.5318 KOps/s $\color{#d91a1a}-1.31\%$
test_membership_stacked_nested_last 29.4020μs 9.7989μs 102.0519 KOps/s 321.5399 KOps/s $\textbf{\color{#d91a1a}-68.26\%}$
test_membership_stacked_nested_leaf_last 42.2720μs 9.7535μs 102.5271 KOps/s 319.4905 KOps/s $\textbf{\color{#d91a1a}-67.91\%}$
test_nested_getleaf 34.2920μs 8.4404μs 118.4779 KOps/s 118.0063 KOps/s $\color{#35bf28}+0.40\%$
test_nested_get 24.2620μs 7.9014μs 126.5592 KOps/s 125.8548 KOps/s $\color{#35bf28}+0.56\%$
test_stacked_getleaf 36.1720μs 8.3801μs 119.3304 KOps/s 117.0414 KOps/s $\color{#35bf28}+1.96\%$
test_stacked_get 71.8150μs 7.8778μs 126.9386 KOps/s 125.0924 KOps/s $\color{#35bf28}+1.48\%$
test_nested_getitemleaf 23.8220μs 8.5698μs 116.6888 KOps/s 115.4484 KOps/s $\color{#35bf28}+1.07\%$
test_nested_getitem 40.5530μs 8.0730μs 123.8704 KOps/s 123.4344 KOps/s $\color{#35bf28}+0.35\%$
test_stacked_getitemleaf 31.0420μs 8.6122μs 116.1141 KOps/s 115.0032 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_getitem 37.4120μs 8.0614μs 124.0481 KOps/s 122.5898 KOps/s $\color{#35bf28}+1.19\%$
test_lock_nested 57.9551ms 0.4266ms 2.3443 KOps/s 2.3691 KOps/s $\color{#d91a1a}-1.04\%$
test_lock_stack_nested 0.3554ms 0.3140ms 3.1848 KOps/s 3.0965 KOps/s $\color{#35bf28}+2.85\%$
test_unlock_nested 59.9135ms 0.4308ms 2.3212 KOps/s 2.3324 KOps/s $\color{#d91a1a}-0.48\%$
test_unlock_stack_nested 0.3670ms 0.3221ms 3.1046 KOps/s 3.0290 KOps/s $\color{#35bf28}+2.49\%$
test_flatten_speed 0.3589ms 0.1011ms 9.8874 KOps/s 9.7385 KOps/s $\color{#35bf28}+1.53\%$
test_unflatten_speed 0.3136ms 0.2921ms 3.4230 KOps/s 3.3520 KOps/s $\color{#35bf28}+2.12\%$
test_common_ops 1.0430ms 0.6083ms 1.6440 KOps/s 1.6598 KOps/s $\color{#d91a1a}-0.95\%$
test_creation 36.6420μs 1.6277μs 614.3704 KOps/s 604.3117 KOps/s $\color{#35bf28}+1.66\%$
test_creation_empty 22.5510μs 8.0118μs 124.8163 KOps/s 131.6774 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_creation_nested_1 30.9410μs 9.8230μs 101.8020 KOps/s 105.9375 KOps/s $\color{#d91a1a}-3.90\%$
test_creation_nested_2 40.7830μs 12.0776μs 82.7981 KOps/s 85.1690 KOps/s $\color{#d91a1a}-2.78\%$
test_clone 72.6250μs 12.8119μs 78.0522 KOps/s 77.0793 KOps/s $\color{#35bf28}+1.26\%$
test_getitem[int] 31.5710μs 11.7741μs 84.9319 KOps/s 86.0894 KOps/s $\color{#d91a1a}-1.34\%$
test_getitem[slice_int] 39.9320μs 22.8013μs 43.8571 KOps/s 44.7274 KOps/s $\color{#d91a1a}-1.95\%$
test_getitem[range] 68.7840μs 50.2010μs 19.9199 KOps/s 19.9916 KOps/s $\color{#d91a1a}-0.36\%$
test_getitem[tuple] 40.7930μs 20.3629μs 49.1088 KOps/s 49.5470 KOps/s $\color{#d91a1a}-0.88\%$
test_getitem[list] 0.1195ms 37.4132μs 26.7285 KOps/s 27.0098 KOps/s $\color{#d91a1a}-1.04\%$
test_setitem_dim[int] 47.0130μs 29.8616μs 33.4878 KOps/s 32.6655 KOps/s $\color{#35bf28}+2.52\%$
test_setitem_dim[slice_int] 70.6140μs 51.8846μs 19.2735 KOps/s 19.4139 KOps/s $\color{#d91a1a}-0.72\%$
test_setitem_dim[range] 90.9350μs 70.2942μs 14.2259 KOps/s 14.0593 KOps/s $\color{#35bf28}+1.19\%$
test_setitem_dim[tuple] 62.8540μs 45.3573μs 22.0472 KOps/s 22.1154 KOps/s $\color{#d91a1a}-0.31\%$
test_setitem 53.4530μs 17.6337μs 56.7096 KOps/s 57.9319 KOps/s $\color{#d91a1a}-2.11\%$
test_set 53.6430μs 17.1996μs 58.1410 KOps/s 59.5321 KOps/s $\color{#d91a1a}-2.34\%$
test_set_shared 1.4848ms 0.1055ms 9.4747 KOps/s 9.6918 KOps/s $\color{#d91a1a}-2.24\%$
test_update 73.4240μs 19.2919μs 51.8351 KOps/s 53.9651 KOps/s $\color{#d91a1a}-3.95\%$
test_update_nested 58.5340μs 24.9572μs 40.0686 KOps/s 40.7465 KOps/s $\color{#d91a1a}-1.66\%$
test_update__nested 58.2130μs 24.0739μs 41.5387 KOps/s 41.3524 KOps/s $\color{#35bf28}+0.45\%$
test_set_nested 66.3240μs 18.3418μs 54.5203 KOps/s 56.7872 KOps/s $\color{#d91a1a}-3.99\%$
test_set_nested_new 66.9840μs 21.3797μs 46.7734 KOps/s 48.1040 KOps/s $\color{#d91a1a}-2.77\%$
test_select 65.1840μs 34.5870μs 28.9126 KOps/s 29.5249 KOps/s $\color{#d91a1a}-2.07\%$
test_select_nested 0.9658ms 54.4585μs 18.3626 KOps/s 18.1945 KOps/s $\color{#35bf28}+0.92\%$
test_exclude_nested 0.1405ms 0.1117ms 8.9536 KOps/s 9.0218 KOps/s $\color{#d91a1a}-0.76\%$
test_empty[True] 0.4069ms 0.3492ms 2.8641 KOps/s 2.8451 KOps/s $\color{#35bf28}+0.67\%$
test_empty[False] 2.7232μs 0.9283μs 1.0772 MOps/s 1.0726 MOps/s $\color{#35bf28}+0.43\%$
test_to 0.1039ms 80.3559μs 12.4446 KOps/s 12.8205 KOps/s $\color{#d91a1a}-2.93\%$
test_to_nonblocking 0.1010ms 64.6449μs 15.4691 KOps/s 15.4837 KOps/s $\color{#d91a1a}-0.09\%$
test_unbind_speed 0.3190ms 0.2865ms 3.4909 KOps/s 3.5250 KOps/s $\color{#d91a1a}-0.97\%$
test_unbind_speed_stack0 0.3435ms 0.2812ms 3.5568 KOps/s 3.4566 KOps/s $\color{#35bf28}+2.90\%$
test_unbind_speed_stack1 76.1029ms 0.8571ms 1.1667 KOps/s 1.1561 KOps/s $\color{#35bf28}+0.92\%$
test_split 1.7880ms 1.7451ms 573.0326 Ops/s 543.6315 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_chunk 76.9690ms 1.8760ms 533.0376 Ops/s 584.4175 Ops/s $\textbf{\color{#d91a1a}-8.79\%}$
test_creation[device0] 0.1404ms 60.4998μs 16.5290 KOps/s 16.5984 KOps/s $\color{#d91a1a}-0.42\%$
test_creation_from_tensor 0.1320ms 55.1561μs 18.1304 KOps/s 18.1454 KOps/s $\color{#d91a1a}-0.08\%$
test_add_one[memmap_tensor0] 91.3260μs 7.8259μs 127.7812 KOps/s 133.3366 KOps/s $\color{#d91a1a}-4.17\%$
test_contiguous[memmap_tensor0] 13.8710μs 0.6930μs 1.4430 MOps/s 1.4918 MOps/s $\color{#d91a1a}-3.27\%$
test_stack[memmap_tensor0] 30.9020μs 5.4244μs 184.3538 KOps/s 196.1426 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_memmaptd_index 1.0785ms 0.3130ms 3.1945 KOps/s 3.2570 KOps/s $\color{#d91a1a}-1.92\%$
test_memmaptd_index_astensor 0.7372ms 0.3854ms 2.5949 KOps/s 2.6404 KOps/s $\color{#d91a1a}-1.72\%$
test_memmaptd_index_op 1.1217ms 0.7082ms 1.4120 KOps/s 1.4658 KOps/s $\color{#d91a1a}-3.67\%$
test_serialize_model 0.1848s 0.1112s 8.9946 Ops/s 8.6716 Ops/s $\color{#35bf28}+3.72\%$
test_serialize_model_pickle 1.3494s 1.2354s 0.8094 Ops/s 0.8082 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_weights 0.1847s 0.1090s 9.1744 Ops/s 9.5547 Ops/s $\color{#d91a1a}-3.98\%$
test_serialize_weights_returnearly 0.1855s 94.9300ms 10.5341 Ops/s 10.7621 Ops/s $\color{#d91a1a}-2.12\%$
test_serialize_weights_pickle 1.3995s 1.2544s 0.7972 Ops/s 0.7969 Ops/s $\color{#35bf28}+0.03\%$
test_reshape_pytree 57.0430μs 27.1652μs 36.8118 KOps/s 36.9734 KOps/s $\color{#d91a1a}-0.44\%$
test_reshape_td 69.7650μs 32.3004μs 30.9594 KOps/s 29.1277 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_view_pytree 55.2630μs 26.5316μs 37.6909 KOps/s 36.4579 KOps/s $\color{#35bf28}+3.38\%$
test_view_td 66.0140μs 36.2118μs 27.6153 KOps/s 26.6664 KOps/s $\color{#35bf28}+3.56\%$
test_unbind_pytree 64.8140μs 35.1546μs 28.4458 KOps/s 28.8164 KOps/s $\color{#d91a1a}-1.29\%$
test_unbind_td 0.4250ms 43.9189μs 22.7692 KOps/s 23.0873 KOps/s $\color{#d91a1a}-1.38\%$
test_split_pytree 70.2440μs 39.1040μs 25.5729 KOps/s 27.3688 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_split_td 0.4397ms 45.6961μs 21.8837 KOps/s 23.7628 KOps/s $\textbf{\color{#d91a1a}-7.91\%}$
test_add_pytree 86.5450μs 41.5006μs 24.0961 KOps/s 24.7435 KOps/s $\color{#d91a1a}-2.62\%$
test_add_td 93.7460μs 52.2719μs 19.1307 KOps/s 18.4182 KOps/s $\color{#35bf28}+3.87\%$
test_distributed 0.1836ms 70.0558μs 14.2743 KOps/s 14.5078 KOps/s $\color{#d91a1a}-1.61\%$
test_tdmodule 24.0920μs 14.5755μs 68.6085 KOps/s 69.4796 KOps/s $\color{#d91a1a}-1.25\%$
test_tdmodule_dispatch 50.8130μs 29.1118μs 34.3503 KOps/s 33.9787 KOps/s $\color{#35bf28}+1.09\%$
test_tdseq 34.5320μs 16.6195μs 60.1704 KOps/s 57.9037 KOps/s $\color{#35bf28}+3.91\%$
test_tdseq_dispatch 49.0830μs 31.8315μs 31.4154 KOps/s 30.0600 KOps/s $\color{#35bf28}+4.51\%$
test_instantiation_functorch 78.4400ms 1.7418ms 574.1219 Ops/s 628.5091 Ops/s $\textbf{\color{#d91a1a}-8.65\%}$
test_instantiation_td 1.6348ms 1.0715ms 933.2720 Ops/s 848.4641 Ops/s $\textbf{\color{#35bf28}+10.00\%}$
test_exec_functorch 0.1906ms 0.1620ms 6.1719 KOps/s 6.1755 KOps/s $\color{#d91a1a}-0.06\%$
test_exec_functional_call 0.2047ms 0.1485ms 6.7363 KOps/s 6.6017 KOps/s $\color{#35bf28}+2.04\%$
test_exec_td 0.2007ms 0.1469ms 6.8079 KOps/s 6.3840 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_exec_td_decorator 0.5745ms 0.2225ms 4.4945 KOps/s 4.5038 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed[True-True] 0.8164ms 0.6328ms 1.5802 KOps/s 1.6409 KOps/s $\color{#d91a1a}-3.70\%$
test_vmap_mlp_speed[True-False] 0.6723ms 0.6059ms 1.6505 KOps/s 1.6286 KOps/s $\color{#35bf28}+1.35\%$
test_vmap_mlp_speed[False-True] 0.6088ms 0.5414ms 1.8469 KOps/s 1.7655 KOps/s $\color{#35bf28}+4.61\%$
test_vmap_mlp_speed[False-False] 0.5698ms 0.5409ms 1.8489 KOps/s 1.7644 KOps/s $\color{#35bf28}+4.79\%$
test_vmap_mlp_speed_decorator[True-True] 0.8000ms 0.6702ms 1.4921 KOps/s 1.4206 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_vmap_mlp_speed_decorator[True-False] 0.9495ms 0.6694ms 1.4939 KOps/s 1.4169 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_vmap_mlp_speed_decorator[False-True] 0.7044ms 0.5974ms 1.6739 KOps/s 1.5949 KOps/s $\color{#35bf28}+4.96\%$
test_vmap_mlp_speed_decorator[False-False] 0.7112ms 0.5952ms 1.6800 KOps/s 1.5922 KOps/s $\textbf{\color{#35bf28}+5.51\%}$
test_vmap_transformer_speed[True-True] 8.2336ms 8.1266ms 123.0519 Ops/s 120.9128 Ops/s $\color{#35bf28}+1.77\%$
test_vmap_transformer_speed[True-False] 8.1602ms 8.1093ms 123.3159 Ops/s 121.8925 Ops/s $\color{#35bf28}+1.17\%$
test_vmap_transformer_speed[False-True] 8.2777ms 8.0605ms 124.0618 Ops/s 123.1013 Ops/s $\color{#35bf28}+0.78\%$
test_vmap_transformer_speed[False-False] 8.4172ms 8.0495ms 124.2308 Ops/s 121.0656 Ops/s $\color{#35bf28}+2.61\%$
test_vmap_transformer_speed_decorator[True-True] 20.2141ms 19.6738ms 50.8291 Ops/s 50.6731 Ops/s $\color{#35bf28}+0.31\%$
test_vmap_transformer_speed_decorator[True-False] 20.7938ms 19.6856ms 50.7986 Ops/s 50.5329 Ops/s $\color{#35bf28}+0.53\%$
test_vmap_transformer_speed_decorator[False-True] 19.6082ms 19.5051ms 51.2686 Ops/s 50.3202 Ops/s $\color{#35bf28}+1.88\%$
test_vmap_transformer_speed_decorator[False-False] 19.6342ms 19.5514ms 51.1474 Ops/s 50.9094 Ops/s $\color{#35bf28}+0.47\%$
test_to_module_speed[True] 2.8853ms 1.5443ms 647.5364 Ops/s 641.5611 Ops/s $\color{#35bf28}+0.93\%$
test_to_module_speed[False] 1.9507ms 1.5238ms 656.2361 Ops/s 645.8572 Ops/s $\color{#35bf28}+1.61\%$
test_tc_init 50.1930μs 23.4708μs 42.6062 KOps/s 44.4418 KOps/s $\color{#d91a1a}-4.13\%$
test_tc_init_nested 74.3940μs 47.2722μs 21.1541 KOps/s 22.8199 KOps/s $\textbf{\color{#d91a1a}-7.30\%}$
test_tc_first_layer_tensor 3.0839μs 0.3623μs 2.7602 MOps/s 2.7639 MOps/s $\color{#d91a1a}-0.14\%$
test_tc_first_layer_nontensor 1.6324μs 0.3921μs 2.5506 MOps/s 2.5902 MOps/s $\color{#d91a1a}-1.53\%$
test_tc_second_layer_tensor 5.8544μs 0.9597μs 1.0420 MOps/s 933.6102 KOps/s $\textbf{\color{#35bf28}+11.60\%}$
test_tc_second_layer_nontensor 2.2861μs 0.8132μs 1.2297 MOps/s 1.1932 MOps/s $\color{#35bf28}+3.06\%$
test_unbind 0.1099s 7.0844ms 141.1546 Ops/s 114.6822 Ops/s $\textbf{\color{#35bf28}+23.08\%}$
test_full_like 13.3693ms 12.9864ms 77.0038 Ops/s 86.3611 Ops/s $\textbf{\color{#d91a1a}-10.84\%}$
test_zeros_like 8.0148ms 7.8580ms 127.2589 Ops/s 127.7247 Ops/s $\color{#d91a1a}-0.36\%$
test_ones_like 8.1169ms 7.8687ms 127.0853 Ops/s 128.3857 Ops/s $\color{#d91a1a}-1.01\%$
test_clone 9.9889ms 9.4542ms 105.7730 Ops/s 107.4047 Ops/s $\color{#d91a1a}-1.52\%$
test_squeeze 81.0250μs 11.0632μs 90.3900 KOps/s 89.2887 KOps/s $\color{#35bf28}+1.23\%$
test_unsqueeze 97.1060μs 55.1935μs 18.1181 KOps/s 18.7852 KOps/s $\color{#d91a1a}-3.55\%$
test_split 0.1992ms 0.1046ms 9.5582 KOps/s 9.4710 KOps/s $\color{#35bf28}+0.92\%$
test_permute 0.1669ms 0.1201ms 8.3269 KOps/s 8.8124 KOps/s $\textbf{\color{#d91a1a}-5.51\%}$
test_stack 27.6109ms 27.0724ms 36.9380 Ops/s 37.1302 Ops/s $\color{#d91a1a}-0.52\%$
test_cat 27.4863ms 27.0015ms 37.0350 Ops/s 37.1421 Ops/s $\color{#d91a1a}-0.29\%$
