[BugFix] fix tensorclass set #854

Merged: vmoens merged 1 commit into main from fix-tensorclass-set on Jul 5, 2024

Conversation

@vmoens (Contributor) commented Jul 5, 2024

No description provided.

@facebook-github-bot added the "CLA Signed" label on Jul 5, 2024
@vmoens added the "bug" label on Jul 5, 2024
@vmoens merged commit 60d8a61 into main on Jul 5, 2024
28 of 38 checks passed
@vmoens deleted the fix-tensorclass-set branch on July 5, 2024 at 10:57
github-actions bot commented Jul 5, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 38.9720μs 17.0447μs 58.6692 KOps/s 58.1698 KOps/s $\color{#35bf28}+0.86\%$
test_plain_set_stack_nested 41.1170μs 17.4423μs 57.3320 KOps/s 57.3406 KOps/s $\color{#d91a1a}-0.01\%$
test_plain_set_nested_inplace 54.5520μs 19.4819μs 51.3296 KOps/s 47.4723 KOps/s $\textbf{\color{#35bf28}+8.13\%}$
test_plain_set_stack_nested_inplace 51.2560μs 19.6777μs 50.8191 KOps/s 50.9982 KOps/s $\color{#d91a1a}-0.35\%$
test_items 25.4180μs 2.5610μs 390.4676 KOps/s 383.4313 KOps/s $\color{#35bf28}+1.84\%$
test_items_nested 0.7706ms 0.2909ms 3.4380 KOps/s 3.6096 KOps/s $\color{#d91a1a}-4.75\%$
test_items_nested_locked 0.7973ms 0.2898ms 3.4505 KOps/s 3.6189 KOps/s $\color{#d91a1a}-4.65\%$
test_items_nested_leaf 0.1401ms 79.9657μs 12.5054 KOps/s 12.3612 KOps/s $\color{#35bf28}+1.17\%$
test_items_stack_nested 0.4340ms 0.2906ms 3.4414 KOps/s 3.5347 KOps/s $\color{#d91a1a}-2.64\%$
test_items_stack_nested_leaf 0.1278ms 79.7727μs 12.5356 KOps/s 12.4673 KOps/s $\color{#35bf28}+0.55\%$
test_items_stack_nested_locked 0.9620ms 0.2862ms 3.4938 KOps/s 3.6324 KOps/s $\color{#d91a1a}-3.82\%$
test_keys 39.0630μs 3.8193μs 261.8296 KOps/s 263.5583 KOps/s $\color{#d91a1a}-0.66\%$
test_keys_nested 0.2419ms 0.1397ms 7.1557 KOps/s 7.1759 KOps/s $\color{#d91a1a}-0.28\%$
test_keys_nested_locked 0.7762ms 0.1442ms 6.9363 KOps/s 6.9062 KOps/s $\color{#35bf28}+0.44\%$
test_keys_nested_leaf 0.2004ms 0.1186ms 8.4286 KOps/s 8.4676 KOps/s $\color{#d91a1a}-0.46\%$
test_keys_stack_nested 0.6427ms 0.1437ms 6.9601 KOps/s 7.1835 KOps/s $\color{#d91a1a}-3.11\%$
test_keys_stack_nested_leaf 0.2069ms 0.1160ms 8.6224 KOps/s 8.4660 KOps/s $\color{#35bf28}+1.85\%$
test_keys_stack_nested_locked 0.2582ms 0.1415ms 7.0677 KOps/s 7.0006 KOps/s $\color{#35bf28}+0.96\%$
test_values 6.9570μs 1.1635μs 859.4545 KOps/s 860.7141 KOps/s $\color{#d91a1a}-0.15\%$
test_values_nested 0.1013ms 53.2745μs 18.7707 KOps/s 19.7069 KOps/s $\color{#d91a1a}-4.75\%$
test_values_nested_locked 0.3009ms 52.1465μs 19.1767 KOps/s 18.9451 KOps/s $\color{#35bf28}+1.22\%$
test_values_nested_leaf 87.3430μs 47.7826μs 20.9281 KOps/s 21.8136 KOps/s $\color{#d91a1a}-4.06\%$
test_values_stack_nested 0.1966ms 53.1679μs 18.8083 KOps/s 19.4933 KOps/s $\color{#d91a1a}-3.51\%$
test_values_stack_nested_leaf 91.8910μs 47.6639μs 20.9802 KOps/s 21.8074 KOps/s $\color{#d91a1a}-3.79\%$
test_values_stack_nested_locked 0.1368ms 53.6921μs 18.6247 KOps/s 19.2872 KOps/s $\color{#d91a1a}-3.43\%$
test_membership 14.2270μs 1.3736μs 727.9892 KOps/s 730.1718 KOps/s $\color{#d91a1a}-0.30\%$
test_membership_nested 38.1010μs 3.4091μs 293.3325 KOps/s 282.0785 KOps/s $\color{#35bf28}+3.99\%$
test_membership_nested_leaf 52.9680μs 3.4455μs 290.2361 KOps/s 284.0741 KOps/s $\color{#35bf28}+2.17\%$
test_membership_stacked_nested 19.2760μs 3.4068μs 293.5293 KOps/s 268.8203 KOps/s $\textbf{\color{#35bf28}+9.19\%}$
test_membership_stacked_nested_leaf 20.3380μs 3.4332μs 291.2693 KOps/s 285.6686 KOps/s $\color{#35bf28}+1.96\%$
test_membership_nested_last 23.6640μs 4.1864μs 238.8690 KOps/s 238.9223 KOps/s $\color{#d91a1a}-0.02\%$
test_membership_nested_leaf_last 33.1920μs 4.2033μs 237.9065 KOps/s 238.5266 KOps/s $\color{#d91a1a}-0.26\%$
test_membership_stacked_nested_last 26.8500μs 5.2906μs 189.0153 KOps/s 237.5619 KOps/s $\textbf{\color{#d91a1a}-20.44\%}$
test_membership_stacked_nested_leaf_last 30.0970μs 5.3702μs 186.2130 KOps/s 233.3752 KOps/s $\textbf{\color{#d91a1a}-20.21\%}$
test_nested_getleaf 34.5450μs 10.9552μs 91.2813 KOps/s 94.1394 KOps/s $\color{#d91a1a}-3.04\%$
test_nested_get 36.6590μs 10.4468μs 95.7234 KOps/s 98.5885 KOps/s $\color{#d91a1a}-2.91\%$
test_stacked_getleaf 29.9650μs 10.8043μs 92.5556 KOps/s 93.3781 KOps/s $\color{#d91a1a}-0.88\%$
test_stacked_get 41.5580μs 10.2327μs 97.7255 KOps/s 98.6858 KOps/s $\color{#d91a1a}-0.97\%$
test_nested_getitemleaf 33.3530μs 11.5929μs 86.2594 KOps/s 88.5812 KOps/s $\color{#d91a1a}-2.62\%$
test_nested_getitem 44.2530μs 10.9157μs 91.6112 KOps/s 95.7400 KOps/s $\color{#d91a1a}-4.31\%$
test_stacked_getitemleaf 34.5650μs 11.5044μs 86.9235 KOps/s 88.7281 KOps/s $\color{#d91a1a}-2.03\%$
test_stacked_getitem 31.5390μs 10.5677μs 94.6281 KOps/s 95.8396 KOps/s $\color{#d91a1a}-1.26\%$
test_lock_nested 52.0958ms 0.3868ms 2.5853 KOps/s 3.0003 KOps/s $\textbf{\color{#d91a1a}-13.83\%}$
test_lock_stack_nested 0.5746ms 0.2991ms 3.3436 KOps/s 3.3191 KOps/s $\color{#35bf28}+0.74\%$
test_unlock_nested 0.7785ms 0.3336ms 2.9972 KOps/s 2.9421 KOps/s $\color{#35bf28}+1.87\%$
test_unlock_stack_nested 0.4795ms 0.3054ms 3.2741 KOps/s 3.2022 KOps/s $\color{#35bf28}+2.25\%$
test_flatten_speed 0.2211ms 0.1004ms 9.9588 KOps/s 9.9914 KOps/s $\color{#d91a1a}-0.33\%$
test_unflatten_speed 0.7384ms 0.4194ms 2.3842 KOps/s 2.4100 KOps/s $\color{#d91a1a}-1.07\%$
test_common_ops 1.3975ms 0.7216ms 1.3859 KOps/s 1.3313 KOps/s $\color{#35bf28}+4.10\%$
test_creation 19.4770μs 1.9099μs 523.5746 KOps/s 514.9879 KOps/s $\color{#35bf28}+1.67\%$
test_creation_empty 35.1660μs 10.8906μs 91.8225 KOps/s 83.3974 KOps/s $\textbf{\color{#35bf28}+10.10\%}$
test_creation_nested_1 40.5760μs 13.5898μs 73.5845 KOps/s 67.1775 KOps/s $\textbf{\color{#35bf28}+9.54\%}$
test_creation_nested_2 71.0330μs 17.2119μs 58.0992 KOps/s 54.8957 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_clone 71.4330μs 13.2412μs 75.5217 KOps/s 76.0746 KOps/s $\color{#d91a1a}-0.73\%$
test_getitem[int] 35.1260μs 11.0271μs 90.6854 KOps/s 89.9607 KOps/s $\color{#35bf28}+0.81\%$
test_getitem[slice_int] 70.1810μs 22.0579μs 45.3353 KOps/s 44.7504 KOps/s $\color{#35bf28}+1.31\%$
test_getitem[range] 74.3690μs 58.5854μs 17.0691 KOps/s 16.7243 KOps/s $\color{#35bf28}+2.06\%$
test_getitem[tuple] 54.8620μs 18.2615μs 54.7599 KOps/s 54.9844 KOps/s $\color{#d91a1a}-0.41\%$
test_getitem[list] 0.1024ms 39.3588μs 25.4073 KOps/s 25.6704 KOps/s $\color{#d91a1a}-1.02\%$
test_setitem_dim[int] 62.3560μs 32.8905μs 30.4039 KOps/s 29.4046 KOps/s $\color{#35bf28}+3.40\%$
test_setitem_dim[slice_int] 99.7960μs 59.0009μs 16.9489 KOps/s 16.8951 KOps/s $\color{#35bf28}+0.32\%$
test_setitem_dim[range] 0.1707ms 82.1899μs 12.1669 KOps/s 11.8298 KOps/s $\color{#35bf28}+2.85\%$
test_setitem_dim[tuple] 0.1000ms 48.5660μs 20.5905 KOps/s 20.1249 KOps/s $\color{#35bf28}+2.31\%$
test_setitem 57.5170μs 19.8832μs 50.2937 KOps/s 48.1999 KOps/s $\color{#35bf28}+4.34\%$
test_set 71.8240μs 19.7216μs 50.7059 KOps/s 49.5931 KOps/s $\color{#35bf28}+2.24\%$
test_set_shared 3.4628ms 0.1429ms 6.9966 KOps/s 6.7196 KOps/s $\color{#35bf28}+4.12\%$
test_update 0.1400ms 22.4098μs 44.6233 KOps/s 42.4553 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_update_nested 0.1338ms 30.7450μs 32.5256 KOps/s 31.2497 KOps/s $\color{#35bf28}+4.08\%$
test_update__nested 0.2606ms 26.4711μs 37.7770 KOps/s 40.1777 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_set_nested 69.2490μs 21.2201μs 47.1251 KOps/s 44.7199 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_set_nested_new 62.1560μs 25.5187μs 39.1869 KOps/s 37.3989 KOps/s $\color{#35bf28}+4.78\%$
test_select 0.1062ms 40.3864μs 24.7608 KOps/s 23.6177 KOps/s $\color{#35bf28}+4.84\%$
test_select_nested 0.1129ms 57.7906μs 17.3038 KOps/s 16.8453 KOps/s $\color{#35bf28}+2.72\%$
test_exclude_nested 0.2177ms 0.1196ms 8.3594 KOps/s 8.3873 KOps/s $\color{#d91a1a}-0.33\%$
test_empty[True] 0.6056ms 0.3960ms 2.5255 KOps/s 2.4829 KOps/s $\color{#35bf28}+1.72\%$
test_empty[False] 26.8220μs 1.0414μs 960.2482 KOps/s 896.2196 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_unbind_speed 0.4550ms 0.2437ms 4.1029 KOps/s 4.0438 KOps/s $\color{#35bf28}+1.46\%$
test_unbind_speed_stack0 0.4989ms 0.2416ms 4.1387 KOps/s 4.0098 KOps/s $\color{#35bf28}+3.22\%$
test_unbind_speed_stack1 68.9810ms 0.6990ms 1.4306 KOps/s 1.4069 KOps/s $\color{#35bf28}+1.69\%$
test_split 69.7036ms 1.5842ms 631.2516 Ops/s 613.3085 Ops/s $\color{#35bf28}+2.93\%$
test_chunk 74.8310ms 1.5958ms 626.6275 Ops/s 633.7496 Ops/s $\color{#d91a1a}-1.12\%$
test_creation[device0] 0.2106ms 84.3371μs 11.8572 KOps/s 11.6566 KOps/s $\color{#35bf28}+1.72\%$
test_creation_from_tensor 4.3601ms 84.8855μs 11.7806 KOps/s 11.5103 KOps/s $\color{#35bf28}+2.35\%$
test_add_one[memmap_tensor0] 90.1590μs 5.3330μs 187.5119 KOps/s 184.1848 KOps/s $\color{#35bf28}+1.81\%$
test_contiguous[memmap_tensor0] 10.5500μs 0.6257μs 1.5982 MOps/s 1.5314 MOps/s $\color{#35bf28}+4.36\%$
test_stack[memmap_tensor0] 23.4440μs 3.5978μs 277.9483 KOps/s 276.7384 KOps/s $\color{#35bf28}+0.44\%$
test_memmaptd_index 1.0193ms 0.2595ms 3.8530 KOps/s 3.9011 KOps/s $\color{#d91a1a}-1.23\%$
test_memmaptd_index_astensor 0.7272ms 0.3323ms 3.0095 KOps/s 3.0511 KOps/s $\color{#d91a1a}-1.36\%$
test_memmaptd_index_op 1.9891ms 0.6541ms 1.5289 KOps/s 1.5705 KOps/s $\color{#d91a1a}-2.65\%$
test_serialize_model 0.1638s 0.1055s 9.4784 Ops/s 9.2376 Ops/s $\color{#35bf28}+2.61\%$
test_serialize_model_pickle 0.4631s 0.3799s 2.6322 Ops/s 2.6253 Ops/s $\color{#35bf28}+0.26\%$
test_serialize_weights 0.1689s 0.1030s 9.7117 Ops/s 9.5478 Ops/s $\color{#35bf28}+1.72\%$
test_serialize_weights_returnearly 0.1286s 0.1194s 8.3770 Ops/s 8.0524 Ops/s $\color{#35bf28}+4.03\%$
test_serialize_weights_pickle 0.9912s 0.5807s 1.7221 Ops/s 2.4465 Ops/s $\textbf{\color{#d91a1a}-29.61\%}$
test_serialize_weights_filesystem 0.1601s 96.4777ms 10.3651 Ops/s 9.7656 Ops/s $\textbf{\color{#35bf28}+6.14\%}$
test_serialize_model_filesystem 0.1022s 93.2461ms 10.7243 Ops/s 10.2944 Ops/s $\color{#35bf28}+4.18\%$
test_reshape_pytree 66.3040μs 25.5080μs 39.2033 KOps/s 39.1330 KOps/s $\color{#35bf28}+0.18\%$
test_reshape_td 91.1710μs 34.6162μs 28.8882 KOps/s 28.2958 KOps/s $\color{#35bf28}+2.09\%$
test_view_pytree 83.1350μs 25.8728μs 38.6506 KOps/s 39.5206 KOps/s $\color{#d91a1a}-2.20\%$
test_view_td 76.1120μs 39.5236μs 25.3014 KOps/s 24.9301 KOps/s $\color{#35bf28}+1.49\%$
test_unbind_pytree 61.3950μs 29.5998μs 33.7840 KOps/s 34.1545 KOps/s $\color{#d91a1a}-1.08\%$
test_unbind_td 0.3614ms 37.1584μs 26.9118 KOps/s 26.9653 KOps/s $\color{#d91a1a}-0.20\%$
test_split_pytree 63.4590μs 30.0277μs 33.3026 KOps/s 34.2618 KOps/s $\color{#d91a1a}-2.80\%$
test_split_td 0.1215ms 39.8362μs 25.1028 KOps/s 24.7639 KOps/s $\color{#35bf28}+1.37\%$
test_add_pytree 98.6140μs 35.1076μs 28.4839 KOps/s 28.4820 KOps/s $+0.01\%$
test_add_td 0.1843ms 53.8412μs 18.5732 KOps/s 17.1532 KOps/s $\textbf{\color{#35bf28}+8.28\%}$
test_distributed 0.2537ms 0.1002ms 9.9756 KOps/s 9.5142 KOps/s $\color{#35bf28}+4.85\%$
test_tdmodule 71.4430μs 18.3485μs 54.5004 KOps/s 53.5865 KOps/s $\color{#35bf28}+1.71\%$
test_tdmodule_dispatch 58.1780μs 36.1451μs 27.6663 KOps/s 26.6742 KOps/s $\color{#35bf28}+3.72\%$
test_tdseq 44.2530μs 21.2688μs 47.0171 KOps/s 45.1011 KOps/s $\color{#35bf28}+4.25\%$
test_tdseq_dispatch 77.7850μs 41.1176μs 24.3205 KOps/s 23.6373 KOps/s $\color{#35bf28}+2.89\%$
test_instantiation_functorch 2.3377ms 1.3526ms 739.2942 Ops/s 738.5745 Ops/s $\color{#35bf28}+0.10\%$
test_instantiation_td 66.9990ms 1.1000ms 909.0920 Ops/s 960.2242 Ops/s $\textbf{\color{#d91a1a}-5.33\%}$
test_exec_functorch 0.2344ms 0.1633ms 6.1255 KOps/s 6.0552 KOps/s $\color{#35bf28}+1.16\%$
test_exec_functional_call 0.2207ms 0.1496ms 6.6833 KOps/s 6.6826 KOps/s $\color{#35bf28}+0.01\%$
test_exec_td 0.2249ms 0.1482ms 6.7487 KOps/s 6.9591 KOps/s $\color{#d91a1a}-3.02\%$
test_exec_td_decorator 0.9084ms 0.2227ms 4.4896 KOps/s 4.4988 KOps/s $\color{#d91a1a}-0.20\%$
test_vmap_mlp_speed[True-True] 0.6921ms 0.4901ms 2.0405 KOps/s 2.0459 KOps/s $\color{#d91a1a}-0.26\%$
test_vmap_mlp_speed[True-False] 0.8897ms 0.4891ms 2.0447 KOps/s 2.0507 KOps/s $\color{#d91a1a}-0.29\%$
test_vmap_mlp_speed[False-True] 0.6170ms 0.3972ms 2.5175 KOps/s 2.5332 KOps/s $\color{#d91a1a}-0.62\%$
test_vmap_mlp_speed[False-False] 0.6951ms 0.3980ms 2.5125 KOps/s 2.5271 KOps/s $\color{#d91a1a}-0.58\%$
test_vmap_mlp_speed_decorator[True-True] 1.0930ms 0.5624ms 1.7781 KOps/s 1.7736 KOps/s $\color{#35bf28}+0.26\%$
test_vmap_mlp_speed_decorator[True-False] 0.7645ms 0.5576ms 1.7935 KOps/s 1.7806 KOps/s $\color{#35bf28}+0.72\%$
test_vmap_mlp_speed_decorator[False-True] 0.7533ms 0.4619ms 2.1651 KOps/s 2.1744 KOps/s $\color{#d91a1a}-0.43\%$
test_vmap_mlp_speed_decorator[False-False] 0.8670ms 0.4626ms 2.1617 KOps/s 2.1655 KOps/s $\color{#d91a1a}-0.18\%$
test_to_module_speed[True] 2.5822ms 1.7208ms 581.1112 Ops/s 587.1179 Ops/s $\color{#d91a1a}-1.02\%$
test_to_module_speed[False] 2.6279ms 1.7060ms 586.1726 Ops/s 600.3701 Ops/s $\color{#d91a1a}-2.36\%$
test_tc_init 0.1287ms 60.0342μs 16.6572 KOps/s 16.6397 KOps/s $\color{#35bf28}+0.11\%$
test_tc_init_nested 0.2336ms 0.1199ms 8.3380 KOps/s 8.8342 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_tc_first_layer_tensor 24.7260μs 8.4054μs 118.9715 KOps/s 120.5181 KOps/s $\color{#d91a1a}-1.28\%$
test_tc_first_layer_nontensor 31.0980μs 8.4087μs 118.9249 KOps/s 121.1239 KOps/s $\color{#d91a1a}-1.82\%$
test_tc_second_layer_tensor 23.0930μs 2.5496μs 392.2120 KOps/s 395.3542 KOps/s $\color{#d91a1a}-0.79\%$
test_tc_second_layer_nontensor 30.0060μs 9.3936μs 106.4549 KOps/s 108.0046 KOps/s $\color{#d91a1a}-1.43\%$
test_unbind 81.3385ms 13.7881ms 72.5266 Ops/s 67.6035 Ops/s $\textbf{\color{#35bf28}+7.28\%}$
test_full_like 8.3844ms 7.1992ms 138.9050 Ops/s 93.1946 Ops/s $\textbf{\color{#35bf28}+49.05\%}$
test_zeros_like 13.8219ms 5.9243ms 168.7954 Ops/s 164.1255 Ops/s $\color{#35bf28}+2.85\%$
test_ones_like 16.0694ms 6.4281ms 155.5674 Ops/s 157.9737 Ops/s $\color{#d91a1a}-1.52\%$
test_clone 14.2716ms 8.0666ms 123.9686 Ops/s 122.5354 Ops/s $\color{#35bf28}+1.17\%$
test_squeeze 0.2761ms 13.8450μs 72.2280 KOps/s 77.9391 KOps/s $\textbf{\color{#d91a1a}-7.33\%}$
test_unsqueeze 0.2504ms 95.5094μs 10.4702 KOps/s 10.1880 KOps/s $\color{#35bf28}+2.77\%$
test_split 0.5166ms 0.2734ms 3.6571 KOps/s 3.6217 KOps/s $\color{#35bf28}+0.98\%$
test_permute 0.3702ms 0.2242ms 4.4603 KOps/s 4.4240 KOps/s $\color{#35bf28}+0.82\%$
test_stack 26.3968ms 22.5324ms 44.3805 Ops/s 41.5287 Ops/s $\textbf{\color{#35bf28}+6.87\%}$
test_cat 29.2164ms 22.3303ms 44.7821 Ops/s 42.9115 Ops/s $\color{#35bf28}+4.36\%$
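
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD; the rows above are consistent with that reading. A minimal sketch, assuming that formula and assuming a 5% cutoff for the bolded entries counted as Improved or Worsened (both are inferred from the table, not taken from the benchmark workflow itself; `percent_change` and `classify` are hypothetical helper names):

```python
def percent_change(ops: float, head_ops: float) -> float:
    """Relative throughput change of the PR against repo HEAD, in percent.

    Both arguments must use the same unit (e.g. both KOps/s); some rows in
    the report mix KOps/s and Ops/s and need converting first.
    """
    return (ops - head_ops) / head_ops * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (58.6692, 58.1698),
    "test_plain_set_nested_inplace": (51.3296, 47.4723),
    "test_serialize_weights_pickle": (1.7221, 2.4465),
}

for name, (ops, head_ops) in rows.items():
    change = percent_change(ops, head_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, only changes of at least 5% in either direction count toward the Improved and Worsened totals, so most rows in the table are within noise.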

github-actions bot commented Jul 5, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}22$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 63.9910μs 12.5910μs 79.4221 KOps/s 82.4656 KOps/s $\color{#d91a1a}-3.69\%$
test_plain_set_stack_nested 26.1810μs 12.6750μs 78.8952 KOps/s 81.6596 KOps/s $\color{#d91a1a}-3.39\%$
test_plain_set_nested_inplace 37.7300μs 14.0143μs 71.3556 KOps/s 74.1271 KOps/s $\color{#d91a1a}-3.74\%$
test_plain_set_stack_nested_inplace 47.1910μs 13.9185μs 71.8468 KOps/s 74.1853 KOps/s $\color{#d91a1a}-3.15\%$
test_items 19.5400μs 4.6530μs 214.9154 KOps/s 215.9325 KOps/s $\color{#d91a1a}-0.47\%$
test_items_nested 0.3940ms 0.3451ms 2.8978 KOps/s 2.9450 KOps/s $\color{#d91a1a}-1.60\%$
test_items_nested_locked 0.4092ms 0.3525ms 2.8368 KOps/s 2.9449 KOps/s $\color{#d91a1a}-3.67\%$
test_items_nested_leaf 0.1025ms 82.7080μs 12.0907 KOps/s 12.1252 KOps/s $\color{#d91a1a}-0.28\%$
test_items_stack_nested 0.4099ms 0.3449ms 2.8995 KOps/s 2.9241 KOps/s $\color{#d91a1a}-0.84\%$
test_items_stack_nested_leaf 0.1048ms 83.8444μs 11.9269 KOps/s 12.0234 KOps/s $\color{#d91a1a}-0.80\%$
test_items_stack_nested_locked 0.3917ms 0.3489ms 2.8664 KOps/s 2.8895 KOps/s $\color{#d91a1a}-0.80\%$
test_keys 30.2010μs 4.3455μs 230.1227 KOps/s 230.7210 KOps/s $\color{#d91a1a}-0.26\%$
test_keys_nested 0.1015ms 68.7072μs 14.5545 KOps/s 14.9311 KOps/s $\color{#d91a1a}-2.52\%$
test_keys_nested_locked 2.3367ms 75.0424μs 13.3258 KOps/s 13.2566 KOps/s $\color{#35bf28}+0.52\%$
test_keys_nested_leaf 81.8620μs 57.5921μs 17.3635 KOps/s 17.2624 KOps/s $\color{#35bf28}+0.59\%$
test_keys_stack_nested 93.5820μs 66.6426μs 15.0054 KOps/s 14.5094 KOps/s $\color{#35bf28}+3.42\%$
test_keys_stack_nested_leaf 86.1520μs 59.1943μs 16.8935 KOps/s 17.3240 KOps/s $\color{#d91a1a}-2.49\%$
test_keys_stack_nested_locked 0.1047ms 73.9630μs 13.5203 KOps/s 13.3741 KOps/s $\color{#35bf28}+1.09\%$
test_values 7.9567μs 1.7974μs 556.3583 KOps/s 550.8175 KOps/s $\color{#35bf28}+1.01\%$
test_values_nested 59.6910μs 35.1817μs 28.4239 KOps/s 28.5698 KOps/s $\color{#d91a1a}-0.51\%$
test_values_nested_locked 65.6520μs 37.2050μs 26.8781 KOps/s 27.3173 KOps/s $\color{#d91a1a}-1.61\%$
test_values_nested_leaf 50.6810μs 31.0936μs 32.1610 KOps/s 32.1442 KOps/s $\color{#35bf28}+0.05\%$
test_values_stack_nested 52.8810μs 35.2901μs 28.3366 KOps/s 27.9001 KOps/s $\color{#35bf28}+1.56\%$
test_values_stack_nested_leaf 58.8520μs 31.2213μs 32.0295 KOps/s 31.0942 KOps/s $\color{#35bf28}+3.01\%$
test_values_stack_nested_locked 59.2210μs 36.8471μs 27.1392 KOps/s 26.9305 KOps/s $\color{#35bf28}+0.78\%$
test_membership 1.6830μs 0.7077μs 1.4130 MOps/s 1.4404 MOps/s $\color{#d91a1a}-1.90\%$
test_membership_nested 17.7700μs 2.5401μs 393.6844 KOps/s 396.0719 KOps/s $\color{#d91a1a}-0.60\%$
test_membership_nested_leaf 28.6500μs 2.5313μs 395.0545 KOps/s 391.8683 KOps/s $\color{#35bf28}+0.81\%$
test_membership_stacked_nested 25.8500μs 2.5447μs 392.9786 KOps/s 395.6302 KOps/s $\color{#d91a1a}-0.67\%$
test_membership_stacked_nested_leaf 33.7110μs 2.5339μs 394.6463 KOps/s 396.4680 KOps/s $\color{#d91a1a}-0.46\%$
test_membership_nested_last 18.3110μs 3.0396μs 328.9891 KOps/s 329.8977 KOps/s $\color{#d91a1a}-0.28\%$
test_membership_nested_leaf_last 32.0110μs 3.0422μs 328.7124 KOps/s 327.8787 KOps/s $\color{#35bf28}+0.25\%$
test_membership_stacked_nested_last 27.2510μs 3.0817μs 324.5015 KOps/s 264.3870 KOps/s $\textbf{\color{#35bf28}+22.74\%}$
test_membership_stacked_nested_leaf_last 21.9710μs 3.0295μs 330.0843 KOps/s 264.1538 KOps/s $\textbf{\color{#35bf28}+24.96\%}$
test_nested_getleaf 37.7610μs 8.3291μs 120.0607 KOps/s 119.9511 KOps/s $\color{#35bf28}+0.09\%$
test_nested_get 31.6110μs 7.7984μs 128.2314 KOps/s 128.2762 KOps/s $\color{#d91a1a}-0.03\%$
test_stacked_getleaf 25.8410μs 8.3154μs 120.2581 KOps/s 119.2676 KOps/s $\color{#35bf28}+0.83\%$
test_stacked_get 40.4110μs 7.8007μs 128.1936 KOps/s 128.2982 KOps/s $\color{#d91a1a}-0.08\%$
test_nested_getitemleaf 24.8910μs 8.4888μs 117.8020 KOps/s 117.8398 KOps/s $\color{#d91a1a}-0.03\%$
test_nested_getitem 35.0010μs 7.9831μs 125.2640 KOps/s 122.8762 KOps/s $\color{#35bf28}+1.94\%$
test_stacked_getitemleaf 35.1610μs 8.4482μs 118.3681 KOps/s 116.9447 KOps/s $\color{#35bf28}+1.22\%$
test_stacked_getitem 88.2120μs 8.0451μs 124.2999 KOps/s 125.1604 KOps/s $\color{#d91a1a}-0.69\%$
test_lock_nested 59.1127ms 0.3970ms 2.5187 KOps/s 2.4350 KOps/s $\color{#35bf28}+3.44\%$
test_lock_stack_nested 0.3392ms 0.2918ms 3.4268 KOps/s 3.2645 KOps/s $\color{#35bf28}+4.97\%$
test_unlock_nested 61.2117ms 0.3983ms 2.5108 KOps/s 2.4473 KOps/s $\color{#35bf28}+2.59\%$
test_unlock_stack_nested 0.3541ms 0.3007ms 3.3257 KOps/s 3.1983 KOps/s $\color{#35bf28}+3.98\%$
test_flatten_speed 0.4185ms 0.1011ms 9.8896 KOps/s 9.8063 KOps/s $\color{#35bf28}+0.85\%$
test_unflatten_speed 0.3286ms 0.2884ms 3.4680 KOps/s 3.4614 KOps/s $\color{#35bf28}+0.19\%$
test_common_ops 1.0769ms 0.5819ms 1.7184 KOps/s 1.7053 KOps/s $\color{#35bf28}+0.77\%$
test_creation 37.8400μs 1.6033μs 623.7050 KOps/s 622.8989 KOps/s $\color{#35bf28}+0.13\%$
test_creation_empty 26.6510μs 8.0037μs 124.9428 KOps/s 135.4387 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_creation_nested_1 24.3510μs 9.6641μs 103.4755 KOps/s 110.0819 KOps/s $\textbf{\color{#d91a1a}-6.00\%}$
test_creation_nested_2 36.7310μs 11.9565μs 83.6364 KOps/s 87.2567 KOps/s $\color{#d91a1a}-4.15\%$
test_clone 85.8820μs 11.7454μs 85.1396 KOps/s 82.9679 KOps/s $\color{#35bf28}+2.62\%$
test_getitem[int] 24.9500μs 10.4967μs 95.2680 KOps/s 90.7962 KOps/s $\color{#35bf28}+4.93\%$
test_getitem[slice_int] 40.0910μs 20.2660μs 49.3436 KOps/s 46.0215 KOps/s $\textbf{\color{#35bf28}+7.22\%}$
test_getitem[range] 65.9110μs 51.5730μs 19.3900 KOps/s 19.8820 KOps/s $\color{#d91a1a}-2.47\%$
test_getitem[tuple] 42.1110μs 18.4509μs 54.1979 KOps/s 51.3237 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_getitem[list] 0.1408ms 33.6312μs 29.7343 KOps/s 27.9918 KOps/s $\textbf{\color{#35bf28}+6.23\%}$
test_setitem_dim[int] 42.9310μs 25.0589μs 39.9059 KOps/s 36.7055 KOps/s $\textbf{\color{#35bf28}+8.72\%}$
test_setitem_dim[slice_int] 83.0010μs 49.4597μs 20.2185 KOps/s 20.2906 KOps/s $\color{#d91a1a}-0.36\%$
test_setitem_dim[range] 0.1085ms 67.2715μs 14.8651 KOps/s 14.8827 KOps/s $\color{#d91a1a}-0.12\%$
test_setitem_dim[tuple] 68.8320μs 42.9492μs 23.2833 KOps/s 23.9208 KOps/s $\color{#d91a1a}-2.66\%$
test_setitem 41.7010μs 16.1810μs 61.8009 KOps/s 60.9318 KOps/s $\color{#35bf28}+1.43\%$
test_set 54.6310μs 15.3862μs 64.9933 KOps/s 63.9561 KOps/s $\color{#35bf28}+1.62\%$
test_set_shared 1.6114ms 98.3614μs 10.1666 KOps/s 9.9638 KOps/s $\color{#35bf28}+2.04\%$
test_update 88.0020μs 18.4172μs 54.2970 KOps/s 55.9382 KOps/s $\color{#d91a1a}-2.93\%$
test_update_nested 73.7620μs 23.9920μs 41.6806 KOps/s 43.8305 KOps/s $\color{#d91a1a}-4.91\%$
test_update__nested 48.7510μs 22.5376μs 44.3703 KOps/s 44.1766 KOps/s $\color{#35bf28}+0.44\%$
test_set_nested 58.7420μs 16.6052μs 60.2221 KOps/s 59.7460 KOps/s $\color{#35bf28}+0.80\%$
test_set_nested_new 55.0610μs 19.2031μs 52.0750 KOps/s 51.8730 KOps/s $\color{#35bf28}+0.39\%$
test_select 75.7120μs 32.7677μs 30.5179 KOps/s 32.2559 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_select_nested 94.3520μs 51.3518μs 19.4735 KOps/s 19.1715 KOps/s $\color{#35bf28}+1.58\%$
test_exclude_nested 0.1400ms 0.1063ms 9.4047 KOps/s 9.4639 KOps/s $\color{#d91a1a}-0.63\%$
test_empty[True] 0.4032ms 0.3414ms 2.9292 KOps/s 2.9336 KOps/s $\color{#d91a1a}-0.15\%$
test_empty[False] 2.8071μs 0.8034μs 1.2447 MOps/s 1.2281 MOps/s $\color{#35bf28}+1.36\%$
test_to 87.0220μs 59.5436μs 16.7944 KOps/s 15.6418 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_to_nonblocking 54.5310μs 35.9932μs 27.7830 KOps/s 26.8919 KOps/s $\color{#35bf28}+3.31\%$
test_unbind_speed 1.5628ms 0.2577ms 3.8803 KOps/s 3.7134 KOps/s $\color{#35bf28}+4.50\%$
test_unbind_speed_stack0 0.3006ms 0.2572ms 3.8883 KOps/s 3.7401 KOps/s $\color{#35bf28}+3.96\%$
test_unbind_speed_stack1 75.9604ms 0.7757ms 1.2892 KOps/s 1.2673 KOps/s $\color{#35bf28}+1.73\%$
test_split 76.2275ms 1.6752ms 596.9602 Ops/s 557.5711 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_chunk 76.4499ms 1.6754ms 596.8772 Ops/s 560.8092 Ops/s $\textbf{\color{#35bf28}+6.43\%}$
test_creation[device0] 0.1512ms 57.6869μs 17.3350 KOps/s 17.1260 KOps/s $\color{#35bf28}+1.22\%$
test_creation_from_tensor 0.1402ms 54.3924μs 18.3849 KOps/s 18.4538 KOps/s $\color{#d91a1a}-0.37\%$
test_add_one[memmap_tensor0] 88.0620μs 7.1799μs 139.2784 KOps/s 131.8245 KOps/s $\textbf{\color{#35bf28}+5.65\%}$
test_contiguous[memmap_tensor0] 9.8000μs 0.6698μs 1.4929 MOps/s 1.4958 MOps/s $\color{#d91a1a}-0.19\%$
test_stack[memmap_tensor0] 37.9210μs 4.8591μs 205.8006 KOps/s 186.7391 KOps/s $\textbf{\color{#35bf28}+10.21\%}$
test_memmaptd_index 1.0702ms 0.2742ms 3.6470 KOps/s 3.4516 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_memmaptd_index_astensor 0.5895ms 0.3346ms 2.9885 KOps/s 2.8729 KOps/s $\color{#35bf28}+4.02\%$
test_memmaptd_index_op 0.9415ms 0.6384ms 1.5665 KOps/s 1.5346 KOps/s $\color{#35bf28}+2.08\%$
test_serialize_model 91.7647ms 89.6288ms 11.1571 Ops/s 10.3041 Ops/s $\textbf{\color{#35bf28}+8.28\%}$
test_serialize_model_pickle 1.3482s 1.2352s 0.8096 Ops/s 0.8088 Ops/s $\color{#35bf28}+0.10\%$
test_serialize_weights 92.8752ms 88.7525ms 11.2673 Ops/s 9.6683 Ops/s $\textbf{\color{#35bf28}+16.54\%}$
test_serialize_weights_returnearly 0.2577s 77.5536ms 12.8943 Ops/s 13.5190 Ops/s $\color{#d91a1a}-4.62\%$
test_serialize_weights_pickle 1.3492s 1.2362s 0.8089 Ops/s 0.8032 Ops/s $\color{#35bf28}+0.71\%$
test_reshape_pytree 87.2320μs 25.8396μs 38.7003 KOps/s 38.1500 KOps/s $\color{#35bf28}+1.44\%$
test_reshape_td 70.0420μs 31.6892μs 31.5564 KOps/s 31.8002 KOps/s $\color{#d91a1a}-0.77\%$
test_view_pytree 48.5620μs 25.9540μs 38.5297 KOps/s 38.0707 KOps/s $\color{#35bf28}+1.21\%$
test_view_td 0.1019ms 35.9725μs 27.7990 KOps/s 27.8874 KOps/s $\color{#d91a1a}-0.32\%$
test_unbind_pytree 53.7210μs 32.0993μs 31.1533 KOps/s 29.1284 KOps/s $\textbf{\color{#35bf28}+6.95\%}$
test_unbind_td 0.4150ms 39.0687μs 25.5960 KOps/s 23.5477 KOps/s $\textbf{\color{#35bf28}+8.70\%}$
test_split_pytree 69.4120μs 35.7561μs 27.9672 KOps/s 27.5663 KOps/s $\color{#35bf28}+1.45\%$
test_split_td 0.1072ms 39.2368μs 25.4862 KOps/s 24.2427 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_add_pytree 63.5820μs 37.8563μs 26.4157 KOps/s 25.1617 KOps/s $\color{#35bf28}+4.98\%$
test_add_td 0.2103ms 50.0275μs 19.9890 KOps/s 19.8271 KOps/s $\color{#35bf28}+0.82\%$
test_distributed 0.2363ms 72.4840μs 13.7962 KOps/s 14.4697 KOps/s $\color{#d91a1a}-4.65\%$
test_tdmodule 0.1140ms 15.7156μs 63.6310 KOps/s 69.2747 KOps/s $\textbf{\color{#d91a1a}-8.15\%}$
test_tdmodule_dispatch 0.1694ms 30.4384μs 32.8532 KOps/s 36.1721 KOps/s $\textbf{\color{#d91a1a}-9.18\%}$
test_tdseq 37.5710μs 16.6290μs 60.1359 KOps/s 63.2547 KOps/s $\color{#d91a1a}-4.93\%$
test_tdseq_dispatch 47.6010μs 31.5890μs 31.6566 KOps/s 32.4594 KOps/s $\color{#d91a1a}-2.47\%$
test_instantiation_functorch 1.4993ms 1.4059ms 711.2989 Ops/s 692.9957 Ops/s $\color{#35bf28}+2.64\%$
test_instantiation_td 1.4740ms 0.9779ms 1.0226 KOps/s 913.7464 Ops/s $\textbf{\color{#35bf28}+11.91\%}$
test_exec_functorch 0.2112ms 0.1467ms 6.8184 KOps/s 6.5408 KOps/s $\color{#35bf28}+4.24\%$
test_exec_functional_call 0.1913ms 0.1370ms 7.2983 KOps/s 6.8573 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_exec_td 0.1703ms 0.1363ms 7.3384 KOps/s 6.8525 KOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_exec_td_decorator 0.7009ms 0.2081ms 4.8055 KOps/s 4.7070 KOps/s $\color{#35bf28}+2.09\%$
test_vmap_mlp_speed[True-True] 0.6471ms 0.5795ms 1.7258 KOps/s 1.7174 KOps/s $\color{#35bf28}+0.49\%$
test_vmap_mlp_speed[True-False] 0.6618ms 0.5788ms 1.7276 KOps/s 1.6606 KOps/s $\color{#35bf28}+4.03\%$
test_vmap_mlp_speed[False-True] 0.5809ms 0.5092ms 1.9637 KOps/s 1.9305 KOps/s $\color{#35bf28}+1.72\%$
test_vmap_mlp_speed[False-False] 0.5717ms 0.5108ms 1.9579 KOps/s 1.9157 KOps/s $\color{#35bf28}+2.20\%$
test_vmap_mlp_speed_decorator[True-True] 0.7964ms 0.6386ms 1.5659 KOps/s 1.5503 KOps/s $\color{#35bf28}+1.00\%$
test_vmap_mlp_speed_decorator[True-False] 0.8239ms 0.6398ms 1.5631 KOps/s 1.5470 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[False-True] 0.8093ms 0.5902ms 1.6944 KOps/s 1.7490 KOps/s $\color{#d91a1a}-3.12\%$
test_vmap_mlp_speed_decorator[False-False] 0.8403ms 0.5810ms 1.7213 KOps/s 1.7562 KOps/s $\color{#d91a1a}-1.99\%$
test_vmap_transformer_speed[True-True] 8.1964ms 7.8179ms 127.9113 Ops/s 127.6381 Ops/s $\color{#35bf28}+0.21\%$
test_vmap_transformer_speed[True-False] 8.7152ms 7.7935ms 128.3117 Ops/s 126.8612 Ops/s $\color{#35bf28}+1.14\%$
test_vmap_transformer_speed[False-True] 8.1202ms 7.7260ms 129.4327 Ops/s 128.7429 Ops/s $\color{#35bf28}+0.54\%$
test_vmap_transformer_speed[False-False] 8.9928ms 7.7324ms 129.3268 Ops/s 128.9040 Ops/s $\color{#35bf28}+0.33\%$
test_vmap_transformer_speed_decorator[True-True] 19.3120ms 18.9401ms 52.7980 Ops/s 53.1432 Ops/s $\color{#d91a1a}-0.65\%$
test_vmap_transformer_speed_decorator[True-False] 19.4524ms 18.9806ms 52.6854 Ops/s 52.8034 Ops/s $\color{#d91a1a}-0.22\%$
test_vmap_transformer_speed_decorator[False-True] 19.3481ms 18.8775ms 52.9732 Ops/s 53.2843 Ops/s $\color{#d91a1a}-0.58\%$
test_vmap_transformer_speed_decorator[False-False] 19.2638ms 18.8418ms 53.0734 Ops/s 53.5599 Ops/s $\color{#d91a1a}-0.91\%$
test_to_module_speed[True] 2.7702ms 1.5176ms 658.9489 Ops/s 673.4655 Ops/s $\color{#d91a1a}-2.16\%$
test_to_module_speed[False] 2.0207ms 1.4998ms 666.7486 Ops/s 676.9939 Ops/s $\color{#d91a1a}-1.51\%$
test_tc_init 0.1838ms 54.0653μs 18.4962 KOps/s 21.2928 KOps/s $\textbf{\color{#d91a1a}-13.13\%}$
test_tc_init_nested 0.2637ms 0.1064ms 9.4004 KOps/s 10.2279 KOps/s $\textbf{\color{#d91a1a}-8.09\%}$
test_tc_first_layer_tensor 0.1159ms 3.7535μs 266.4171 KOps/s 266.4222 KOps/s $-0.00\%$
test_tc_first_layer_nontensor 0.1165ms 3.7644μs 265.6486 KOps/s 267.2334 KOps/s $\color{#d91a1a}-0.59\%$
test_tc_second_layer_tensor 0.1172ms 1.2693μs 787.8125 KOps/s 791.8908 KOps/s $\color{#d91a1a}-0.52\%$
test_tc_second_layer_nontensor 54.1510μs 4.2415μs 235.7665 KOps/s 234.6518 KOps/s $\color{#35bf28}+0.48\%$
test_unbind 0.1141s 13.2507ms 75.4679 Ops/s 65.5827 Ops/s $\textbf{\color{#35bf28}+15.07\%}$
test_full_like 9.7365ms 9.3034ms 107.4881 Ops/s 73.6714 Ops/s $\textbf{\color{#35bf28}+45.90\%}$
test_zeros_like 8.5437ms 7.9824ms 125.2763 Ops/s 125.6859 Ops/s $\color{#d91a1a}-0.33\%$
test_ones_like 8.4956ms 8.0468ms 124.2724 Ops/s 123.4873 Ops/s $\color{#35bf28}+0.64\%$
test_clone 9.7952ms 9.4761ms 105.5287 Ops/s 105.6828 Ops/s $\color{#d91a1a}-0.15\%$
test_squeeze 80.7910μs 10.7706μs 92.8456 KOps/s 94.9560 KOps/s $\color{#d91a1a}-2.22\%$
test_unsqueeze 0.2220ms 87.9864μs 11.3654 KOps/s 11.2724 KOps/s $\color{#35bf28}+0.83\%$
test_split 3.4376ms 3.1278ms 319.7161 Ops/s 320.3583 Ops/s $\color{#d91a1a}-0.20\%$
test_permute 0.2960ms 0.2034ms 4.9155 KOps/s 4.8896 KOps/s $\color{#35bf28}+0.53\%$
test_stack 27.8858ms 26.9838ms 37.0593 Ops/s 36.6527 Ops/s $\color{#35bf28}+1.11\%$
test_cat 26.7883ms 26.6322ms 37.5485 Ops/s 37.1157 Ops/s $\color{#35bf28}+1.17\%$

Labels: bug (Something isn't working), CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
2 participants