[Versioning] v0.5 bump #848

Merged Jul 3, 2024 (5 commits)

vmoens (Contributor) commented Jul 3, 2024

No description provided.

facebook-github-bot added the CLA Signed label Jul 3, 2024

github-actions bot commented Jul 3, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}4$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 40.7760μs 17.4868μs 57.1858 KOps/s 57.8973 KOps/s $\color{#d91a1a}-1.23\%$
test_plain_set_stack_nested 45.1050μs 17.7227μs 56.4247 KOps/s 57.3862 KOps/s $\color{#d91a1a}-1.68\%$
test_plain_set_nested_inplace 99.5120μs 19.9333μs 50.1673 KOps/s 51.1641 KOps/s $\color{#d91a1a}-1.95\%$
test_plain_set_stack_nested_inplace 65.4030μs 19.8907μs 50.2747 KOps/s 51.1139 KOps/s $\color{#d91a1a}-1.64\%$
test_items 24.6870μs 2.5795μs 387.6662 KOps/s 385.3614 KOps/s $\color{#35bf28}+0.60\%$
test_items_nested 0.9640ms 0.2815ms 3.5523 KOps/s 3.6103 KOps/s $\color{#d91a1a}-1.61\%$
test_items_nested_locked 0.4519ms 0.2867ms 3.4882 KOps/s 3.5821 KOps/s $\color{#d91a1a}-2.62\%$
test_items_nested_leaf 0.1601ms 78.5900μs 12.7243 KOps/s 12.7210 KOps/s $\color{#35bf28}+0.03\%$
test_items_stack_nested 0.4286ms 0.2782ms 3.5943 KOps/s 3.5754 KOps/s $\color{#35bf28}+0.53\%$
test_items_stack_nested_leaf 0.1567ms 78.7905μs 12.6919 KOps/s 12.9009 KOps/s $\color{#d91a1a}-1.62\%$
test_items_stack_nested_locked 0.6041ms 0.2774ms 3.6046 KOps/s 3.5510 KOps/s $\color{#35bf28}+1.51\%$
test_keys 21.3500μs 3.8369μs 260.6299 KOps/s 262.0480 KOps/s $\color{#d91a1a}-0.54\%$
test_keys_nested 0.2325ms 0.1404ms 7.1202 KOps/s 7.2386 KOps/s $\color{#d91a1a}-1.64\%$
test_keys_nested_locked 1.8592ms 0.1464ms 6.8319 KOps/s 6.9454 KOps/s $\color{#d91a1a}-1.63\%$
test_keys_nested_leaf 0.2058ms 0.1196ms 8.3579 KOps/s 8.3646 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_stack_nested 0.2295ms 0.1391ms 7.1876 KOps/s 7.2351 KOps/s $\color{#d91a1a}-0.66\%$
test_keys_stack_nested_leaf 0.2109ms 0.1179ms 8.4806 KOps/s 8.6476 KOps/s $\color{#d91a1a}-1.93\%$
test_keys_stack_nested_locked 0.2206ms 0.1440ms 6.9438 KOps/s 7.0991 KOps/s $\color{#d91a1a}-2.19\%$
test_values 27.3537μs 1.1576μs 863.8714 KOps/s 842.3310 KOps/s $\color{#35bf28}+2.56\%$
test_values_nested 2.5589ms 50.6972μs 19.7250 KOps/s 19.5329 KOps/s $\color{#35bf28}+0.98\%$
test_values_nested_locked 0.1048ms 50.9662μs 19.6208 KOps/s 19.4686 KOps/s $\color{#35bf28}+0.78\%$
test_values_nested_leaf 2.4641ms 46.2428μs 21.6250 KOps/s 21.4526 KOps/s $\color{#35bf28}+0.80\%$
test_values_stack_nested 0.1078ms 52.1426μs 19.1782 KOps/s 19.0984 KOps/s $\color{#35bf28}+0.42\%$
test_values_stack_nested_leaf 0.2338ms 45.4371μs 22.0084 KOps/s 21.6832 KOps/s $\color{#35bf28}+1.50\%$
test_values_stack_nested_locked 1.9693ms 51.6731μs 19.3524 KOps/s 19.0050 KOps/s $\color{#35bf28}+1.83\%$
test_membership 15.8500μs 1.3723μs 728.7015 KOps/s 740.0227 KOps/s $\color{#d91a1a}-1.53\%$
test_membership_nested 28.9740μs 3.4086μs 293.3782 KOps/s 282.0260 KOps/s $\color{#35bf28}+4.03\%$
test_membership_nested_leaf 46.6180μs 3.4333μs 291.2632 KOps/s 275.5114 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_membership_stacked_nested 33.5930μs 3.4064μs 293.5627 KOps/s 260.9690 KOps/s $\textbf{\color{#35bf28}+12.49\%}$
test_membership_stacked_nested_leaf 19.4170μs 3.4557μs 289.3805 KOps/s 279.5771 KOps/s $\color{#35bf28}+3.51\%$
test_membership_nested_last 69.5450μs 4.2536μs 235.0957 KOps/s 235.0988 KOps/s $-0.00\%$
test_membership_nested_leaf_last 35.3270μs 4.2461μs 235.5090 KOps/s 234.0085 KOps/s $\color{#35bf28}+0.64\%$
test_membership_stacked_nested_last 39.5750μs 8.2282μs 121.5331 KOps/s 74.8771 KOps/s $\textbf{\color{#35bf28}+62.31\%}$
test_membership_stacked_nested_leaf_last 42.7710μs 8.2718μs 120.8924 KOps/s 75.6067 KOps/s $\textbf{\color{#35bf28}+59.90\%}$
test_nested_getleaf 34.9660μs 10.8258μs 92.3719 KOps/s 93.1257 KOps/s $\color{#d91a1a}-0.81\%$
test_nested_get 46.1870μs 10.1886μs 98.1493 KOps/s 98.4989 KOps/s $\color{#d91a1a}-0.35\%$
test_stacked_getleaf 74.5650μs 10.7477μs 93.0429 KOps/s 94.4797 KOps/s $\color{#d91a1a}-1.52\%$
test_stacked_get 29.2850μs 10.1022μs 98.9887 KOps/s 100.6955 KOps/s $\color{#d91a1a}-1.70\%$
test_nested_getitemleaf 74.6750μs 11.2639μs 88.7791 KOps/s 88.2836 KOps/s $\color{#35bf28}+0.56\%$
test_nested_getitem 47.2790μs 10.5352μs 94.9196 KOps/s 96.4686 KOps/s $\color{#d91a1a}-1.61\%$
test_stacked_getitemleaf 43.6420μs 11.3266μs 88.2876 KOps/s 88.9724 KOps/s $\color{#d91a1a}-0.77\%$
test_stacked_getitem 60.1080μs 10.4765μs 95.4520 KOps/s 98.3175 KOps/s $\color{#d91a1a}-2.91\%$
test_lock_nested 1.2242ms 0.3392ms 2.9483 KOps/s 2.9522 KOps/s $\color{#d91a1a}-0.13\%$
test_lock_stack_nested 0.4914ms 0.2997ms 3.3367 KOps/s 3.4053 KOps/s $\color{#d91a1a}-2.02\%$
test_unlock_nested 0.9135ms 0.3480ms 2.8734 KOps/s 2.9184 KOps/s $\color{#d91a1a}-1.54\%$
test_unlock_stack_nested 0.4490ms 0.3068ms 3.2595 KOps/s 3.3084 KOps/s $\color{#d91a1a}-1.48\%$
test_flatten_speed 0.6322ms 99.0028μs 10.1007 KOps/s 10.1223 KOps/s $\color{#d91a1a}-0.21\%$
test_unflatten_speed 0.8618ms 0.4074ms 2.4547 KOps/s 2.4295 KOps/s $\color{#35bf28}+1.04\%$
test_common_ops 3.5620ms 0.7606ms 1.3147 KOps/s 1.3374 KOps/s $\color{#d91a1a}-1.69\%$
test_creation 19.1760μs 1.8810μs 531.6339 KOps/s 506.1085 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_creation_empty 40.8270μs 12.1789μs 82.1093 KOps/s 86.9979 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_creation_nested_1 91.0850μs 14.6746μs 68.1449 KOps/s 69.8102 KOps/s $\color{#d91a1a}-2.39\%$
test_creation_nested_2 55.9950μs 18.2318μs 54.8491 KOps/s 56.7860 KOps/s $\color{#d91a1a}-3.41\%$
test_clone 0.1724ms 13.0650μs 76.5405 KOps/s 75.2390 KOps/s $\color{#35bf28}+1.73\%$
test_getitem[int] 34.7250μs 11.0654μs 90.3722 KOps/s 91.0987 KOps/s $\color{#d91a1a}-0.80\%$
test_getitem[slice_int] 85.3000μs 21.8764μs 45.7114 KOps/s 44.5849 KOps/s $\color{#35bf28}+2.53\%$
test_getitem[range] 81.0120μs 57.9430μs 17.2583 KOps/s 15.3998 KOps/s $\textbf{\color{#35bf28}+12.07\%}$
test_getitem[tuple] 59.0410μs 18.5556μs 53.8922 KOps/s 53.2487 KOps/s $\color{#35bf28}+1.21\%$
test_getitem[list] 0.1286ms 39.9483μs 25.0323 KOps/s 24.7814 KOps/s $\color{#35bf28}+1.01\%$
test_setitem_dim[int] 78.5770μs 36.4897μs 27.4050 KOps/s 27.3235 KOps/s $\color{#35bf28}+0.30\%$
test_setitem_dim[slice_int] 0.1025ms 64.1563μs 15.5869 KOps/s 15.8240 KOps/s $\color{#d91a1a}-1.50\%$
test_setitem_dim[range] 0.1513ms 86.8310μs 11.5166 KOps/s 11.4741 KOps/s $\color{#35bf28}+0.37\%$
test_setitem_dim[tuple] 0.1093ms 52.2736μs 19.1301 KOps/s 19.2658 KOps/s $\color{#d91a1a}-0.70\%$
test_setitem 72.2860μs 21.1421μs 47.2991 KOps/s 48.6731 KOps/s $\color{#d91a1a}-2.82\%$
test_set 74.5990μs 20.3123μs 49.2313 KOps/s 50.3743 KOps/s $\color{#d91a1a}-2.27\%$
test_set_shared 1.5745ms 0.1457ms 6.8641 KOps/s 6.6762 KOps/s $\color{#35bf28}+2.82\%$
test_update 0.2048ms 23.5159μs 42.5245 KOps/s 43.0131 KOps/s $\color{#d91a1a}-1.14\%$
test_update_nested 93.0940μs 32.4709μs 30.7968 KOps/s 30.9741 KOps/s $\color{#d91a1a}-0.57\%$
test_update__nested 70.4120μs 24.6599μs 40.5517 KOps/s 40.0930 KOps/s $\color{#35bf28}+1.14\%$
test_set_nested 65.8730μs 22.2295μs 44.9853 KOps/s 46.2049 KOps/s $\color{#d91a1a}-2.64\%$
test_set_nested_new 73.6280μs 26.0836μs 38.3382 KOps/s 38.2786 KOps/s $\color{#35bf28}+0.16\%$
test_select 0.1504ms 40.8667μs 24.4698 KOps/s 24.8416 KOps/s $\color{#d91a1a}-1.50\%$
test_select_nested 0.8833ms 56.8178μs 17.6001 KOps/s 17.3229 KOps/s $\color{#35bf28}+1.60\%$
test_exclude_nested 0.1922ms 0.1188ms 8.4169 KOps/s 8.4109 KOps/s $\color{#35bf28}+0.07\%$
test_empty[True] 0.6578ms 0.3941ms 2.5374 KOps/s 2.5379 KOps/s $\color{#d91a1a}-0.02\%$
test_empty[False] 7.9348μs 1.0191μs 981.2319 KOps/s 970.4186 KOps/s $\color{#35bf28}+1.11\%$
test_unbind_speed 0.3295ms 0.2502ms 3.9973 KOps/s 4.0373 KOps/s $\color{#d91a1a}-0.99\%$
test_unbind_speed_stack0 0.5351ms 0.2430ms 4.1150 KOps/s 4.2160 KOps/s $\color{#d91a1a}-2.39\%$
test_unbind_speed_stack1 86.5440ms 0.7361ms 1.3585 KOps/s 1.4048 KOps/s $\color{#d91a1a}-3.30\%$
test_split 78.9387ms 1.6003ms 624.8855 Ops/s 611.2571 Ops/s $\color{#35bf28}+2.23\%$
test_chunk 78.3521ms 1.6003ms 624.8735 Ops/s 610.0224 Ops/s $\color{#35bf28}+2.43\%$
test_creation[device0] 0.2498ms 85.9032μs 11.6410 KOps/s 11.4880 KOps/s $\color{#35bf28}+1.33\%$
test_creation_from_tensor 5.4880ms 87.4676μs 11.4328 KOps/s 11.2440 KOps/s $\color{#35bf28}+1.68\%$
test_add_one[memmap_tensor0] 0.1103ms 5.3285μs 187.6698 KOps/s 189.1491 KOps/s $\color{#d91a1a}-0.78\%$
test_contiguous[memmap_tensor0] 23.2030μs 0.6343μs 1.5764 MOps/s 1.5828 MOps/s $\color{#d91a1a}-0.40\%$
test_stack[memmap_tensor0] 37.0990μs 3.6122μs 276.8398 KOps/s 282.8951 KOps/s $\color{#d91a1a}-2.14\%$
test_memmaptd_index 1.1085ms 0.2554ms 3.9157 KOps/s 3.9703 KOps/s $\color{#d91a1a}-1.38\%$
test_memmaptd_index_astensor 1.1263ms 0.3304ms 3.0267 KOps/s 3.0800 KOps/s $\color{#d91a1a}-1.73\%$
test_memmaptd_index_op 1.6276ms 0.6477ms 1.5440 KOps/s 1.6000 KOps/s $\color{#d91a1a}-3.50\%$
test_serialize_model 0.1799s 0.1129s 8.8573 Ops/s 8.6839 Ops/s $\color{#35bf28}+2.00\%$
test_serialize_model_pickle 0.4503s 0.3710s 2.6952 Ops/s 2.6356 Ops/s $\color{#35bf28}+2.26\%$
test_serialize_weights 0.1124s 98.6766ms 10.1341 Ops/s 9.6964 Ops/s $\color{#35bf28}+4.51\%$
test_serialize_weights_returnearly 0.1981s 0.1312s 7.6202 Ops/s 7.2468 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_serialize_weights_pickle 1.0086s 0.5858s 1.7069 Ops/s 2.4460 Ops/s $\textbf{\color{#d91a1a}-30.22\%}$
test_serialize_weights_filesystem 0.1034s 94.6388ms 10.5665 Ops/s 10.2539 Ops/s $\color{#35bf28}+3.05\%$
test_serialize_model_filesystem 0.1003s 96.9828ms 10.3111 Ops/s 8.9950 Ops/s $\textbf{\color{#35bf28}+14.63\%}$
test_reshape_pytree 78.1960μs 25.7712μs 38.8030 KOps/s 38.4722 KOps/s $\color{#35bf28}+0.86\%$
test_reshape_td 0.1012ms 33.8186μs 29.5695 KOps/s 29.3038 KOps/s $\color{#35bf28}+0.91\%$
test_view_pytree 73.6780μs 25.8885μs 38.6272 KOps/s 38.6501 KOps/s $\color{#d91a1a}-0.06\%$
test_view_td 0.1142ms 38.7183μs 25.8276 KOps/s 25.5712 KOps/s $\color{#35bf28}+1.00\%$
test_unbind_pytree 76.5730μs 29.2512μs 34.1867 KOps/s 33.2151 KOps/s $\color{#35bf28}+2.92\%$
test_unbind_td 0.4367ms 36.3266μs 27.5281 KOps/s 27.7320 KOps/s $\color{#d91a1a}-0.74\%$
test_split_pytree 91.8520μs 29.4056μs 34.0071 KOps/s 33.6073 KOps/s $\color{#35bf28}+1.19\%$
test_split_td 0.6109ms 39.4117μs 25.3732 KOps/s 24.8884 KOps/s $\color{#35bf28}+1.95\%$
test_add_pytree 87.9450μs 34.8575μs 28.6883 KOps/s 28.6922 KOps/s $\color{#d91a1a}-0.01\%$
test_add_td 0.1726ms 56.9459μs 17.5605 KOps/s 17.5206 KOps/s $\color{#35bf28}+0.23\%$
test_distributed 0.2699ms 0.1069ms 9.3569 KOps/s 9.2915 KOps/s $\color{#35bf28}+0.70\%$
test_tdmodule 47.4690μs 18.9716μs 52.7104 KOps/s 48.5122 KOps/s $\textbf{\color{#35bf28}+8.65\%}$
test_tdmodule_dispatch 72.6960μs 37.4247μs 26.7204 KOps/s 24.5255 KOps/s $\textbf{\color{#35bf28}+8.95\%}$
test_tdseq 52.1780μs 22.6642μs 44.1225 KOps/s 45.0475 KOps/s $\color{#d91a1a}-2.05\%$
test_tdseq_dispatch 90.0290μs 43.8178μs 22.8218 KOps/s 23.8342 KOps/s $\color{#d91a1a}-4.25\%$
test_instantiation_functorch 1.6098ms 1.3348ms 749.1998 Ops/s 757.4950 Ops/s $\color{#d91a1a}-1.10\%$
test_instantiation_td 1.7387ms 1.0337ms 967.3963 Ops/s 967.2824 Ops/s $\color{#35bf28}+0.01\%$
test_exec_functorch 0.3043ms 0.1665ms 6.0078 KOps/s 6.0157 KOps/s $\color{#d91a1a}-0.13\%$
test_exec_functional_call 0.3026ms 0.1489ms 6.7168 KOps/s 6.4838 KOps/s $\color{#35bf28}+3.59\%$
test_exec_td 0.2873ms 0.1451ms 6.8933 KOps/s 6.5479 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_exec_td_decorator 1.1029ms 0.2219ms 4.5074 KOps/s 4.3895 KOps/s $\color{#35bf28}+2.69\%$
test_vmap_mlp_speed[True-True] 0.7050ms 0.5005ms 1.9981 KOps/s 2.0031 KOps/s $\color{#d91a1a}-0.25\%$
test_vmap_mlp_speed[True-False] 0.7919ms 0.5011ms 1.9956 KOps/s 2.0123 KOps/s $\color{#d91a1a}-0.83\%$
test_vmap_mlp_speed[False-True] 0.6846ms 0.3972ms 2.5177 KOps/s 2.4878 KOps/s $\color{#35bf28}+1.20\%$
test_vmap_mlp_speed[False-False] 0.5172ms 0.3985ms 2.5094 KOps/s 2.4707 KOps/s $\color{#35bf28}+1.56\%$
test_vmap_mlp_speed_decorator[True-True] 1.0616ms 0.5730ms 1.7453 KOps/s 1.7203 KOps/s $\color{#35bf28}+1.45\%$
test_vmap_mlp_speed_decorator[True-False] 1.0254ms 0.5726ms 1.7466 KOps/s 1.7430 KOps/s $\color{#35bf28}+0.20\%$
test_vmap_mlp_speed_decorator[False-True] 0.8002ms 0.4705ms 2.1254 KOps/s 2.1356 KOps/s $\color{#d91a1a}-0.48\%$
test_vmap_mlp_speed_decorator[False-False] 0.7161ms 0.4706ms 2.1249 KOps/s 2.1346 KOps/s $\color{#d91a1a}-0.46\%$
test_to_module_speed[True] 1.9471ms 1.6775ms 596.1082 Ops/s 586.3354 Ops/s $\color{#35bf28}+1.67\%$
test_to_module_speed[False] 1.8126ms 1.6603ms 602.2960 Ops/s 603.7024 Ops/s $\color{#d91a1a}-0.23\%$
test_tc_init 87.8250μs 32.2142μs 31.0422 KOps/s 32.3054 KOps/s $\color{#d91a1a}-3.91\%$
test_tc_init_nested 0.1598ms 68.7514μs 14.5452 KOps/s 15.7469 KOps/s $\textbf{\color{#d91a1a}-7.63\%}$
test_tc_first_layer_tensor 4.4741μs 0.7142μs 1.4002 MOps/s 1.1710 MOps/s $\textbf{\color{#35bf28}+19.57\%}$
test_tc_first_layer_nontensor 2.5713μs 0.7030μs 1.4225 MOps/s 1.3419 MOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_tc_second_layer_tensor 32.4310μs 1.8902μs 529.0328 KOps/s 521.8837 KOps/s $\color{#35bf28}+1.37\%$
test_tc_second_layer_nontensor 23.6740μs 1.6750μs 597.0031 KOps/s 585.2281 KOps/s $\color{#35bf28}+2.01\%$
test_unbind 90.9004ms 8.8334ms 113.2064 Ops/s 144.9805 Ops/s $\textbf{\color{#d91a1a}-21.92\%}$
test_full_like 20.0914ms 11.9612ms 83.6038 Ops/s 83.3634 Ops/s $\color{#35bf28}+0.29\%$
test_zeros_like 13.5675ms 6.2550ms 159.8719 Ops/s 167.3888 Ops/s $\color{#d91a1a}-4.49\%$
test_ones_like 13.4643ms 6.6507ms 150.3610 Ops/s 156.4220 Ops/s $\color{#d91a1a}-3.87\%$
test_clone 17.2310ms 8.9060ms 112.2841 Ops/s 117.2940 Ops/s $\color{#d91a1a}-4.27\%$
test_squeeze 0.1070ms 14.0200μs 71.3267 KOps/s 71.0909 KOps/s $\color{#35bf28}+0.33\%$
test_unsqueeze 0.1285ms 61.9620μs 16.1389 KOps/s 16.2873 KOps/s $\color{#d91a1a}-0.91\%$
test_split 0.2264ms 0.1149ms 8.7048 KOps/s 8.7069 KOps/s $\color{#d91a1a}-0.02\%$
test_permute 0.2556ms 0.1265ms 7.9057 KOps/s 7.6279 KOps/s $\color{#35bf28}+3.64\%$
test_stack 31.3487ms 25.0942ms 39.8499 Ops/s 40.8156 Ops/s $\color{#d91a1a}-2.37\%$
test_cat 31.1227ms 23.8959ms 41.8481 Ops/s 40.6978 Ops/s $\color{#35bf28}+2.83\%$
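
A minimal Python sketch of how the Change column appears to relate to the two Ops columns, using three rows from the CPU table above. The formula (relative difference against Repo HEAD) and the 5% cutoff for bold entries are inferred from the reported numbers, not taken from the benchmark workflow itself; `relative_change` and `classify` are hypothetical helper names.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to Repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "within threshold"


# (name, Ops, Ops on Repo HEAD), copied from the CPU table; units cancel out.
rows = [
    ("test_plain_set_nested", 57.1858, 57.8973),
    ("test_membership_stacked_nested_last", 121.5331, 74.8771),
    ("test_serialize_weights_pickle", 1.7069, 2.4460),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

With a 5% cutoff, the labels line up with the rows shown in bold, which is consistent with the 13 improved and 4 worsened entries in the CPU summary.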


github-actions bot commented Jul 3, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}30$. Worsened: $\large\color{#d91a1a}4$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_plain_set_nested 0.1055ms 12.0663μs 82.8754 KOps/s 76.7770 KOps/s $\textbf{\color{#35bf28}+7.94\%}$
test_plain_set_stack_nested 28.3010μs 12.4427μs 80.3686 KOps/s 75.8202 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_plain_set_nested_inplace 41.0110μs 13.5079μs 74.0308 KOps/s 68.7829 KOps/s $\textbf{\color{#35bf28}+7.63\%}$
test_plain_set_stack_nested_inplace 42.3210μs 13.5985μs 73.5377 KOps/s 68.6690 KOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_items 22.0800μs 4.5817μs 218.2578 KOps/s 217.6451 KOps/s $\color{#35bf28}+0.28\%$
test_items_nested 0.3676ms 0.3391ms 2.9488 KOps/s 2.9287 KOps/s $\color{#35bf28}+0.69\%$
test_items_nested_locked 0.3732ms 0.3411ms 2.9314 KOps/s 2.9228 KOps/s $\color{#35bf28}+0.29\%$
test_items_nested_leaf 0.1074ms 82.1650μs 12.1706 KOps/s 12.1390 KOps/s $\color{#35bf28}+0.26\%$
test_items_stack_nested 0.4042ms 0.3451ms 2.8980 KOps/s 2.9597 KOps/s $\color{#d91a1a}-2.08\%$
test_items_stack_nested_leaf 0.1095ms 83.8665μs 11.9237 KOps/s 11.8416 KOps/s $\color{#35bf28}+0.69\%$
test_items_stack_nested_locked 0.3802ms 0.3477ms 2.8758 KOps/s 2.9292 KOps/s $\color{#d91a1a}-1.82\%$
test_keys 25.1920μs 4.3580μs 229.4631 KOps/s 229.2423 KOps/s $\color{#35bf28}+0.10\%$
test_keys_nested 91.8620μs 67.8122μs 14.7466 KOps/s 14.3231 KOps/s $\color{#35bf28}+2.96\%$
test_keys_nested_locked 2.5432ms 75.3214μs 13.2764 KOps/s 13.1936 KOps/s $\color{#35bf28}+0.63\%$
test_keys_nested_leaf 95.2120μs 59.5465μs 16.7936 KOps/s 16.5381 KOps/s $\color{#35bf28}+1.54\%$
test_keys_stack_nested 96.9520μs 69.0875μs 14.4744 KOps/s 14.3476 KOps/s $\color{#35bf28}+0.88\%$
test_keys_stack_nested_leaf 84.7920μs 58.7777μs 17.0132 KOps/s 17.0140 KOps/s $-0.00\%$
test_keys_stack_nested_locked 0.1136ms 74.7654μs 13.3752 KOps/s 13.2579 KOps/s $\color{#35bf28}+0.88\%$
test_values 8.6105μs 1.8382μs 544.0121 KOps/s 549.1308 KOps/s $\color{#d91a1a}-0.93\%$
test_values_nested 64.5110μs 34.8973μs 28.6555 KOps/s 28.3594 KOps/s $\color{#35bf28}+1.04\%$
test_values_nested_locked 61.1010μs 37.2624μs 26.8367 KOps/s 26.8405 KOps/s $\color{#d91a1a}-0.01\%$
test_values_nested_leaf 59.8120μs 31.7377μs 31.5083 KOps/s 31.7239 KOps/s $\color{#d91a1a}-0.68\%$
test_values_stack_nested 68.8710μs 35.3198μs 28.3127 KOps/s 28.1316 KOps/s $\color{#35bf28}+0.64\%$
test_values_stack_nested_leaf 82.1620μs 31.6122μs 31.6334 KOps/s 31.5234 KOps/s $\color{#35bf28}+0.35\%$
test_values_stack_nested_locked 65.1910μs 37.3792μs 26.7528 KOps/s 26.7347 KOps/s $\color{#35bf28}+0.07\%$
test_membership 4.1114μs 0.7225μs 1.3841 MOps/s 1.3519 MOps/s $\color{#35bf28}+2.38\%$
test_membership_nested 32.3200μs 2.5619μs 390.3391 KOps/s 381.4351 KOps/s $\color{#35bf28}+2.33\%$
test_membership_nested_leaf 21.6700μs 2.5648μs 389.8985 KOps/s 383.4508 KOps/s $\color{#35bf28}+1.68\%$
test_membership_stacked_nested 36.7410μs 2.5595μs 390.7015 KOps/s 382.2508 KOps/s $\color{#35bf28}+2.21\%$
test_membership_stacked_nested_leaf 20.4610μs 2.5373μs 394.1252 KOps/s 382.6106 KOps/s $\color{#35bf28}+3.01\%$
test_membership_nested_last 18.7100μs 3.0502μs 327.8518 KOps/s 321.4979 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested_leaf_last 36.2520μs 3.0568μs 327.1425 KOps/s 318.7654 KOps/s $\color{#35bf28}+2.63\%$
test_membership_stacked_nested_last 39.9510μs 3.0426μs 328.6636 KOps/s 280.9871 KOps/s $\textbf{\color{#35bf28}+16.97\%}$
test_membership_stacked_nested_leaf_last 36.1810μs 3.0486μs 328.0206 KOps/s 281.3089 KOps/s $\textbf{\color{#35bf28}+16.61\%}$
test_nested_getleaf 62.7610μs 8.3574μs 119.6544 KOps/s 119.5464 KOps/s $\color{#35bf28}+0.09\%$
test_nested_get 42.1710μs 7.8215μs 127.8530 KOps/s 127.9343 KOps/s $\color{#d91a1a}-0.06\%$
test_stacked_getleaf 39.5010μs 8.3644μs 119.5537 KOps/s 119.7245 KOps/s $\color{#d91a1a}-0.14\%$
test_stacked_get 22.6400μs 7.8201μs 127.8763 KOps/s 127.3188 KOps/s $\color{#35bf28}+0.44\%$
test_nested_getitemleaf 37.8610μs 8.5564μs 116.8721 KOps/s 117.0174 KOps/s $\color{#d91a1a}-0.12\%$
test_nested_getitem 32.3600μs 7.9976μs 125.0382 KOps/s 125.2520 KOps/s $\color{#d91a1a}-0.17\%$
test_stacked_getitemleaf 24.4300μs 8.5064μs 117.5587 KOps/s 117.6613 KOps/s $\color{#d91a1a}-0.09\%$
test_stacked_getitem 40.5110μs 8.0590μs 124.0847 KOps/s 124.7451 KOps/s $\color{#d91a1a}-0.53\%$
test_lock_nested 58.9489ms 0.4002ms 2.4989 KOps/s 2.5115 KOps/s $\color{#d91a1a}-0.51\%$
test_lock_stack_nested 0.3644ms 0.2952ms 3.3873 KOps/s 3.3695 KOps/s $\color{#35bf28}+0.53\%$
test_unlock_nested 61.2463ms 0.4022ms 2.4866 KOps/s 2.4878 KOps/s $\color{#d91a1a}-0.05\%$
test_unlock_stack_nested 0.3489ms 0.3045ms 3.2844 KOps/s 3.2833 KOps/s $\color{#35bf28}+0.04\%$
test_flatten_speed 0.4058ms 0.1028ms 9.7303 KOps/s 9.8444 KOps/s $\color{#d91a1a}-1.16\%$
test_unflatten_speed 0.3448ms 0.2900ms 3.4488 KOps/s 3.4129 KOps/s $\color{#35bf28}+1.05\%$
test_common_ops 1.0447ms 0.5671ms 1.7633 KOps/s 1.6299 KOps/s $\textbf{\color{#35bf28}+8.19\%}$
test_creation 37.1200μs 1.5871μs 630.0866 KOps/s 634.4693 KOps/s $\color{#d91a1a}-0.69\%$
test_creation_empty 23.7400μs 7.3922μs 135.2775 KOps/s 108.2224 KOps/s $\textbf{\color{#35bf28}+25.00\%}$
test_creation_nested_1 33.2600μs 9.1091μs 109.7802 KOps/s 90.7365 KOps/s $\textbf{\color{#35bf28}+20.99\%}$
test_creation_nested_2 47.7910μs 11.2304μs 89.0440 KOps/s 74.1106 KOps/s $\textbf{\color{#35bf28}+20.15\%}$
test_clone 92.1620μs 11.0934μs 90.1435 KOps/s 81.8758 KOps/s $\textbf{\color{#35bf28}+10.10\%}$
test_getitem[int] 28.3700μs 10.5567μs 94.7269 KOps/s 89.2273 KOps/s $\textbf{\color{#35bf28}+6.16\%}$
test_getitem[slice_int] 41.6310μs 21.5126μs 46.4844 KOps/s 46.5775 KOps/s $\color{#d91a1a}-0.20\%$
test_getitem[range] 66.8410μs 48.7870μs 20.4973 KOps/s 20.2860 KOps/s $\color{#35bf28}+1.04\%$
test_getitem[tuple] 40.0510μs 18.8178μs 53.1412 KOps/s 52.7468 KOps/s $\color{#35bf28}+0.75\%$
test_getitem[list] 0.1453ms 33.9929μs 29.4179 KOps/s 29.3809 KOps/s $\color{#35bf28}+0.13\%$
test_setitem_dim[int] 73.9010μs 28.0796μs 35.6130 KOps/s 32.8504 KOps/s $\textbf{\color{#35bf28}+8.41\%}$
test_setitem_dim[slice_int] 77.4120μs 50.8802μs 19.6540 KOps/s 18.9874 KOps/s $\color{#35bf28}+3.51\%$
test_setitem_dim[range] 99.3720μs 71.0214μs 14.0803 KOps/s 14.0088 KOps/s $\color{#35bf28}+0.51\%$
test_setitem_dim[tuple] 66.2910μs 44.3735μs 22.5360 KOps/s 21.5272 KOps/s $\color{#35bf28}+4.69\%$
test_setitem 45.1610μs 15.4936μs 64.5428 KOps/s 55.0928 KOps/s $\textbf{\color{#35bf28}+17.15\%}$
test_set 53.8010μs 14.5126μs 68.9057 KOps/s 59.3029 KOps/s $\textbf{\color{#35bf28}+16.19\%}$
test_set_shared 1.6666ms 97.5790μs 10.2481 KOps/s 10.1996 KOps/s $\color{#35bf28}+0.48\%$
test_update 91.7120μs 17.6388μs 56.6933 KOps/s 48.8767 KOps/s $\textbf{\color{#35bf28}+15.99\%}$
test_update_nested 69.1510μs 23.4459μs 42.6513 KOps/s 38.4163 KOps/s $\textbf{\color{#35bf28}+11.02\%}$
test_update__nested 63.4010μs 21.3403μs 46.8598 KOps/s 44.8314 KOps/s $\color{#35bf28}+4.52\%$
test_set_nested 59.0910μs 15.5441μs 64.3331 KOps/s 55.8057 KOps/s $\textbf{\color{#35bf28}+15.28\%}$
test_set_nested_new 57.8710μs 18.3680μs 54.4425 KOps/s 47.6065 KOps/s $\textbf{\color{#35bf28}+14.36\%}$
test_select 71.4120μs 31.1638μs 32.0885 KOps/s 29.9708 KOps/s $\textbf{\color{#35bf28}+7.07\%}$
test_select_nested 0.4529ms 50.9990μs 19.6082 KOps/s 19.1784 KOps/s $\color{#35bf28}+2.24\%$
test_exclude_nested 0.1417ms 0.1068ms 9.3643 KOps/s 9.0528 KOps/s $\color{#35bf28}+3.44\%$
test_empty[True] 0.3972ms 0.3428ms 2.9172 KOps/s 2.8661 KOps/s $\color{#35bf28}+1.79\%$
test_empty[False] 3.1061μs 0.8396μs 1.1910 MOps/s 1.2079 MOps/s $\color{#d91a1a}-1.39\%$
test_to 89.1320μs 58.3480μs 17.1385 KOps/s 17.0038 KOps/s $\color{#35bf28}+0.79\%$
test_to_nonblocking 79.2410μs 36.0355μs 27.7504 KOps/s 28.7750 KOps/s $\color{#d91a1a}-3.56\%$
test_unbind_speed 1.5801ms 0.2584ms 3.8701 KOps/s 3.7961 KOps/s $\color{#35bf28}+1.95\%$
test_unbind_speed_stack0 0.3236ms 0.2574ms 3.8854 KOps/s 3.6915 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_unbind_speed_stack1 76.4954ms 0.7769ms 1.2871 KOps/s 1.2742 KOps/s $\color{#35bf28}+1.01\%$
test_split 76.6317ms 1.6352ms 611.5513 Ops/s 596.7821 Ops/s $\color{#35bf28}+2.47\%$
test_chunk 76.7409ms 1.6275ms 614.4391 Ops/s 610.6804 Ops/s $\color{#35bf28}+0.62\%$
test_creation[device0] 0.1369ms 56.6034μs 17.6668 KOps/s 17.6169 KOps/s $\color{#35bf28}+0.28\%$
test_creation_from_tensor 0.1326ms 53.1281μs 18.8224 KOps/s 17.4518 KOps/s $\textbf{\color{#35bf28}+7.85\%}$
test_add_one[memmap_tensor0] 73.6110μs 6.6404μs 150.5934 KOps/s 150.9850 KOps/s $\color{#d91a1a}-0.26\%$
test_contiguous[memmap_tensor0] 11.4510μs 0.6643μs 1.5053 MOps/s 1.4683 MOps/s $\color{#35bf28}+2.52\%$
test_stack[memmap_tensor0] 19.2800μs 4.8297μs 207.0511 KOps/s 210.4842 KOps/s $\color{#d91a1a}-1.63\%$
test_memmaptd_index 0.9883ms 0.2754ms 3.6316 KOps/s 3.6270 KOps/s $\color{#35bf28}+0.13\%$
test_memmaptd_index_astensor 0.6407ms 0.3344ms 2.9907 KOps/s 2.9821 KOps/s $\color{#35bf28}+0.29\%$
test_memmaptd_index_op 1.0053ms 0.6077ms 1.6455 KOps/s 1.5819 KOps/s $\color{#35bf28}+4.02\%$
test_serialize_model 99.8213ms 96.2107ms 10.3939 Ops/s 10.0416 Ops/s $\color{#35bf28}+3.51\%$
test_serialize_model_pickle 1.3511s 1.2359s 0.8091 Ops/s 0.8088 Ops/s $\color{#35bf28}+0.05\%$
test_serialize_weights 97.9463ms 93.8256ms 10.6581 Ops/s 9.2265 Ops/s $\textbf{\color{#35bf28}+15.52\%}$
test_serialize_weights_returnearly 0.2142s 83.0631ms 12.0390 Ops/s 11.4956 Ops/s $\color{#35bf28}+4.73\%$
test_serialize_weights_pickle 1.3460s 1.2352s 0.8096 Ops/s 0.8044 Ops/s $\color{#35bf28}+0.65\%$
test_reshape_pytree 56.0910μs 25.8949μs 38.6176 KOps/s 38.2936 KOps/s $\color{#35bf28}+0.85\%$
test_reshape_td 67.5520μs 32.2199μs 31.0367 KOps/s 31.5624 KOps/s $\color{#d91a1a}-1.67\%$
test_view_pytree 0.2430ms 26.3742μs 37.9159 KOps/s 39.4365 KOps/s $\color{#d91a1a}-3.86\%$
test_view_td 64.5810μs 36.7838μs 27.1859 KOps/s 27.1965 KOps/s $\color{#d91a1a}-0.04\%$
test_unbind_pytree 0.2307ms 32.0740μs 31.1779 KOps/s 31.6468 KOps/s $\color{#d91a1a}-1.48\%$
test_unbind_td 0.4410ms 39.6028μs 25.2508 KOps/s 25.4661 KOps/s $\color{#d91a1a}-0.85\%$
test_split_pytree 0.2760ms 38.0041μs 26.3130 KOps/s 26.5757 KOps/s $\color{#d91a1a}-0.99\%$
test_split_td 0.1722ms 38.9162μs 25.6962 KOps/s 26.3241 KOps/s $\color{#d91a1a}-2.39\%$
test_add_pytree 69.6420μs 38.2322μs 26.1560 KOps/s 27.6741 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_add_td 0.2899ms 47.8709μs 20.8895 KOps/s 20.7811 KOps/s $\color{#35bf28}+0.52\%$
test_distributed 0.9743ms 72.2924μs 13.8327 KOps/s 14.1303 KOps/s $\color{#d91a1a}-2.11\%$
test_tdmodule 75.8520μs 15.0004μs 66.6648 KOps/s 63.9710 KOps/s $\color{#35bf28}+4.21\%$
test_tdmodule_dispatch 51.0210μs 28.6107μs 34.9519 KOps/s 33.1775 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_tdseq 31.4510μs 16.1901μs 61.7660 KOps/s 58.8691 KOps/s $\color{#35bf28}+4.92\%$
test_tdseq_dispatch 55.5710μs 33.0125μs 30.2916 KOps/s 30.0629 KOps/s $\color{#35bf28}+0.76\%$
test_instantiation_functorch 1.4885ms 1.4181ms 705.1478 Ops/s 714.5185 Ops/s $\color{#d91a1a}-1.31\%$
test_instantiation_td 79.4214ms 1.0856ms 921.1445 Ops/s 1.0243 KOps/s $\textbf{\color{#d91a1a}-10.07\%}$
test_exec_functorch 0.2279ms 0.1468ms 6.8108 KOps/s 7.0104 KOps/s $\color{#d91a1a}-2.85\%$
test_exec_functional_call 0.1879ms 0.1338ms 7.4763 KOps/s 7.7960 KOps/s $\color{#d91a1a}-4.10\%$
test_exec_td 0.1985ms 0.1316ms 7.5984 KOps/s 7.9281 KOps/s $\color{#d91a1a}-4.16\%$
test_exec_td_decorator 0.7935ms 0.2062ms 4.8507 KOps/s 4.9481 KOps/s $\color{#d91a1a}-1.97\%$
test_vmap_mlp_speed[True-True] 0.7065ms 0.5652ms 1.7692 KOps/s 1.7690 KOps/s $\color{#35bf28}+0.01\%$
test_vmap_mlp_speed[True-False] 1.2797ms 0.5651ms 1.7695 KOps/s 1.7632 KOps/s $\color{#35bf28}+0.36\%$
test_vmap_mlp_speed[False-True] 0.8272ms 0.5031ms 1.9876 KOps/s 1.8956 KOps/s $\color{#35bf28}+4.85\%$
test_vmap_mlp_speed[False-False] 0.7033ms 0.4940ms 2.0241 KOps/s 1.9602 KOps/s $\color{#35bf28}+3.26\%$
test_vmap_mlp_speed_decorator[True-True] 0.9101ms 0.6303ms 1.5865 KOps/s 1.5931 KOps/s $\color{#d91a1a}-0.41\%$
test_vmap_mlp_speed_decorator[True-False] 0.8436ms 0.6275ms 1.5936 KOps/s 1.5962 KOps/s $\color{#d91a1a}-0.16\%$
test_vmap_mlp_speed_decorator[False-True] 0.7746ms 0.5550ms 1.8017 KOps/s 1.8099 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_mlp_speed_decorator[False-False] 0.7572ms 0.5557ms 1.7996 KOps/s 1.8123 KOps/s $\color{#d91a1a}-0.70\%$
test_vmap_transformer_speed[True-True] 7.7079ms 7.4342ms 134.5129 Ops/s 134.4408 Ops/s $\color{#35bf28}+0.05\%$
test_vmap_transformer_speed[True-False] 8.5667ms 7.6304ms 131.0549 Ops/s 133.6932 Ops/s $\color{#d91a1a}-1.97\%$
test_vmap_transformer_speed[False-True] 8.0918ms 7.5345ms 132.7224 Ops/s 136.0689 Ops/s $\color{#d91a1a}-2.46\%$
test_vmap_transformer_speed[False-False] 7.4948ms 7.3094ms 136.8104 Ops/s 136.1486 Ops/s $\color{#35bf28}+0.49\%$
test_vmap_transformer_speed_decorator[True-True] 17.9830ms 17.9081ms 55.8407 Ops/s 54.5061 Ops/s $\color{#35bf28}+2.45\%$
test_vmap_transformer_speed_decorator[True-False] 18.1747ms 17.9349ms 55.7571 Ops/s 55.6750 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_transformer_speed_decorator[False-True] 17.9470ms 17.7258ms 56.4150 Ops/s 55.6636 Ops/s $\color{#35bf28}+1.35\%$
test_vmap_transformer_speed_decorator[False-False] 18.4502ms 17.7947ms 56.1966 Ops/s 55.9376 Ops/s $\color{#35bf28}+0.46\%$
test_to_module_speed[True] 1.6131ms 1.5009ms 666.2806 Ops/s 669.2818 Ops/s $\color{#d91a1a}-0.45\%$
test_to_module_speed[False] 1.5876ms 1.4836ms 674.0280 Ops/s 682.8866 Ops/s $\color{#d91a1a}-1.30\%$
test_tc_init 41.3800μs 22.6447μs 44.1604 KOps/s 38.4013 KOps/s $\textbf{\color{#35bf28}+15.00\%}$
test_tc_init_nested 71.6510μs 49.1727μs 20.3365 KOps/s 18.1998 KOps/s $\textbf{\color{#35bf28}+11.74\%}$
test_tc_first_layer_tensor 0.7895μs 0.3707μs 2.6973 MOps/s 2.6118 MOps/s $\color{#35bf28}+3.27\%$
test_tc_first_layer_nontensor 2.2485μs 0.4017μs 2.4892 MOps/s 2.4779 MOps/s $\color{#35bf28}+0.46\%$
test_tc_second_layer_tensor 3.6660μs 1.0037μs 996.2702 KOps/s 978.2449 KOps/s $\color{#35bf28}+1.84\%$
test_tc_second_layer_nontensor 3.3968μs 0.8521μs 1.1736 MOps/s 1.1484 MOps/s $\color{#35bf28}+2.19\%$
test_unbind 0.1073s 7.9946ms 125.0842 Ops/s 146.0244 Ops/s $\textbf{\color{#d91a1a}-14.34\%}$
test_full_like 11.4563ms 11.1602ms 89.6045 Ops/s 74.2490 Ops/s $\textbf{\color{#35bf28}+20.68\%}$
test_zeros_like 7.0886ms 6.9132ms 144.6516 Ops/s 125.8041 Ops/s $\textbf{\color{#35bf28}+14.98\%}$
test_ones_like 7.1171ms 6.9213ms 144.4825 Ops/s 125.8167 Ops/s $\textbf{\color{#35bf28}+14.84\%}$
test_clone 9.0264ms 8.6445ms 115.6798 Ops/s 104.8557 Ops/s $\textbf{\color{#35bf28}+10.32\%}$
test_squeeze 72.2310μs 11.4353μs 87.4483 KOps/s 92.5922 KOps/s $\textbf{\color{#d91a1a}-5.56\%}$
test_unsqueeze 96.2620μs 52.9851μs 18.8732 KOps/s 19.3836 KOps/s $\color{#d91a1a}-2.63\%$
test_split 0.1380ms 97.0660μs 10.3023 KOps/s 10.1798 KOps/s $\color{#35bf28}+1.20\%$
test_permute 0.1523ms 0.1142ms 8.7574 KOps/s 8.8696 KOps/s $\color{#d91a1a}-1.27\%$
test_stack 28.6752ms 27.2581ms 36.6863 Ops/s 36.6974 Ops/s $\color{#d91a1a}-0.03\%$
test_cat 27.4208ms 27.1219ms 36.8706 Ops/s 36.8960 Ops/s $\color{#d91a1a}-0.07\%$

@vmoens vmoens merged commit 4838205 into main Jul 3, 2024
41 of 43 checks passed
@vmoens vmoens deleted the v0.5-bump branch July 3, 2024 14:53
Labels: CLA Signed, versioning
2 participants