[BugFix] Fix _make_dtype_promotion backward compat #842

Merged
vmoens merged 1 commit into main from patch_make_dtype_promotion on Jun 27, 2024

Conversation

@vmoens (Contributor) commented Jun 27, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jun 27, 2024
@vmoens merged commit 0fa000c into main Jun 27, 2024
3 checks passed

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 55.3540μs 17.7103μs 56.4644 KOps/s 60.0864 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_plain_set_stack_nested 63.0680μs 17.7788μs 56.2466 KOps/s 60.0870 KOps/s $\textbf{\color{#d91a1a}-6.39\%}$
test_plain_set_nested_inplace 45.7660μs 19.7439μs 50.6486 KOps/s 52.6535 KOps/s $\color{#d91a1a}-3.81\%$
test_plain_set_stack_nested_inplace 67.1860μs 19.7532μs 50.6246 KOps/s 53.0655 KOps/s $\color{#d91a1a}-4.60\%$
test_items 27.7420μs 2.6286μs 380.4258 KOps/s 362.4521 KOps/s $\color{#35bf28}+4.96\%$
test_items_nested 0.4451ms 0.2649ms 3.7743 KOps/s 3.8171 KOps/s $\color{#d91a1a}-1.12\%$
test_items_nested_locked 1.3731ms 0.2655ms 3.7661 KOps/s 3.7867 KOps/s $\color{#d91a1a}-0.54\%$
test_items_nested_leaf 0.1553ms 76.4910μs 13.0734 KOps/s 12.8656 KOps/s $\color{#35bf28}+1.62\%$
test_items_stack_nested 0.5304ms 0.2651ms 3.7721 KOps/s 3.7818 KOps/s $\color{#d91a1a}-0.26\%$
test_items_stack_nested_leaf 0.1345ms 77.3703μs 12.9249 KOps/s 12.5115 KOps/s $\color{#35bf28}+3.30\%$
test_items_stack_nested_locked 1.2965ms 0.2653ms 3.7688 KOps/s 3.7486 KOps/s $\color{#35bf28}+0.54\%$
test_keys 19.8470μs 3.8461μs 260.0016 KOps/s 256.8036 KOps/s $\color{#35bf28}+1.25\%$
test_keys_nested 0.2711ms 0.1400ms 7.1420 KOps/s 7.1726 KOps/s $\color{#d91a1a}-0.43\%$
test_keys_nested_locked 0.6855ms 0.1452ms 6.8865 KOps/s 6.9369 KOps/s $\color{#d91a1a}-0.73\%$
test_keys_nested_leaf 0.2021ms 0.1192ms 8.3891 KOps/s 8.5311 KOps/s $\color{#d91a1a}-1.66\%$
test_keys_stack_nested 0.3509ms 0.1412ms 7.0839 KOps/s 7.2263 KOps/s $\color{#d91a1a}-1.97\%$
test_keys_stack_nested_leaf 0.2171ms 0.1194ms 8.3767 KOps/s 8.5227 KOps/s $\color{#d91a1a}-1.71\%$
test_keys_stack_nested_locked 0.2727ms 0.1450ms 6.8966 KOps/s 7.0494 KOps/s $\color{#d91a1a}-2.17\%$
test_values 8.4940μs 1.1295μs 885.3185 KOps/s 867.1205 KOps/s $\color{#35bf28}+2.10\%$
test_values_nested 0.2329ms 51.6144μs 19.3744 KOps/s 19.5584 KOps/s $\color{#d91a1a}-0.94\%$
test_values_nested_locked 0.1043ms 51.1561μs 19.5480 KOps/s 19.7291 KOps/s $\color{#d91a1a}-0.92\%$
test_values_nested_leaf 92.3430μs 46.1655μs 21.6612 KOps/s 21.8144 KOps/s $\color{#d91a1a}-0.70\%$
test_values_stack_nested 0.1049ms 51.7077μs 19.3395 KOps/s 19.1167 KOps/s $\color{#35bf28}+1.17\%$
test_values_stack_nested_leaf 95.0890μs 46.5551μs 21.4799 KOps/s 21.6953 KOps/s $\color{#d91a1a}-0.99\%$
test_values_stack_nested_locked 99.3370μs 51.7264μs 19.3325 KOps/s 19.0811 KOps/s $\color{#35bf28}+1.32\%$
test_membership 16.9720μs 1.3425μs 744.8538 KOps/s 749.1682 KOps/s $\color{#d91a1a}-0.58\%$
test_membership_nested 39.7040μs 3.4151μs 292.8194 KOps/s 283.6397 KOps/s $\color{#35bf28}+3.24\%$
test_membership_nested_leaf 26.3400μs 3.4368μs 290.9677 KOps/s 290.4522 KOps/s $\color{#35bf28}+0.18\%$
test_membership_stacked_nested 40.3760μs 3.3864μs 295.2984 KOps/s 286.6216 KOps/s $\color{#35bf28}+3.03\%$
test_membership_stacked_nested_leaf 23.2240μs 3.4162μs 292.7258 KOps/s 296.4102 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_nested_last 39.3340μs 4.1707μs 239.7695 KOps/s 241.5917 KOps/s $\color{#d91a1a}-0.75\%$
test_membership_nested_leaf_last 32.1900μs 4.2040μs 237.8680 KOps/s 238.4623 KOps/s $\color{#d91a1a}-0.25\%$
test_membership_stacked_nested_last 22.5120μs 4.1620μs 240.2674 KOps/s 243.9938 KOps/s $\color{#d91a1a}-1.53\%$
test_membership_stacked_nested_leaf_last 29.8160μs 4.0712μs 245.6257 KOps/s 241.9504 KOps/s $\color{#35bf28}+1.52\%$
test_nested_getleaf 49.6740μs 10.6022μs 94.3200 KOps/s 96.6881 KOps/s $\color{#d91a1a}-2.45\%$
test_nested_get 39.9950μs 10.1757μs 98.2731 KOps/s 101.4749 KOps/s $\color{#d91a1a}-3.16\%$
test_stacked_getleaf 26.5000μs 10.5995μs 94.3442 KOps/s 97.2695 KOps/s $\color{#d91a1a}-3.01\%$
test_stacked_get 44.4750μs 9.9757μs 100.2436 KOps/s 102.0948 KOps/s $\color{#d91a1a}-1.81\%$
test_nested_getitemleaf 36.2080μs 11.2487μs 88.8993 KOps/s 91.2769 KOps/s $\color{#d91a1a}-2.60\%$
test_nested_getitem 50.1640μs 10.4380μs 95.8034 KOps/s 99.1377 KOps/s $\color{#d91a1a}-3.36\%$
test_stacked_getitemleaf 77.7660μs 11.1530μs 89.6622 KOps/s 91.2473 KOps/s $\color{#d91a1a}-1.74\%$
test_stacked_getitem 50.9150μs 10.3734μs 96.4007 KOps/s 100.6281 KOps/s $\color{#d91a1a}-4.20\%$
test_lock_nested 48.0267ms 0.3930ms 2.5448 KOps/s 2.9545 KOps/s $\textbf{\color{#d91a1a}-13.87\%}$
test_lock_stack_nested 0.7173ms 0.3158ms 3.1663 KOps/s 3.2830 KOps/s $\color{#d91a1a}-3.55\%$
test_unlock_nested 0.6665ms 0.3460ms 2.8903 KOps/s 2.9081 KOps/s $\color{#d91a1a}-0.61\%$
test_unlock_stack_nested 0.6596ms 0.3244ms 3.0831 KOps/s 3.1829 KOps/s $\color{#d91a1a}-3.14\%$
test_flatten_speed 0.1967ms 95.9391μs 10.4233 KOps/s 10.4457 KOps/s $\color{#d91a1a}-0.21\%$
test_unflatten_speed 0.7220ms 0.4165ms 2.4008 KOps/s 2.4253 KOps/s $\color{#d91a1a}-1.01\%$
test_common_ops 3.8861ms 0.7522ms 1.3295 KOps/s 1.3956 KOps/s $\color{#d91a1a}-4.73\%$
test_creation 49.7930μs 1.9619μs 509.7128 KOps/s 532.4408 KOps/s $\color{#d91a1a}-4.27\%$
test_creation_empty 31.8400μs 11.5150μs 86.8432 KOps/s 100.4176 KOps/s $\textbf{\color{#d91a1a}-13.52\%}$
test_creation_nested_1 57.1270μs 14.3477μs 69.6974 KOps/s 78.3909 KOps/s $\textbf{\color{#d91a1a}-11.09\%}$
test_creation_nested_2 43.6720μs 17.3746μs 57.5552 KOps/s 62.9438 KOps/s $\textbf{\color{#d91a1a}-8.56\%}$
test_clone 77.1450μs 13.3070μs 75.1483 KOps/s 74.0075 KOps/s $\color{#35bf28}+1.54\%$
test_getitem[int] 47.3190μs 11.5925μs 86.2627 KOps/s 88.3975 KOps/s $\color{#d91a1a}-2.41\%$
test_getitem[slice_int] 80.8330μs 22.8693μs 43.7267 KOps/s 44.8287 KOps/s $\color{#d91a1a}-2.46\%$
test_getitem[range] 79.2090μs 58.7404μs 17.0241 KOps/s 16.8546 KOps/s $\color{#35bf28}+1.01\%$
test_getitem[tuple] 53.4000μs 18.9926μs 52.6521 KOps/s 53.2898 KOps/s $\color{#d91a1a}-1.20\%$
test_getitem[list] 0.1069ms 40.7528μs 24.5382 KOps/s 24.2163 KOps/s $\color{#35bf28}+1.33\%$
test_setitem_dim[int] 96.0200μs 37.1283μs 26.9336 KOps/s 29.2298 KOps/s $\textbf{\color{#d91a1a}-7.86\%}$
test_setitem_dim[slice_int] 0.1017ms 64.8672μs 15.4161 KOps/s 16.5797 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_setitem_dim[range] 0.1476ms 87.0964μs 11.4815 KOps/s 12.0045 KOps/s $\color{#d91a1a}-4.36\%$
test_setitem_dim[tuple] 0.1009ms 53.1101μs 18.8288 KOps/s 20.3930 KOps/s $\textbf{\color{#d91a1a}-7.67\%}$
test_setitem 80.4200μs 21.5403μs 46.4245 KOps/s 48.7903 KOps/s $\color{#d91a1a}-4.85\%$
test_set 85.4700μs 20.7183μs 48.2666 KOps/s 50.1943 KOps/s $\color{#d91a1a}-3.84\%$
test_set_shared 3.5246ms 0.1463ms 6.8363 KOps/s 6.9299 KOps/s $\color{#d91a1a}-1.35\%$
test_update 81.3130μs 23.9184μs 41.8088 KOps/s 45.4550 KOps/s $\textbf{\color{#d91a1a}-8.02\%}$
test_update_nested 78.4870μs 32.7385μs 30.5451 KOps/s 31.7734 KOps/s $\color{#d91a1a}-3.87\%$
test_update__nested 79.9500μs 25.7129μs 38.8911 KOps/s 39.2786 KOps/s $\color{#d91a1a}-0.99\%$
test_set_nested 58.4000μs 22.3441μs 44.7544 KOps/s 46.3686 KOps/s $\color{#d91a1a}-3.48\%$
test_set_nested_new 88.9260μs 26.5642μs 37.6447 KOps/s 38.8877 KOps/s $\color{#d91a1a}-3.20\%$
test_select 0.1026ms 42.6779μs 23.4314 KOps/s 24.2199 KOps/s $\color{#d91a1a}-3.26\%$
test_select_nested 0.1231ms 61.0495μs 16.3801 KOps/s 16.6230 KOps/s $\color{#d91a1a}-1.46\%$
test_exclude_nested 0.2637ms 0.1225ms 8.1649 KOps/s 8.2924 KOps/s $\color{#d91a1a}-1.54\%$
test_empty[True] 6.4191ms 0.4054ms 2.4665 KOps/s 2.5388 KOps/s $\color{#d91a1a}-2.85\%$
test_empty[False] 8.5285μs 1.1689μs 855.5265 KOps/s 883.4319 KOps/s $\color{#d91a1a}-3.16\%$
test_unbind_speed 0.3333ms 0.2609ms 3.8324 KOps/s 3.9025 KOps/s $\color{#d91a1a}-1.80\%$
test_unbind_speed_stack0 0.4723ms 0.2588ms 3.8647 KOps/s 4.0350 KOps/s $\color{#d91a1a}-4.22\%$
test_unbind_speed_stack1 65.4266ms 0.7319ms 1.3663 KOps/s 1.4080 KOps/s $\color{#d91a1a}-2.96\%$
test_split 70.8134ms 1.5999ms 625.0545 Ops/s 625.5295 Ops/s $\color{#d91a1a}-0.08\%$
test_chunk 66.5128ms 1.6001ms 624.9447 Ops/s 627.5064 Ops/s $\color{#d91a1a}-0.41\%$
test_creation[device0] 0.1688ms 84.6147μs 11.8183 KOps/s 11.8245 KOps/s $\color{#d91a1a}-0.05\%$
test_creation_from_tensor 3.4892ms 87.2343μs 11.4634 KOps/s 11.6895 KOps/s $\color{#d91a1a}-1.93\%$
test_add_one[memmap_tensor0] 57.2470μs 5.3478μs 186.9912 KOps/s 179.6468 KOps/s $\color{#35bf28}+4.09\%$
test_contiguous[memmap_tensor0] 19.6370μs 0.6765μs 1.4783 MOps/s 1.5835 MOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_stack[memmap_tensor0] 27.0200μs 3.6185μs 276.3542 KOps/s 278.5017 KOps/s $\color{#d91a1a}-0.77\%$
test_memmaptd_index 0.9465ms 0.2577ms 3.8804 KOps/s 3.8898 KOps/s $\color{#d91a1a}-0.24\%$
test_memmaptd_index_astensor 1.0549ms 0.3348ms 2.9867 KOps/s 3.0124 KOps/s $\color{#d91a1a}-0.86\%$
test_memmaptd_index_op 0.9225ms 0.6334ms 1.5787 KOps/s 1.6559 KOps/s $\color{#d91a1a}-4.66\%$
test_serialize_model 0.1641s 0.1113s 8.9875 Ops/s 8.8126 Ops/s $\color{#35bf28}+1.99\%$
test_serialize_model_pickle 0.4479s 0.3749s 2.6674 Ops/s 2.6364 Ops/s $\color{#35bf28}+1.18\%$
test_serialize_weights 0.1674s 0.1130s 8.8481 Ops/s 9.2638 Ops/s $\color{#d91a1a}-4.49\%$
test_serialize_weights_returnearly 0.1905s 0.1338s 7.4739 Ops/s 7.2129 Ops/s $\color{#35bf28}+3.62\%$
test_serialize_weights_pickle 0.8452s 0.5165s 1.9362 Ops/s 2.2778 Ops/s $\textbf{\color{#d91a1a}-15.00\%}$
test_serialize_weights_filesystem 0.1520s 97.6986ms 10.2356 Ops/s 10.4918 Ops/s $\color{#d91a1a}-2.44\%$
test_serialize_model_filesystem 0.1003s 92.9669ms 10.7565 Ops/s 9.7724 Ops/s $\textbf{\color{#35bf28}+10.07\%}$
test_reshape_pytree 91.3840μs 25.9002μs 38.6097 KOps/s 38.6569 KOps/s $\color{#d91a1a}-0.12\%$
test_reshape_td 78.2370μs 34.7075μs 28.8122 KOps/s 29.2915 KOps/s $\color{#d91a1a}-1.64\%$
test_view_pytree 57.9490μs 25.5521μs 39.1357 KOps/s 38.8632 KOps/s $\color{#35bf28}+0.70\%$
test_view_td 83.5270μs 40.0234μs 24.9854 KOps/s 25.5857 KOps/s $\color{#d91a1a}-2.35\%$
test_unbind_pytree 77.9660μs 29.6111μs 33.7711 KOps/s 33.4877 KOps/s $\color{#35bf28}+0.85\%$
test_unbind_td 0.4280ms 38.1829μs 26.1898 KOps/s 26.5741 KOps/s $\color{#d91a1a}-1.45\%$
test_split_pytree 69.7110μs 29.3253μs 34.1002 KOps/s 33.3340 KOps/s $\color{#35bf28}+2.30\%$
test_split_td 0.5354ms 40.7229μs 24.5562 KOps/s 24.7232 KOps/s $\color{#d91a1a}-0.68\%$
test_add_pytree 79.6890μs 35.3846μs 28.2609 KOps/s 27.9761 KOps/s $\color{#35bf28}+1.02\%$
test_add_td 0.1777ms 59.6394μs 16.7674 KOps/s 18.4779 KOps/s $\textbf{\color{#d91a1a}-9.26\%}$
test_distributed 0.2173ms 99.8316μs 10.0169 KOps/s 9.7284 KOps/s $\color{#35bf28}+2.97\%$
test_tdmodule 41.7680μs 18.0518μs 55.3962 KOps/s 56.3438 KOps/s $\color{#d91a1a}-1.68\%$
test_tdmodule_dispatch 69.4000μs 35.9701μs 27.8008 KOps/s 28.5292 KOps/s $\color{#d91a1a}-2.55\%$
test_tdseq 40.7170μs 20.7750μs 48.1348 KOps/s 47.9493 KOps/s $\color{#35bf28}+0.39\%$
test_tdseq_dispatch 75.5810μs 40.3784μs 24.7657 KOps/s 24.6810 KOps/s $\color{#35bf28}+0.34\%$
test_instantiation_functorch 1.5059ms 1.3138ms 761.1327 Ops/s 762.1626 Ops/s $\color{#d91a1a}-0.14\%$
test_instantiation_td 1.8900ms 1.0364ms 964.8747 Ops/s 977.2380 Ops/s $\color{#d91a1a}-1.27\%$
test_exec_functorch 0.3597ms 0.1589ms 6.2950 KOps/s 6.1273 KOps/s $\color{#35bf28}+2.74\%$
test_exec_functional_call 0.2970ms 0.1504ms 6.6489 KOps/s 6.6297 KOps/s $\color{#35bf28}+0.29\%$
test_exec_td 0.2932ms 0.1466ms 6.8234 KOps/s 6.8598 KOps/s $\color{#d91a1a}-0.53\%$
test_exec_td_decorator 0.9000ms 0.2237ms 4.4712 KOps/s 4.5075 KOps/s $\color{#d91a1a}-0.80\%$
test_vmap_mlp_speed[True-True] 0.7853ms 0.4928ms 2.0293 KOps/s 2.0365 KOps/s $\color{#d91a1a}-0.36\%$
test_vmap_mlp_speed[True-False] 0.7630ms 0.4914ms 2.0349 KOps/s 2.0655 KOps/s $\color{#d91a1a}-1.48\%$
test_vmap_mlp_speed[False-True] 0.6016ms 0.3987ms 2.5081 KOps/s 2.5003 KOps/s $\color{#35bf28}+0.31\%$
test_vmap_mlp_speed[False-False] 0.5918ms 0.3979ms 2.5130 KOps/s 2.5136 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[True-True] 1.0859ms 0.5628ms 1.7767 KOps/s 1.7977 KOps/s $\color{#d91a1a}-1.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.8932ms 0.5653ms 1.7688 KOps/s 1.7915 KOps/s $\color{#d91a1a}-1.27\%$
test_vmap_mlp_speed_decorator[False-True] 0.6652ms 0.4601ms 2.1734 KOps/s 2.1808 KOps/s $\color{#d91a1a}-0.34\%$
test_vmap_mlp_speed_decorator[False-False] 0.7507ms 0.4612ms 2.1683 KOps/s 2.1777 KOps/s $\color{#d91a1a}-0.43\%$
test_to_module_speed[True] 73.6143ms 1.8164ms 550.5355 Ops/s 603.7795 Ops/s $\textbf{\color{#d91a1a}-8.82\%}$
test_to_module_speed[False] 2.2824ms 1.6636ms 601.1061 Ops/s 617.5449 Ops/s $\color{#d91a1a}-2.66\%$
test_tc_init 73.6880μs 31.0885μs 32.1662 KOps/s 35.3131 KOps/s $\textbf{\color{#d91a1a}-8.91\%}$
test_tc_init_nested 0.1515ms 62.3082μs 16.0493 KOps/s 17.8798 KOps/s $\textbf{\color{#d91a1a}-10.24\%}$
test_tc_first_layer_tensor 3.9646μs 0.6756μs 1.4802 MOps/s 1.4587 MOps/s $\color{#35bf28}+1.48\%$
test_tc_first_layer_nontensor 2.1335μs 0.6663μs 1.5008 MOps/s 1.5138 MOps/s $\color{#d91a1a}-0.86\%$
test_tc_second_layer_tensor 16.8010μs 1.8574μs 538.3749 KOps/s 552.1394 KOps/s $\color{#d91a1a}-2.49\%$
test_tc_second_layer_nontensor 14.7980μs 1.6459μs 607.5569 KOps/s 671.6788 KOps/s $\textbf{\color{#d91a1a}-9.55\%}$
test_unbind 81.6286ms 7.0817ms 141.2092 Ops/s 131.0674 Ops/s $\textbf{\color{#35bf28}+7.74\%}$
test_full_like 19.4655ms 11.7688ms 84.9704 Ops/s 83.9510 Ops/s $\color{#35bf28}+1.21\%$
test_zeros_like 14.4214ms 5.7658ms 173.4372 Ops/s 172.1134 Ops/s $\color{#35bf28}+0.77\%$
test_ones_like 6.9768ms 6.1388ms 162.8991 Ops/s 157.7602 Ops/s $\color{#35bf28}+3.26\%$
test_clone 8.2621ms 7.8583ms 127.2545 Ops/s 128.0410 Ops/s $\color{#d91a1a}-0.61\%$
test_squeeze 59.9730μs 14.3360μs 69.7547 KOps/s 70.8139 KOps/s $\color{#d91a1a}-1.50\%$
test_unsqueeze 0.1953ms 59.5149μs 16.8025 KOps/s 16.2912 KOps/s $\color{#35bf28}+3.14\%$
test_split 0.1699ms 0.1124ms 8.8995 KOps/s 8.8785 KOps/s $\color{#35bf28}+0.24\%$
test_permute 0.1969ms 0.1249ms 8.0053 KOps/s 7.8086 KOps/s $\color{#35bf28}+2.52\%$
test_stack 28.3265ms 22.0195ms 45.4143 Ops/s 45.6400 Ops/s $\color{#d91a1a}-0.49\%$
test_cat 26.7834ms 21.9120ms 45.6372 Ops/s 45.3475 Ops/s $\color{#35bf28}+0.64\%$
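
The columns in the table above can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the benchmark tooling: the function names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are rendered in bold, which happens to match the "Improved" and "Worsened" counts in the summary line. It assumes Ops is the reciprocal of the Mean time and that Change is the relative difference between Ops and Ops on Repo HEAD.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not a documented value."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# CPU row test_plain_set_nested: mean 17.7103 us, 56.4644 vs 60.0864 KOps/s.
change = relative_change(56.4644, 60.0864)
print(f"{ops_per_second(17.7103e-6) / 1e3:.2f} KOps/s")
print(f"{change:+.2f}% -> {classify(change)}")
```

Under these assumptions the first CPU row reproduces its reported change of -6.03% and counts as worsened, while a row such as test_items (+4.96%) stays below the cutoff.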

@vmoens vmoens deleted the patch_make_dtype_promotion branch June 27, 2024 12:22

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}28$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 87.5220μs 12.8023μs 78.1109 KOps/s 79.5164 KOps/s $\color{#d91a1a}-1.77\%$
test_plain_set_stack_nested 29.1800μs 12.8068μs 78.0835 KOps/s 80.5393 KOps/s $\color{#d91a1a}-3.05\%$
test_plain_set_nested_inplace 38.2010μs 14.2822μs 70.0174 KOps/s 73.8896 KOps/s $\textbf{\color{#d91a1a}-5.24\%}$
test_plain_set_stack_nested_inplace 36.5410μs 14.1789μs 70.5272 KOps/s 73.2504 KOps/s $\color{#d91a1a}-3.72\%$
test_items 21.4010μs 4.6703μs 214.1182 KOps/s 211.2732 KOps/s $\color{#35bf28}+1.35\%$
test_items_nested 0.3717ms 0.3368ms 2.9691 KOps/s 2.9730 KOps/s $\color{#d91a1a}-0.13\%$
test_items_nested_locked 0.3725ms 0.3367ms 2.9700 KOps/s 2.9240 KOps/s $\color{#35bf28}+1.57\%$
test_items_nested_leaf 0.1084ms 82.5074μs 12.1201 KOps/s 11.9790 KOps/s $\color{#35bf28}+1.18\%$
test_items_stack_nested 0.4305ms 0.3379ms 2.9592 KOps/s 2.8994 KOps/s $\color{#35bf28}+2.06\%$
test_items_stack_nested_leaf 0.1267ms 81.7209μs 12.2368 KOps/s 11.8661 KOps/s $\color{#35bf28}+3.12\%$
test_items_stack_nested_locked 0.3665ms 0.3371ms 2.9666 KOps/s 2.9057 KOps/s $\color{#35bf28}+2.10\%$
test_keys 16.8700μs 4.3639μs 229.1531 KOps/s 226.7030 KOps/s $\color{#35bf28}+1.08\%$
test_keys_nested 96.9120μs 69.3988μs 14.4095 KOps/s 14.8087 KOps/s $\color{#d91a1a}-2.70\%$
test_keys_nested_locked 2.0538ms 74.1114μs 13.4932 KOps/s 13.2865 KOps/s $\color{#35bf28}+1.56\%$
test_keys_nested_leaf 0.1034ms 58.9823μs 16.9542 KOps/s 16.8471 KOps/s $\color{#35bf28}+0.64\%$
test_keys_stack_nested 90.2820μs 66.2229μs 15.1005 KOps/s 14.7450 KOps/s $\color{#35bf28}+2.41\%$
test_keys_stack_nested_leaf 82.9720μs 58.8196μs 17.0011 KOps/s 17.2909 KOps/s $\color{#d91a1a}-1.68\%$
test_keys_stack_nested_locked 91.8220μs 72.2604μs 13.8388 KOps/s 13.5146 KOps/s $\color{#35bf28}+2.40\%$
test_values 7.5537μs 1.8051μs 553.9879 KOps/s 548.5517 KOps/s $\color{#35bf28}+0.99\%$
test_values_nested 56.8020μs 35.6511μs 28.0496 KOps/s 28.0581 KOps/s $\color{#d91a1a}-0.03\%$
test_values_nested_locked 57.4810μs 37.4235μs 26.7212 KOps/s 26.3341 KOps/s $\color{#35bf28}+1.47\%$
test_values_nested_leaf 45.3510μs 31.3722μs 31.8753 KOps/s 31.5837 KOps/s $\color{#35bf28}+0.92\%$
test_values_stack_nested 52.9220μs 35.9871μs 27.7877 KOps/s 27.6820 KOps/s $\color{#35bf28}+0.38\%$
test_values_stack_nested_leaf 50.7710μs 31.8197μs 31.4271 KOps/s 30.9699 KOps/s $\color{#35bf28}+1.48\%$
test_values_stack_nested_locked 57.0810μs 38.0167μs 26.3042 KOps/s 25.9711 KOps/s $\color{#35bf28}+1.28\%$
test_membership 13.9700μs 0.8400μs 1.1905 MOps/s 1.4410 MOps/s $\textbf{\color{#d91a1a}-17.38\%}$
test_membership_nested 18.3600μs 2.5543μs 391.4910 KOps/s 390.9004 KOps/s $\color{#35bf28}+0.15\%$
test_membership_nested_leaf 16.8900μs 2.5395μs 393.7802 KOps/s 387.9449 KOps/s $\color{#35bf28}+1.50\%$
test_membership_stacked_nested 23.9800μs 2.5572μs 391.0500 KOps/s 394.5407 KOps/s $\color{#d91a1a}-0.88\%$
test_membership_stacked_nested_leaf 16.4300μs 2.5985μs 384.8401 KOps/s 394.7629 KOps/s $\color{#d91a1a}-2.51\%$
test_membership_nested_last 16.9310μs 3.1010μs 322.4784 KOps/s 324.2483 KOps/s $\color{#d91a1a}-0.55\%$
test_membership_nested_leaf_last 25.4410μs 3.0604μs 326.7544 KOps/s 325.9315 KOps/s $\color{#35bf28}+0.25\%$
test_membership_stacked_nested_last 34.2110μs 9.6962μs 103.1335 KOps/s 260.5245 KOps/s $\textbf{\color{#d91a1a}-60.41\%}$
test_membership_stacked_nested_leaf_last 28.4210μs 9.6774μs 103.3339 KOps/s 259.0101 KOps/s $\textbf{\color{#d91a1a}-60.10\%}$
test_nested_getleaf 31.0510μs 8.4048μs 118.9796 KOps/s 119.5559 KOps/s $\color{#d91a1a}-0.48\%$
test_nested_get 26.7810μs 7.8611μs 127.2093 KOps/s 126.3257 KOps/s $\color{#35bf28}+0.70\%$
test_stacked_getleaf 25.1500μs 8.3891μs 119.2027 KOps/s 119.6654 KOps/s $\color{#d91a1a}-0.39\%$
test_stacked_get 24.4310μs 7.8258μs 127.7817 KOps/s 127.7921 KOps/s $-0.01\%$
test_nested_getitemleaf 25.8800μs 8.5601μs 116.8217 KOps/s 116.8927 KOps/s $\color{#d91a1a}-0.06\%$
test_nested_getitem 61.7310μs 8.0296μs 124.5399 KOps/s 124.2295 KOps/s $\color{#35bf28}+0.25\%$
test_stacked_getitemleaf 31.8300μs 8.5362μs 117.1482 KOps/s 116.1361 KOps/s $\color{#35bf28}+0.87\%$
test_stacked_getitem 20.5100μs 8.0694μs 123.9252 KOps/s 124.1048 KOps/s $\color{#d91a1a}-0.14\%$
test_lock_nested 57.8097ms 0.4204ms 2.3789 KOps/s 2.4627 KOps/s $\color{#d91a1a}-3.40\%$
test_lock_stack_nested 0.3427ms 0.3057ms 3.2717 KOps/s 3.2412 KOps/s $\color{#35bf28}+0.94\%$
test_unlock_nested 60.0262ms 0.4232ms 2.3632 KOps/s 2.4357 KOps/s $\color{#d91a1a}-2.98\%$
test_unlock_stack_nested 0.3517ms 0.3140ms 3.1850 KOps/s 3.1645 KOps/s $\color{#35bf28}+0.65\%$
test_flatten_speed 0.3605ms 0.1019ms 9.8149 KOps/s 9.8066 KOps/s $\color{#35bf28}+0.09\%$
test_unflatten_speed 0.3745ms 0.2937ms 3.4046 KOps/s 3.3957 KOps/s $\color{#35bf28}+0.26\%$
test_common_ops 1.0675ms 0.6130ms 1.6313 KOps/s 1.7492 KOps/s $\textbf{\color{#d91a1a}-6.74\%}$
test_creation 14.4500μs 1.6394μs 609.9814 KOps/s 620.1143 KOps/s $\color{#d91a1a}-1.63\%$
test_creation_empty 22.8800μs 8.6274μs 115.9104 KOps/s 131.1975 KOps/s $\textbf{\color{#d91a1a}-11.65\%}$
test_creation_nested_1 30.5600μs 10.5084μs 95.1623 KOps/s 104.8683 KOps/s $\textbf{\color{#d91a1a}-9.26\%}$
test_creation_nested_2 26.7800μs 12.6177μs 79.2536 KOps/s 85.9652 KOps/s $\textbf{\color{#d91a1a}-7.81\%}$
test_clone 66.8220μs 12.3254μs 81.1329 KOps/s 83.3710 KOps/s $\color{#d91a1a}-2.68\%$
test_getitem[int] 30.8110μs 11.5169μs 86.8288 KOps/s 91.1464 KOps/s $\color{#d91a1a}-4.74\%$
test_getitem[slice_int] 62.2420μs 22.1884μs 45.0685 KOps/s 47.3698 KOps/s $\color{#d91a1a}-4.86\%$
test_getitem[range] 68.4010μs 49.1430μs 20.3488 KOps/s 19.8928 KOps/s $\color{#35bf28}+2.29\%$
test_getitem[tuple] 43.7310μs 19.5926μs 51.0396 KOps/s 52.1331 KOps/s $\color{#d91a1a}-2.10\%$
test_getitem[list] 0.1317ms 35.9698μs 27.8011 KOps/s 29.2378 KOps/s $\color{#d91a1a}-4.91\%$
test_setitem_dim[int] 62.3710μs 31.0892μs 32.1655 KOps/s 33.8965 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_setitem_dim[slice_int] 84.2620μs 52.2250μs 19.1479 KOps/s 19.8416 KOps/s $\color{#d91a1a}-3.50\%$
test_setitem_dim[range] 0.1003ms 69.7116μs 14.3448 KOps/s 14.7635 KOps/s $\color{#d91a1a}-2.84\%$
test_setitem_dim[tuple] 67.4510μs 45.4694μs 21.9928 KOps/s 22.7867 KOps/s $\color{#d91a1a}-3.48\%$
test_setitem 38.1910μs 17.1573μs 58.2844 KOps/s 61.5869 KOps/s $\textbf{\color{#d91a1a}-5.36\%}$
test_set 52.0620μs 16.7957μs 59.5390 KOps/s 62.3723 KOps/s $\color{#d91a1a}-4.54\%$
test_set_shared 1.6987ms 0.1021ms 9.7912 KOps/s 10.0963 KOps/s $\color{#d91a1a}-3.02\%$
test_update 86.4420μs 19.7633μs 50.5989 KOps/s 55.2623 KOps/s $\textbf{\color{#d91a1a}-8.44\%}$
test_update_nested 73.3120μs 24.8417μs 40.2548 KOps/s 42.4132 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_update__nested 50.9410μs 22.7385μs 43.9784 KOps/s 44.2592 KOps/s $\color{#d91a1a}-0.63\%$
test_set_nested 74.0020μs 18.0084μs 55.5297 KOps/s 58.6926 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_set_nested_new 67.6210μs 20.9563μs 47.7184 KOps/s 50.0010 KOps/s $\color{#d91a1a}-4.57\%$
test_select 66.9910μs 33.4100μs 29.9311 KOps/s 29.6588 KOps/s $\color{#35bf28}+0.92\%$
test_select_nested 0.9297ms 55.1038μs 18.1476 KOps/s 17.9937 KOps/s $\color{#35bf28}+0.85\%$
test_exclude_nested 0.1323ms 0.1104ms 9.0578 KOps/s 9.0208 KOps/s $\color{#35bf28}+0.41\%$
test_empty[True] 0.3937ms 0.3441ms 2.9065 KOps/s 2.8700 KOps/s $\color{#35bf28}+1.27\%$
test_empty[False] 3.0151μs 0.9358μs 1.0686 MOps/s 1.0340 MOps/s $\color{#35bf28}+3.34\%$
test_to 0.1025ms 76.9930μs 12.9882 KOps/s 12.9204 KOps/s $\color{#35bf28}+0.52\%$
test_to_nonblocking 89.5620μs 60.9912μs 16.3958 KOps/s 15.2256 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_unbind_speed 0.3069ms 0.2729ms 3.6646 KOps/s 3.7104 KOps/s $\color{#d91a1a}-1.23\%$
test_unbind_speed_stack0 0.3040ms 0.2699ms 3.7049 KOps/s 3.7117 KOps/s $\color{#d91a1a}-0.18\%$
test_unbind_speed_stack1 75.2053ms 0.8093ms 1.2357 KOps/s 1.2283 KOps/s $\color{#35bf28}+0.60\%$
test_split 74.7313ms 1.8115ms 552.0395 Ops/s 585.3615 Ops/s $\textbf{\color{#d91a1a}-5.69\%}$
test_chunk 1.7376ms 1.6812ms 594.7974 Ops/s 632.4595 Ops/s $\textbf{\color{#d91a1a}-5.95\%}$
test_creation[device0] 0.1104ms 59.4827μs 16.8116 KOps/s 17.2986 KOps/s $\color{#d91a1a}-2.82\%$
test_creation_from_tensor 0.1300ms 55.5165μs 18.0127 KOps/s 18.4992 KOps/s $\color{#d91a1a}-2.63\%$
test_add_one[memmap_tensor0] 98.2820μs 7.3204μs 136.6039 KOps/s 140.8750 KOps/s $\color{#d91a1a}-3.03\%$
test_contiguous[memmap_tensor0] 24.5810μs 0.6566μs 1.5230 MOps/s 1.4953 MOps/s $\color{#35bf28}+1.85\%$
test_stack[memmap_tensor0] 46.6310μs 5.6625μs 176.5991 KOps/s 203.6556 KOps/s $\textbf{\color{#d91a1a}-13.29\%}$
test_memmaptd_index 1.1673ms 0.3113ms 3.2121 KOps/s 3.4602 KOps/s $\textbf{\color{#d91a1a}-7.17\%}$
test_memmaptd_index_astensor 0.6912ms 0.3839ms 2.6049 KOps/s 2.5530 KOps/s $\color{#35bf28}+2.03\%$
test_memmaptd_index_op 0.9606ms 0.7017ms 1.4252 KOps/s 1.5305 KOps/s $\textbf{\color{#d91a1a}-6.88\%}$
test_serialize_model 0.1061s 0.1023s 9.7716 Ops/s 8.6348 Ops/s $\textbf{\color{#35bf28}+13.17\%}$
test_serialize_model_pickle 1.3546s 1.2358s 0.8092 Ops/s 0.8082 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_weights 0.1768s 0.1082s 9.2422 Ops/s 8.7898 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_serialize_weights_returnearly 0.3029s 0.1079s 9.2688 Ops/s 10.0837 Ops/s $\textbf{\color{#d91a1a}-8.08\%}$
test_serialize_weights_pickle 1.3557s 1.2480s 0.8013 Ops/s 0.8010 Ops/s $\color{#35bf28}+0.03\%$
test_reshape_pytree 49.2110μs 27.6701μs 36.1401 KOps/s 38.6931 KOps/s $\textbf{\color{#d91a1a}-6.60\%}$
test_reshape_td 61.0210μs 33.3350μs 29.9985 KOps/s 31.7889 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_view_pytree 44.0310μs 27.3235μs 36.5985 KOps/s 38.6034 KOps/s $\textbf{\color{#d91a1a}-5.19\%}$
test_view_td 61.9310μs 38.3145μs 26.0998 KOps/s 27.3027 KOps/s $\color{#d91a1a}-4.41\%$
test_unbind_pytree 49.9510μs 33.0302μs 30.2753 KOps/s 31.2149 KOps/s $\color{#d91a1a}-3.01\%$
test_unbind_td 0.4433ms 42.1747μs 23.7109 KOps/s 22.9616 KOps/s $\color{#35bf28}+3.26\%$
test_split_pytree 61.2320μs 36.2210μs 27.6083 KOps/s 29.3036 KOps/s $\textbf{\color{#d91a1a}-5.79\%}$
test_split_td 0.1127ms 43.0764μs 23.2146 KOps/s 25.5578 KOps/s $\textbf{\color{#d91a1a}-9.17\%}$
test_add_pytree 70.3820μs 39.5575μs 25.2797 KOps/s 26.1169 KOps/s $\color{#d91a1a}-3.21\%$
test_add_td 78.8920μs 53.8155μs 18.5820 KOps/s 19.0353 KOps/s $\color{#d91a1a}-2.38\%$
test_distributed 1.8195ms 85.9707μs 11.6319 KOps/s 13.6805 KOps/s $\textbf{\color{#d91a1a}-14.97\%}$
test_tdmodule 44.0910μs 15.0152μs 66.5994 KOps/s 68.6932 KOps/s $\color{#d91a1a}-3.05\%$
test_tdmodule_dispatch 49.4510μs 29.5622μs 33.8270 KOps/s 36.3974 KOps/s $\textbf{\color{#d91a1a}-7.06\%}$
test_tdseq 32.3710μs 16.6628μs 60.0140 KOps/s 61.7432 KOps/s $\color{#d91a1a}-2.80\%$
test_tdseq_dispatch 53.4910μs 32.1934μs 31.0623 KOps/s 32.1156 KOps/s $\color{#d91a1a}-3.28\%$
test_instantiation_functorch 1.5034ms 1.4418ms 693.5560 Ops/s 712.8085 Ops/s $\color{#d91a1a}-2.70\%$
test_instantiation_td 1.5004ms 1.0004ms 999.5900 Ops/s 1.0254 KOps/s $\color{#d91a1a}-2.51\%$
test_exec_functorch 0.1753ms 0.1515ms 6.5988 KOps/s 6.8129 KOps/s $\color{#d91a1a}-3.14\%$
test_exec_functional_call 0.1813ms 0.1446ms 6.9137 KOps/s 6.9902 KOps/s $\color{#d91a1a}-1.09\%$
test_exec_td 0.1834ms 0.1436ms 6.9646 KOps/s 7.0066 KOps/s $\color{#d91a1a}-0.60\%$
test_exec_td_decorator 0.5355ms 0.2140ms 4.6720 KOps/s 4.6785 KOps/s $\color{#d91a1a}-0.14\%$
test_vmap_mlp_speed[True-True] 0.6374ms 0.5824ms 1.7170 KOps/s 1.6704 KOps/s $\color{#35bf28}+2.79\%$
test_vmap_mlp_speed[True-False] 0.6613ms 0.5803ms 1.7232 KOps/s 1.6559 KOps/s $\color{#35bf28}+4.07\%$
test_vmap_mlp_speed[False-True] 0.5693ms 0.5097ms 1.9619 KOps/s 1.8690 KOps/s $\color{#35bf28}+4.97\%$
test_vmap_mlp_speed[False-False] 0.5569ms 0.5098ms 1.9614 KOps/s 1.8936 KOps/s $\color{#35bf28}+3.58\%$
test_vmap_mlp_speed_decorator[True-True] 1.0885ms 0.6450ms 1.5503 KOps/s 1.4691 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_vmap_mlp_speed_decorator[True-False] 0.7662ms 0.6436ms 1.5538 KOps/s 1.5620 KOps/s $\color{#d91a1a}-0.52\%$
test_vmap_mlp_speed_decorator[False-True] 0.7182ms 0.5695ms 1.7558 KOps/s 1.7588 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed_decorator[False-False] 0.7267ms 0.5665ms 1.7652 KOps/s 1.7594 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_transformer_speed[True-True] 7.8220ms 7.6796ms 130.2149 Ops/s 130.4374 Ops/s $\color{#d91a1a}-0.17\%$
test_vmap_transformer_speed[True-False] 8.1165ms 7.7405ms 129.1902 Ops/s 131.0419 Ops/s $\color{#d91a1a}-1.41\%$
test_vmap_transformer_speed[False-True] 7.7088ms 7.6076ms 131.4467 Ops/s 126.2359 Ops/s $\color{#35bf28}+4.13\%$
test_vmap_transformer_speed[False-False] 7.7216ms 7.5961ms 131.6459 Ops/s 127.6240 Ops/s $\color{#35bf28}+3.15\%$
test_vmap_transformer_speed_decorator[True-True] 18.7468ms 18.6074ms 53.7421 Ops/s 52.3667 Ops/s $\color{#35bf28}+2.63\%$
test_vmap_transformer_speed_decorator[True-False] 18.8321ms 18.6696ms 53.5630 Ops/s 52.5522 Ops/s $\color{#35bf28}+1.92\%$
test_vmap_transformer_speed_decorator[False-True] 19.4536ms 18.5460ms 53.9198 Ops/s 52.4877 Ops/s $\color{#35bf28}+2.73\%$
test_vmap_transformer_speed_decorator[False-False] 20.1042ms 18.7696ms 53.2776 Ops/s 53.0530 Ops/s $\color{#35bf28}+0.42\%$
test_to_module_speed[True] 1.7542ms 1.4968ms 668.0759 Ops/s 654.8441 Ops/s $\color{#35bf28}+2.02\%$
test_to_module_speed[False] 1.6729ms 1.4815ms 674.9688 Ops/s 660.7338 Ops/s $\color{#35bf28}+2.15\%$
test_tc_init 48.4120μs 24.6867μs 40.5077 KOps/s 43.7161 KOps/s $\textbf{\color{#d91a1a}-7.34\%}$
test_tc_init_nested 83.1820μs 45.8276μs 21.8209 KOps/s 21.0027 KOps/s $\color{#35bf28}+3.90\%$
test_tc_first_layer_tensor 5.1586μs 0.3565μs 2.8053 MOps/s 2.7839 MOps/s $\color{#35bf28}+0.77\%$
test_tc_first_layer_nontensor 15.3003μs 0.3836μs 2.6069 MOps/s 2.5991 MOps/s $\color{#35bf28}+0.30\%$
test_tc_second_layer_tensor 22.3800μs 1.0637μs 940.1427 KOps/s 942.9432 KOps/s $\color{#d91a1a}-0.30\%$
test_tc_second_layer_nontensor 34.0325μs 0.8144μs 1.2279 MOps/s 1.2014 MOps/s $\color{#35bf28}+2.21\%$
test_unbind 0.1052s 7.9783ms 125.3393 Ops/s 129.8900 Ops/s $\color{#d91a1a}-3.50\%$
test_full_like 11.2983ms 11.0747ms 90.2962 Ops/s 76.9203 Ops/s $\textbf{\color{#35bf28}+17.39\%}$
test_zeros_like 8.4412ms 7.9443ms 125.8757 Ops/s 127.9243 Ops/s $\color{#d91a1a}-1.60\%$
test_ones_like 8.2371ms 7.9255ms 126.1747 Ops/s 127.8955 Ops/s $\color{#d91a1a}-1.35\%$
test_clone 9.4028ms 9.2478ms 108.1337 Ops/s 108.3867 Ops/s $\color{#d91a1a}-0.23\%$
test_squeeze 60.5520μs 11.3446μs 88.1479 KOps/s 91.6767 KOps/s $\color{#d91a1a}-3.85\%$
test_unsqueeze 97.2220μs 53.1403μs 18.8181 KOps/s 19.0047 KOps/s $\color{#d91a1a}-0.98\%$
test_split 0.1590ms 0.1013ms 9.8670 KOps/s 10.1907 KOps/s $\color{#d91a1a}-3.18\%$
test_permute 0.1604ms 0.1176ms 8.5000 KOps/s 9.0512 KOps/s $\textbf{\color{#d91a1a}-6.09\%}$
test_stack 26.9449ms 26.7960ms 37.3190 Ops/s 37.5533 Ops/s $\color{#d91a1a}-0.62\%$
test_cat 27.1108ms 26.7450ms 37.3902 Ops/s 37.6083 Ops/s $\color{#d91a1a}-0.58\%$
