[BugFix] Fix unitary ops for tensorclass #1164

Merged: 1 commit, Jan 7, 2025

vmoens (Contributor) commented Jan 7, 2025

[ghstack-poisoned]
@facebook-github-bot added the "CLA Signed" label Jan 7, 2025

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}40$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 45.9160μs 21.5347μs 46.4366 KOps/s 48.7287 KOps/s $\color{#d91a1a}-4.70\%$
test_plain_set_stack_nested 57.0570μs 21.5723μs 46.3558 KOps/s 48.6383 KOps/s $\color{#d91a1a}-4.69\%$
test_plain_set_nested_inplace 73.6680μs 23.5956μs 42.3808 KOps/s 44.9699 KOps/s $\textbf{\color{#d91a1a}-5.76\%}$
test_plain_set_stack_nested_inplace 70.0580μs 23.3657μs 42.7978 KOps/s 45.5762 KOps/s $\textbf{\color{#d91a1a}-6.10\%}$
test_items 23.6150μs 4.2643μs 234.5036 KOps/s 242.1309 KOps/s $\color{#d91a1a}-3.15\%$
test_items_nested 0.8422ms 0.4019ms 2.4881 KOps/s 2.4851 KOps/s $\color{#35bf28}+0.12\%$
test_items_nested_locked 0.5562ms 0.4017ms 2.4891 KOps/s 2.4955 KOps/s $\color{#d91a1a}-0.26\%$
test_items_nested_leaf 0.1374ms 77.7329μs 12.8646 KOps/s 12.5505 KOps/s $\color{#35bf28}+2.50\%$
test_items_stack_nested 0.5415ms 0.4040ms 2.4751 KOps/s 2.4692 KOps/s $\color{#35bf28}+0.24\%$
test_items_stack_nested_leaf 0.1439ms 80.4928μs 12.4235 KOps/s 12.6349 KOps/s $\color{#d91a1a}-1.67\%$
test_items_stack_nested_locked 0.5771ms 0.4034ms 2.4788 KOps/s 2.4689 KOps/s $\color{#35bf28}+0.40\%$
test_keys 42.4890μs 3.9302μs 254.4393 KOps/s 283.4070 KOps/s $\textbf{\color{#d91a1a}-10.22\%}$
test_keys_nested 0.2673ms 0.1645ms 6.0809 KOps/s 5.9793 KOps/s $\color{#35bf28}+1.70\%$
test_keys_nested_locked 0.7049ms 0.1728ms 5.7879 KOps/s 5.7798 KOps/s $\color{#35bf28}+0.14\%$
test_keys_nested_leaf 0.2002ms 0.1439ms 6.9486 KOps/s 6.8030 KOps/s $\color{#35bf28}+2.14\%$
test_keys_stack_nested 0.2955ms 0.1635ms 6.1167 KOps/s 5.9742 KOps/s $\color{#35bf28}+2.39\%$
test_keys_stack_nested_leaf 0.2386ms 0.1421ms 7.0358 KOps/s 6.9128 KOps/s $\color{#35bf28}+1.78\%$
test_keys_stack_nested_locked 0.3280ms 0.1689ms 5.9198 KOps/s 5.6847 KOps/s $\color{#35bf28}+4.14\%$
test_values 8.3556μs 1.0398μs 961.7021 KOps/s 956.4555 KOps/s $\color{#35bf28}+0.55\%$
test_values_nested 0.1470ms 64.3737μs 15.5343 KOps/s 15.8201 KOps/s $\color{#d91a1a}-1.81\%$
test_values_nested_locked 0.1154ms 63.7035μs 15.6977 KOps/s 15.3685 KOps/s $\color{#35bf28}+2.14\%$
test_values_nested_leaf 0.1599ms 73.0381μs 13.6915 KOps/s 13.8578 KOps/s $\color{#d91a1a}-1.20\%$
test_values_stack_nested 0.1109ms 64.9679μs 15.3922 KOps/s 15.7209 KOps/s $\color{#d91a1a}-2.09\%$
test_values_stack_nested_leaf 0.1385ms 73.0556μs 13.6882 KOps/s 13.7864 KOps/s $\color{#d91a1a}-0.71\%$
test_values_stack_nested_locked 0.1163ms 64.8984μs 15.4087 KOps/s 15.7904 KOps/s $\color{#d91a1a}-2.42\%$
test_membership 4.7160μs 0.7192μs 1.3904 MOps/s 1.3574 MOps/s $\color{#35bf28}+2.43\%$
test_membership_nested 33.2420μs 2.8777μs 347.5045 KOps/s 335.4113 KOps/s $\color{#35bf28}+3.61\%$
test_membership_nested_leaf 42.7600μs 2.9247μs 341.9186 KOps/s 346.2525 KOps/s $\color{#d91a1a}-1.25\%$
test_membership_stacked_nested 27.8220μs 2.9174μs 342.7656 KOps/s 342.9114 KOps/s $\color{#d91a1a}-0.04\%$
test_membership_stacked_nested_leaf 33.4030μs 2.9000μs 344.8218 KOps/s 340.0150 KOps/s $\color{#35bf28}+1.41\%$
test_membership_nested_last 31.5600μs 4.3187μs 231.5511 KOps/s 231.6628 KOps/s $\color{#d91a1a}-0.05\%$
test_membership_nested_leaf_last 30.8780μs 4.3681μs 228.9329 KOps/s 230.5081 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_stacked_nested_last 33.5730μs 7.0679μs 141.4845 KOps/s 194.0921 KOps/s $\textbf{\color{#d91a1a}-27.10\%}$
test_membership_stacked_nested_leaf_last 37.7710μs 7.0140μs 142.5720 KOps/s 195.4212 KOps/s $\textbf{\color{#d91a1a}-27.04\%}$
test_nested_getleaf 47.0180μs 10.7790μs 92.7729 KOps/s 93.0242 KOps/s $\color{#d91a1a}-0.27\%$
test_nested_get 42.7400μs 10.1160μs 98.8537 KOps/s 98.4610 KOps/s $\color{#35bf28}+0.40\%$
test_stacked_getleaf 41.2870μs 10.8084μs 92.5209 KOps/s 92.9894 KOps/s $\color{#d91a1a}-0.50\%$
test_stacked_get 33.5230μs 10.1399μs 98.6203 KOps/s 98.8342 KOps/s $\color{#d91a1a}-0.22\%$
test_nested_getitemleaf 50.3640μs 11.1105μs 90.0047 KOps/s 88.7372 KOps/s $\color{#35bf28}+1.43\%$
test_nested_getitem 36.6390μs 10.4825μs 95.3966 KOps/s 94.6494 KOps/s $\color{#35bf28}+0.79\%$
test_stacked_getitemleaf 36.1080μs 11.0526μs 90.4768 KOps/s 89.2396 KOps/s $\color{#35bf28}+1.39\%$
test_stacked_getitem 45.6660μs 10.4680μs 95.5296 KOps/s 96.5254 KOps/s $\color{#d91a1a}-1.03\%$
test_lock_nested 0.9208ms 0.4573ms 2.1866 KOps/s 2.2306 KOps/s $\color{#d91a1a}-1.97\%$
test_lock_stack_nested 0.6606ms 0.4273ms 2.3405 KOps/s 2.3594 KOps/s $\color{#d91a1a}-0.80\%$
test_unlock_nested 0.7811ms 0.3789ms 2.6393 KOps/s 2.6971 KOps/s $\color{#d91a1a}-2.14\%$
test_unlock_stack_nested 0.5276ms 0.3432ms 2.9136 KOps/s 2.9307 KOps/s $\color{#d91a1a}-0.58\%$
test_flatten_speed 0.3060ms 0.1059ms 9.4438 KOps/s 10.0688 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_unflatten_speed 1.2180ms 0.5331ms 1.8757 KOps/s 1.8836 KOps/s $\color{#d91a1a}-0.42\%$
test_common_ops 2.0489ms 0.8411ms 1.1889 KOps/s 1.3270 KOps/s $\textbf{\color{#d91a1a}-10.41\%}$
test_creation 19.3860μs 2.5396μs 393.7627 KOps/s 354.3792 KOps/s $\textbf{\color{#35bf28}+11.11\%}$
test_creation_empty 41.2870μs 13.0814μs 76.4442 KOps/s 90.8394 KOps/s $\textbf{\color{#d91a1a}-15.85\%}$
test_creation_nested_1 60.9240μs 16.0096μs 62.4627 KOps/s 71.4434 KOps/s $\textbf{\color{#d91a1a}-12.57\%}$
test_creation_nested_2 60.4330μs 20.3441μs 49.1543 KOps/s 54.7174 KOps/s $\textbf{\color{#d91a1a}-10.17\%}$
test_clone 68.9590μs 13.9708μs 71.5777 KOps/s 75.0323 KOps/s $\color{#d91a1a}-4.60\%$
test_getitem[int] 1.2968ms 13.0549μs 76.5998 KOps/s 77.4360 KOps/s $\color{#d91a1a}-1.08\%$
test_getitem[slice_int] 0.1400ms 25.1611μs 39.7439 KOps/s 40.9997 KOps/s $\color{#d91a1a}-3.06\%$
test_getitem[range] 0.1716ms 51.2746μs 19.5029 KOps/s 20.8608 KOps/s $\textbf{\color{#d91a1a}-6.51\%}$
test_getitem[tuple] 0.1528ms 20.5354μs 48.6965 KOps/s 49.7644 KOps/s $\color{#d91a1a}-2.15\%$
test_getitem[list] 0.2454ms 46.7496μs 21.3906 KOps/s 22.9559 KOps/s $\textbf{\color{#d91a1a}-6.82\%}$
test_setitem_dim[int] 58.0390μs 26.3207μs 37.9929 KOps/s 39.8511 KOps/s $\color{#d91a1a}-4.66\%$
test_setitem_dim[slice_int] 90.2690μs 52.1139μs 19.1888 KOps/s 19.7096 KOps/s $\color{#d91a1a}-2.64\%$
test_setitem_dim[range] 0.1330ms 74.8685μs 13.3567 KOps/s 13.7820 KOps/s $\color{#d91a1a}-3.09\%$
test_setitem_dim[tuple] 77.3650μs 41.2586μs 24.2374 KOps/s 24.9901 KOps/s $\color{#d91a1a}-3.01\%$
test_setitem 68.9490μs 22.3723μs 44.6980 KOps/s 51.1501 KOps/s $\textbf{\color{#d91a1a}-12.61\%}$
test_set 0.2297ms 21.8401μs 45.7874 KOps/s 51.9641 KOps/s $\textbf{\color{#d91a1a}-11.89\%}$
test_set_shared 1.1692ms 0.1686ms 5.9303 KOps/s 5.8341 KOps/s $\color{#35bf28}+1.65\%$
test_update 0.2100ms 25.4277μs 39.3273 KOps/s 45.4886 KOps/s $\textbf{\color{#d91a1a}-13.54\%}$
test_update_nested 0.1857ms 36.5535μs 27.3571 KOps/s 31.1857 KOps/s $\textbf{\color{#d91a1a}-12.28\%}$
test_update__nested 0.7188ms 35.5680μs 28.1152 KOps/s 29.7958 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_set_nested 0.1318ms 23.9003μs 41.8405 KOps/s 47.2281 KOps/s $\textbf{\color{#d91a1a}-11.41\%}$
test_set_nested_new 0.1547ms 28.6634μs 34.8876 KOps/s 38.4418 KOps/s $\textbf{\color{#d91a1a}-9.25\%}$
test_select 95.0180μs 44.2952μs 22.5758 KOps/s 23.7434 KOps/s $\color{#d91a1a}-4.92\%$
test_select_nested 0.1472ms 62.5114μs 15.9971 KOps/s 15.9159 KOps/s $\color{#35bf28}+0.51\%$
test_exclude_nested 0.1715ms 80.5936μs 12.4079 KOps/s 12.2701 KOps/s $\color{#35bf28}+1.12\%$
test_empty[True] 0.7411ms 0.4126ms 2.4237 KOps/s 2.4340 KOps/s $\color{#d91a1a}-0.42\%$
test_empty[False] 12.9197μs 1.3802μs 724.5389 KOps/s 713.9013 KOps/s $\color{#35bf28}+1.49\%$
test_unbind_speed 0.5727ms 0.2679ms 3.7323 KOps/s 3.6984 KOps/s $\color{#35bf28}+0.92\%$
test_unbind_speed_stack0 0.4098ms 0.2636ms 3.7932 KOps/s 3.7603 KOps/s $\color{#35bf28}+0.87\%$
test_unbind_speed_stack1 97.1881ms 0.7752ms 1.2899 KOps/s 1.3741 KOps/s $\textbf{\color{#d91a1a}-6.13\%}$
test_split 1.7245ms 1.5980ms 625.7881 Ops/s 570.7525 Ops/s $\textbf{\color{#35bf28}+9.64\%}$
test_chunk 0.1011s 1.9399ms 515.5011 Ops/s 566.9413 Ops/s $\textbf{\color{#d91a1a}-9.07\%}$
test_consolidate_njt[False-None] 8.7853ms 8.1935ms 122.0475 Ops/s 122.9360 Ops/s $\color{#d91a1a}-0.72\%$
test_creation[device0] 4.1043ms 92.3630μs 10.8268 KOps/s 11.0654 KOps/s $\color{#d91a1a}-2.16\%$
test_creation_from_tensor 0.2553ms 92.9740μs 10.7557 KOps/s 10.5423 KOps/s $\color{#35bf28}+2.02\%$
test_add_one[memmap_tensor0] 0.2091ms 5.1132μs 195.5711 KOps/s 209.3116 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_contiguous[memmap_tensor0] 19.4560μs 0.5160μs 1.9378 MOps/s 1.9454 MOps/s $\color{#d91a1a}-0.39\%$
test_stack[memmap_tensor0] 28.0930μs 3.4068μs 293.5309 KOps/s 303.7055 KOps/s $\color{#d91a1a}-3.35\%$
test_memmaptd_index 0.9609ms 0.2478ms 4.0348 KOps/s 4.1473 KOps/s $\color{#d91a1a}-2.71\%$
test_memmaptd_index_astensor 0.8328ms 0.3370ms 2.9675 KOps/s 3.0311 KOps/s $\color{#d91a1a}-2.10\%$
test_memmaptd_index_op 1.1723ms 0.6387ms 1.5657 KOps/s 1.7584 KOps/s $\textbf{\color{#d91a1a}-10.96\%}$
test_serialize_model 0.1247s 0.1142s 8.7548 Ops/s 8.5664 Ops/s $\color{#35bf28}+2.20\%$
test_serialize_model_pickle 0.4430s 0.3858s 2.5921 Ops/s 2.5505 Ops/s $\color{#35bf28}+1.63\%$
test_serialize_weights 0.2137s 0.1282s 7.7978 Ops/s 7.7421 Ops/s $\color{#35bf28}+0.72\%$
test_serialize_weights_returnearly 0.1733s 0.1593s 6.2769 Ops/s 6.6238 Ops/s $\textbf{\color{#d91a1a}-5.24\%}$
test_serialize_weights_pickle 0.5405s 0.4513s 2.2158 Ops/s 1.1100 Ops/s $\textbf{\color{#35bf28}+99.62\%}$
test_serialize_weights_filesystem 0.1431s 0.1393s 7.1795 Ops/s 7.1615 Ops/s $\color{#35bf28}+0.25\%$
test_serialize_model_filesystem 0.2337s 0.1557s 6.4207 Ops/s 6.4001 Ops/s $\color{#35bf28}+0.32\%$
test_reshape_pytree 60.5230μs 26.3098μs 38.0086 KOps/s 37.8387 KOps/s $\color{#35bf28}+0.45\%$
test_reshape_td 82.3840μs 33.7398μs 29.6386 KOps/s 30.4706 KOps/s $\color{#d91a1a}-2.73\%$
test_view_pytree 82.0640μs 26.4710μs 37.7772 KOps/s 37.6683 KOps/s $\color{#35bf28}+0.29\%$
test_view_td 82.0440μs 39.3637μs 25.4041 KOps/s 25.9240 KOps/s $\color{#d91a1a}-2.01\%$
test_unbind_pytree 83.3960μs 29.9709μs 33.3657 KOps/s 33.8430 KOps/s $\color{#d91a1a}-1.41\%$
test_unbind_td 0.3551ms 39.4115μs 25.3733 KOps/s 25.3580 KOps/s $\color{#35bf28}+0.06\%$
test_split_pytree 75.6910μs 29.1812μs 34.2687 KOps/s 34.3841 KOps/s $\color{#d91a1a}-0.34\%$
test_split_td 0.5176ms 45.3547μs 22.0484 KOps/s 21.9783 KOps/s $\color{#35bf28}+0.32\%$
test_add_pytree 79.8390μs 35.7160μs 27.9987 KOps/s 28.8904 KOps/s $\color{#d91a1a}-3.09\%$
test_add_td 0.1300ms 62.2248μs 16.0708 KOps/s 17.8986 KOps/s $\textbf{\color{#d91a1a}-10.21\%}$
test_compile_add_one_nested[tensordict-compile] 0.1207ms 61.7820μs 16.1859 KOps/s 16.7744 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_add_one_nested[tensordict-eager] 0.3621ms 0.1707ms 5.8580 KOps/s 5.8768 KOps/s $\color{#d91a1a}-0.32\%$
test_compile_add_one_nested[pytree-compile] 0.1025ms 45.3846μs 22.0339 KOps/s 22.5176 KOps/s $\color{#d91a1a}-2.15\%$
test_compile_add_one_nested[pytree-eager] 0.4079ms 0.1201ms 8.3291 KOps/s 8.4607 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_copy_nested[tensordict-compile] 66.3540μs 25.7245μs 38.8734 KOps/s 39.7519 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_copy_nested[tensordict-eager] 0.1098ms 58.6877μs 17.0393 KOps/s 17.3810 KOps/s $\color{#d91a1a}-1.97\%$
test_compile_copy_nested[pytree-compile] 0.1651ms 78.0993μs 12.8042 KOps/s 12.6305 KOps/s $\color{#35bf28}+1.38\%$
test_compile_copy_nested[pytree-eager] 0.2620ms 67.2934μs 14.8603 KOps/s 14.5013 KOps/s $\color{#35bf28}+2.48\%$
test_compile_add_one_flat[tensordict-compile] 0.1823ms 0.1040ms 9.6143 KOps/s 9.6180 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_add_one_flat[tensordict-eager] 0.4827ms 0.2165ms 4.6182 KOps/s 4.6168 KOps/s $\color{#35bf28}+0.03\%$
test_compile_add_one_flat[tensorclass-compile] 97.7740μs 44.6002μs 22.4214 KOps/s 23.3097 KOps/s $\color{#d91a1a}-3.81\%$
test_compile_add_one_flat[tensorclass-eager] 0.5078ms 66.1122μs 15.1258 KOps/s 15.7837 KOps/s $\color{#d91a1a}-4.17\%$
test_compile_add_one_flat[pytree-compile] 0.2197ms 0.1032ms 9.6907 KOps/s 9.8218 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_add_one_flat[pytree-eager] 0.6620ms 0.2004ms 4.9909 KOps/s 5.0155 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_self_flat[tensordict-eager] 0.4371ms 0.2352ms 4.2509 KOps/s 4.2811 KOps/s $\color{#d91a1a}-0.71\%$
test_compile_add_self_flat[tensordict-compile] 0.4004ms 0.1111ms 9.0031 KOps/s 9.6507 KOps/s $\textbf{\color{#d91a1a}-6.71\%}$
test_compile_add_self_flat[tensorclass-eager] 0.2694ms 61.7303μs 16.1995 KOps/s 17.4790 KOps/s $\textbf{\color{#d91a1a}-7.32\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1699ms 45.5328μs 21.9622 KOps/s 22.7918 KOps/s $\color{#d91a1a}-3.64\%$
test_compile_add_self_flat[pytree-eager] 0.6018ms 0.1566ms 6.3864 KOps/s 6.2409 KOps/s $\color{#35bf28}+2.33\%$
test_compile_add_self_flat[pytree-compile] 0.4189ms 0.1032ms 9.6859 KOps/s 9.6991 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_copy_flat[tensordict-compile] 56.7070μs 21.1680μs 47.2411 KOps/s 46.2371 KOps/s $\color{#35bf28}+2.17\%$
test_compile_copy_flat[tensordict-eager] 0.1505ms 66.7893μs 14.9725 KOps/s 15.1008 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_copy_flat[pytree-compile] 0.1595ms 78.5067μs 12.7378 KOps/s 12.6608 KOps/s $\color{#35bf28}+0.61\%$
test_compile_copy_flat[pytree-eager] 0.1405ms 66.5822μs 15.0190 KOps/s 14.6611 KOps/s $\color{#35bf28}+2.44\%$
test_compile_assign_and_add[tensordict-compile] 0.3063ms 0.2043ms 4.8942 KOps/s 4.8532 KOps/s $\color{#35bf28}+0.85\%$
test_compile_assign_and_add[tensordict-eager] 1.5127ms 1.3095ms 763.6507 Ops/s 747.6470 Ops/s $\color{#35bf28}+2.14\%$
test_compile_assign_and_add[pytree-compile] 0.2843ms 0.2023ms 4.9429 KOps/s 4.9467 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_assign_and_add[pytree-eager] 1.3826ms 0.7722ms 1.2951 KOps/s 1.3008 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_assign_and_add_stack[compile] 0.8197ms 0.4654ms 2.1489 KOps/s 2.1825 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_assign_and_add_stack[eager] 2.9731ms 2.8395ms 352.1694 Ops/s 380.1681 Ops/s $\textbf{\color{#d91a1a}-7.36\%}$
test_compile_indexing[tensor-tensordict-compile] 93.2450μs 35.9194μs 27.8401 KOps/s 29.1301 KOps/s $\color{#d91a1a}-4.43\%$
test_compile_indexing[tensor-tensordict-eager] 0.7802ms 34.6507μs 28.8594 KOps/s 30.5642 KOps/s $\textbf{\color{#d91a1a}-5.58\%}$
test_compile_indexing[tensor-tensorclass-compile] 74.4990μs 28.9757μs 34.5116 KOps/s 35.0006 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_indexing[tensor-tensorclass-eager] 64.5910μs 23.3219μs 42.8782 KOps/s 43.0911 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_indexing[tensor-pytree-compile] 92.2630μs 30.2134μs 33.0979 KOps/s 33.7431 KOps/s $\color{#d91a1a}-1.91\%$
test_compile_indexing[tensor-pytree-eager] 91.4110μs 22.8565μs 43.7512 KOps/s 43.9227 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_indexing[slice-tensordict-compile] 0.1136ms 50.7104μs 19.7198 KOps/s 19.5846 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[slice-tensordict-eager] 0.3884ms 20.2524μs 49.3769 KOps/s 49.2554 KOps/s $\color{#35bf28}+0.25\%$
test_compile_indexing[slice-tensorclass-compile] 96.8210μs 43.8131μs 22.8242 KOps/s 22.7377 KOps/s $\color{#35bf28}+0.38\%$
test_compile_indexing[slice-tensorclass-eager] 64.4810μs 18.9792μs 52.6891 KOps/s 53.4690 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_indexing[slice-pytree-compile] 0.1333ms 44.8176μs 22.3127 KOps/s 22.3032 KOps/s $\color{#35bf28}+0.04\%$
test_compile_indexing[slice-pytree-eager] 81.3220μs 18.4043μs 54.3350 KOps/s 52.7068 KOps/s $\color{#35bf28}+3.09\%$
test_compile_indexing[int-tensordict-compile] 0.1420ms 52.4638μs 19.0608 KOps/s 19.1199 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_indexing[int-tensordict-eager] 1.1865ms 20.5354μs 48.6963 KOps/s 48.9697 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[int-tensorclass-compile] 93.8760μs 44.7127μs 22.3650 KOps/s 22.0618 KOps/s $\color{#35bf28}+1.37\%$
test_compile_indexing[int-tensorclass-eager] 84.9800μs 18.5362μs 53.9484 KOps/s 53.9186 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[int-pytree-compile] 0.1211ms 44.9389μs 22.2525 KOps/s 22.4236 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_indexing[int-pytree-eager] 78.8080μs 19.9540μs 50.1153 KOps/s 54.0156 KOps/s $\textbf{\color{#d91a1a}-7.22\%}$
test_mod_add[eager] 99.7470μs 36.2041μs 27.6212 KOps/s 29.7191 KOps/s $\textbf{\color{#d91a1a}-7.06\%}$
test_mod_add[compile] 0.1239ms 47.0391μs 21.2589 KOps/s 21.4165 KOps/s $\color{#d91a1a}-0.74\%$
test_mod_add[compile-overhead] 0.1165ms 46.6396μs 21.4410 KOps/s 21.3739 KOps/s $\color{#35bf28}+0.31\%$
test_mod_wrap[eager] 0.4957ms 0.2217ms 4.5105 KOps/s 4.5306 KOps/s $\color{#d91a1a}-0.44\%$
test_mod_wrap[compile] 0.3779ms 0.2040ms 4.9025 KOps/s 4.7010 KOps/s $\color{#35bf28}+4.29\%$
test_mod_wrap[compile-overhead] 0.3794ms 0.2041ms 4.8998 KOps/s 4.8180 KOps/s $\color{#35bf28}+1.70\%$
test_mod_wrap_and_backward[eager] 19.7789ms 13.4508ms 74.3450 Ops/s 91.2975 Ops/s $\textbf{\color{#d91a1a}-18.57\%}$
test_mod_wrap_and_backward[compile] 17.9690ms 13.7586ms 72.6816 Ops/s 91.3368 Ops/s $\textbf{\color{#d91a1a}-20.42\%}$
test_mod_wrap_and_backward[compile-overhead] 16.1809ms 12.7986ms 78.1336 Ops/s 90.0115 Ops/s $\textbf{\color{#d91a1a}-13.20\%}$
test_seq_add[eager] 0.2535ms 0.1190ms 8.4064 KOps/s 9.0148 KOps/s $\textbf{\color{#d91a1a}-6.75\%}$
test_seq_add[compile] 0.1436ms 62.9323μs 15.8901 KOps/s 16.6019 KOps/s $\color{#d91a1a}-4.29\%$
test_seq_add[compile-overhead] 0.1323ms 60.3165μs 16.5792 KOps/s 17.0709 KOps/s $\color{#d91a1a}-2.88\%$
test_seq_wrap[eager] 0.7574ms 0.4453ms 2.2455 KOps/s 2.2775 KOps/s $\color{#d91a1a}-1.40\%$
test_seq_wrap[compile] 0.4276ms 0.2288ms 4.3702 KOps/s 4.3542 KOps/s $\color{#35bf28}+0.37\%$
test_seq_wrap[compile-overhead] 0.3624ms 0.2263ms 4.4199 KOps/s 4.3979 KOps/s $\color{#35bf28}+0.50\%$
test_func_call_runtime[False-eager] 0.9382ms 0.5495ms 1.8197 KOps/s 1.8877 KOps/s $\color{#d91a1a}-3.60\%$
test_func_call_runtime[False-compile] 0.8049ms 0.4349ms 2.2993 KOps/s 2.3602 KOps/s $\color{#d91a1a}-2.58\%$
test_func_call_runtime[False-compile-overhead] 0.8050ms 0.4362ms 2.2924 KOps/s 2.3540 KOps/s $\color{#d91a1a}-2.62\%$
test_func_call_runtime[True-eager] 0.9598ms 0.7739ms 1.2921 KOps/s 1.3224 KOps/s $\color{#d91a1a}-2.29\%$
test_func_call_runtime[True-compile] 0.8744ms 0.4769ms 2.0971 KOps/s 2.1481 KOps/s $\color{#d91a1a}-2.38\%$
test_func_call_runtime[True-compile-overhead] 0.5757ms 0.4726ms 2.1161 KOps/s 2.1426 KOps/s $\color{#d91a1a}-1.24\%$
test_func_call_cm_runtime[False-eager] 0.9843ms 0.5693ms 1.7566 KOps/s 1.8798 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_func_call_cm_runtime[False-compile] 0.7758ms 0.4301ms 2.3248 KOps/s 2.3226 KOps/s $\color{#35bf28}+0.10\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5814ms 0.4287ms 2.3329 KOps/s 2.3626 KOps/s $\color{#d91a1a}-1.26\%$
test_func_call_cm_runtime[True-eager] 1.3737ms 0.9173ms 1.0901 KOps/s 1.1275 KOps/s $\color{#d91a1a}-3.31\%$
test_func_call_cm_runtime[True-compile] 0.6033ms 0.4950ms 2.0200 KOps/s 2.0337 KOps/s $\color{#d91a1a}-0.67\%$
test_func_call_cm_runtime[True-compile-overhead] 0.7711ms 0.4961ms 2.0158 KOps/s 2.0465 KOps/s $\color{#d91a1a}-1.50\%$
test_vmap_func_call_cm_runtime[eager] 2.4427ms 1.8987ms 526.6645 Ops/s 516.3873 Ops/s $\color{#35bf28}+1.99\%$
test_vmap_func_call_cm_runtime[compile] 1.0666ms 0.5196ms 1.9246 KOps/s 1.9060 KOps/s $\color{#35bf28}+0.97\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7428ms 0.5161ms 1.9375 KOps/s 1.9095 KOps/s $\color{#35bf28}+1.46\%$
test_distributed 0.2587ms 0.1255ms 7.9692 KOps/s 7.8066 KOps/s $\color{#35bf28}+2.08\%$
test_tdmodule 55.4840μs 26.3921μs 37.8901 KOps/s 39.3800 KOps/s $\color{#d91a1a}-3.78\%$
test_tdmodule_dispatch 75.8820μs 48.1950μs 20.7491 KOps/s 19.7307 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_tdseq 47.0590μs 29.7543μs 33.6086 KOps/s 34.8220 KOps/s $\color{#d91a1a}-3.48\%$
test_tdseq_dispatch 84.2180μs 54.9245μs 18.2068 KOps/s 19.0674 KOps/s $\color{#d91a1a}-4.51\%$
test_instantiation_functorch 2.8232ms 1.5573ms 642.1433 Ops/s 646.1144 Ops/s $\color{#d91a1a}-0.61\%$
test_exec_functorch 0.3311ms 0.1813ms 5.5154 KOps/s 5.5769 KOps/s $\color{#d91a1a}-1.10\%$
test_exec_functional_call 0.4311ms 0.1753ms 5.7037 KOps/s 5.8307 KOps/s $\color{#d91a1a}-2.18\%$
test_exec_td_decorator 0.4446ms 0.2341ms 4.2722 KOps/s 4.2527 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed_decorator[True-True] 0.7653ms 0.6514ms 1.5352 KOps/s 1.5174 KOps/s $\color{#35bf28}+1.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.8656ms 0.6592ms 1.5171 KOps/s 1.5239 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_mlp_speed_decorator[False-True] 0.8028ms 0.5285ms 1.8920 KOps/s 1.8436 KOps/s $\color{#35bf28}+2.63\%$
test_vmap_mlp_speed_decorator[False-False] 0.8091ms 0.5311ms 1.8828 KOps/s 1.8605 KOps/s $\color{#35bf28}+1.20\%$
test_to_module_speed[True] 1.6530ms 1.3324ms 750.5155 Ops/s 731.5214 Ops/s $\color{#35bf28}+2.60\%$
test_to_module_speed[False] 1.8099ms 1.2999ms 769.2620 Ops/s 757.6721 Ops/s $\color{#35bf28}+1.53\%$
test_tc_init 81.9640μs 49.1734μs 20.3362 KOps/s 21.3715 KOps/s $\color{#d91a1a}-4.84\%$
test_tc_init_nested 0.1911ms 96.8871μs 10.3213 KOps/s 10.6348 KOps/s $\color{#d91a1a}-2.95\%$
test_tc_first_layer_tensor 17.1220μs 1.5608μs 640.6984 KOps/s 665.0398 KOps/s $\color{#d91a1a}-3.66\%$
test_tc_first_layer_nontensor 24.4750μs 4.8530μs 206.0595 KOps/s 213.0473 KOps/s $\color{#d91a1a}-3.28\%$
test_tc_second_layer_tensor 34.5140μs 2.9004μs 344.7837 KOps/s 349.1528 KOps/s $\color{#d91a1a}-1.25\%$
test_tc_second_layer_nontensor 26.5700μs 6.2665μs 159.5797 KOps/s 165.6032 KOps/s $\color{#d91a1a}-3.64\%$
test_unbind 0.2159s 13.2058ms 75.7244 Ops/s 77.8041 Ops/s $\color{#d91a1a}-2.67\%$
test_full_like 19.2066ms 10.9520ms 91.3076 Ops/s 132.1902 Ops/s $\textbf{\color{#d91a1a}-30.93\%}$
test_zeros_like 12.3861ms 7.2682ms 137.5851 Ops/s 315.3309 Ops/s $\textbf{\color{#d91a1a}-56.37\%}$
test_ones_like 12.0888ms 7.2139ms 138.6220 Ops/s 162.3271 Ops/s $\textbf{\color{#d91a1a}-14.60\%}$
test_clone 16.1671ms 8.7794ms 113.9027 Ops/s 125.3611 Ops/s $\textbf{\color{#d91a1a}-9.14\%}$
test_squeeze 58.7910μs 12.2639μs 81.5400 KOps/s 80.7259 KOps/s $\color{#35bf28}+1.01\%$
test_unsqueeze 0.2426ms 93.0476μs 10.7472 KOps/s 11.0722 KOps/s $\color{#d91a1a}-2.94\%$
test_split 0.4347ms 0.2003ms 4.9923 KOps/s 5.1709 KOps/s $\color{#d91a1a}-3.45\%$
test_permute 0.3053ms 0.2044ms 4.8916 KOps/s 4.9106 KOps/s $\color{#d91a1a}-0.39\%$
test_stack 26.5248ms 24.2563ms 41.2264 Ops/s 39.9277 Ops/s $\color{#35bf28}+3.25\%$
test_cat 26.0377ms 23.8779ms 41.8797 Ops/s 40.2992 Ops/s $\color{#35bf28}+3.92\%$
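
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that reading. The function names are hypothetical, and the 5% cutoff for a row being highlighted and counted as improved or worsened is an assumption inferred from the rows shown above, not something the benchmark bot states.

```python
def relative_change(ops: float, head_ops: float) -> float:
    """Percent change in throughput relative to the repo HEAD baseline.

    Both values must be in the same unit (e.g. both in KOps/s).
    """
    return (ops - head_ops) / head_ops * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    # ASSUMPTION: the 5% threshold is inferred from which rows are bold
    # in the table; it is not documented in this PR.
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within threshold"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (46.4366, 48.7287),
    "test_keys": (254.4393, 283.4070),
    "test_serialize_weights_pickle": (2.2158, 1.1100),
}

for name, (ops, head_ops) in rows.items():
    change = relative_change(ops, head_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, the four highlighted CPU gains (test_creation, test_split, test_serialize_weights_pickle, test_tdmodule_dispatch) match the "Improved: 4" figure in the summary line.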

github-actions bot commented Jan 7, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}61$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 32.4810μs 11.3650μs 87.9893 KOps/s 76.6110 KOps/s $\textbf{\color{#35bf28}+14.85\%}$
test_plain_set_stack_nested 38.4210μs 11.6777μs 85.6334 KOps/s 75.0563 KOps/s $\textbf{\color{#35bf28}+14.09\%}$
test_plain_set_nested_inplace 37.7110μs 12.6588μs 78.9966 KOps/s 69.6893 KOps/s $\textbf{\color{#35bf28}+13.36\%}$
test_plain_set_stack_nested_inplace 36.7400μs 12.6816μs 78.8547 KOps/s 69.9451 KOps/s $\textbf{\color{#35bf28}+12.74\%}$
test_items 25.5710μs 2.8688μs 348.5822 KOps/s 343.8163 KOps/s $\color{#35bf28}+1.39\%$
test_items_nested 0.4174ms 0.3592ms 2.7841 KOps/s 2.8332 KOps/s $\color{#d91a1a}-1.73\%$
test_items_nested_locked 0.4163ms 0.3596ms 2.7810 KOps/s 2.8239 KOps/s $\color{#d91a1a}-1.52\%$
test_items_nested_leaf 90.1220μs 58.9224μs 16.9715 KOps/s 17.1641 KOps/s $\color{#d91a1a}-1.12\%$
test_items_stack_nested 0.4321ms 0.3643ms 2.7451 KOps/s 2.7892 KOps/s $\color{#d91a1a}-1.58\%$
test_items_stack_nested_leaf 0.1090ms 59.6081μs 16.7763 KOps/s 16.2109 KOps/s $\color{#35bf28}+3.49\%$
test_items_stack_nested_locked 0.4108ms 0.3642ms 2.7456 KOps/s 2.8090 KOps/s $\color{#d91a1a}-2.26\%$
test_keys 27.3210μs 3.4383μs 290.8438 KOps/s 290.3650 KOps/s $\color{#35bf28}+0.16\%$
test_keys_nested 0.1095ms 80.9174μs 12.3583 KOps/s 12.2747 KOps/s $\color{#35bf28}+0.68\%$
test_keys_nested_locked 0.7334ms 86.6380μs 11.5423 KOps/s 11.4507 KOps/s $\color{#35bf28}+0.80\%$
test_keys_nested_leaf 0.1131ms 71.6819μs 13.9505 KOps/s 13.8900 KOps/s $\color{#35bf28}+0.44\%$
test_keys_stack_nested 0.1239ms 81.5864μs 12.2569 KOps/s 12.1315 KOps/s $\color{#35bf28}+1.03\%$
test_keys_stack_nested_leaf 0.1227ms 73.1953μs 13.6621 KOps/s 13.4952 KOps/s $\color{#35bf28}+1.24\%$
test_keys_stack_nested_locked 0.1485ms 86.9836μs 11.4964 KOps/s 11.2804 KOps/s $\color{#35bf28}+1.91\%$
test_values 6.0417μs 0.8493μs 1.1774 MOps/s 1.1815 MOps/s $\color{#d91a1a}-0.35\%$
test_values_nested 67.7810μs 34.5725μs 28.9247 KOps/s 29.1175 KOps/s $\color{#d91a1a}-0.66\%$
test_values_nested_locked 64.4110μs 35.6190μs 28.0749 KOps/s 27.5802 KOps/s $\color{#35bf28}+1.79\%$
test_values_nested_leaf 80.1110μs 39.2807μs 25.4578 KOps/s 25.7134 KOps/s $\color{#d91a1a}-0.99\%$
test_values_stack_nested 82.3010μs 34.8922μs 28.6597 KOps/s 28.6792 KOps/s $\color{#d91a1a}-0.07\%$
test_values_stack_nested_leaf 80.2720μs 39.6768μs 25.2036 KOps/s 25.2472 KOps/s $\color{#d91a1a}-0.17\%$
test_values_stack_nested_locked 85.1120μs 36.2985μs 27.5493 KOps/s 27.4595 KOps/s $\color{#35bf28}+0.33\%$
test_membership 2.5381μs 0.5091μs 1.9641 MOps/s 1.9502 MOps/s $\color{#35bf28}+0.71\%$
test_membership_nested 27.5410μs 2.0736μs 482.2471 KOps/s 496.4883 KOps/s $\color{#d91a1a}-2.87\%$
test_membership_nested_leaf 19.6355μs 1.9605μs 510.0618 KOps/s 491.1853 KOps/s $\color{#35bf28}+3.84\%$
test_membership_stacked_nested 29.8910μs 2.0670μs 483.7925 KOps/s 466.5242 KOps/s $\color{#35bf28}+3.70\%$
test_membership_stacked_nested_leaf 27.0200μs 2.0462μs 488.7039 KOps/s 478.7313 KOps/s $\color{#35bf28}+2.08\%$
test_membership_nested_last 31.7400μs 3.0447μs 328.4439 KOps/s 317.9668 KOps/s $\color{#35bf28}+3.30\%$
test_membership_nested_leaf_last 32.8300μs 3.0958μs 323.0196 KOps/s 315.7166 KOps/s $\color{#35bf28}+2.31\%$
test_membership_stacked_nested_last 22.3300μs 3.6266μs 275.7423 KOps/s 274.1322 KOps/s $\color{#35bf28}+0.59\%$
test_membership_stacked_nested_leaf_last 31.7210μs 3.6066μs 277.2694 KOps/s 271.3986 KOps/s $\color{#35bf28}+2.16\%$
test_nested_getleaf 37.2710μs 6.1585μs 162.3778 KOps/s 163.4539 KOps/s $\color{#d91a1a}-0.66\%$
test_nested_get 52.5610μs 5.6868μs 175.8450 KOps/s 170.8357 KOps/s $\color{#35bf28}+2.93\%$
test_stacked_getleaf 88.8720μs 6.1240μs 163.2919 KOps/s 164.0851 KOps/s $\color{#d91a1a}-0.48\%$
test_stacked_get 32.1810μs 5.8771μs 170.1507 KOps/s 171.6579 KOps/s $\color{#d91a1a}-0.88\%$
test_nested_getitemleaf 28.7500μs 6.2077μs 161.0910 KOps/s 157.4166 KOps/s $\color{#35bf28}+2.33\%$
test_nested_getitem 26.5310μs 6.0075μs 166.4584 KOps/s 164.8710 KOps/s $\color{#35bf28}+0.96\%$
test_stacked_getitemleaf 36.8710μs 6.2210μs 160.7448 KOps/s 159.4680 KOps/s $\color{#35bf28}+0.80\%$
test_stacked_getitem 39.6000μs 5.9352μs 168.4876 KOps/s 164.9751 KOps/s $\color{#35bf28}+2.13\%$
test_lock_nested 0.9056ms 0.3746ms 2.6698 KOps/s 2.5553 KOps/s $\color{#35bf28}+4.48\%$
test_lock_stack_nested 0.3811ms 0.3452ms 2.8968 KOps/s 2.8492 KOps/s $\color{#35bf28}+1.67\%$
test_unlock_nested 0.7712ms 0.3159ms 3.1656 KOps/s 3.0832 KOps/s $\color{#35bf28}+2.67\%$
test_unlock_stack_nested 0.3274ms 0.2825ms 3.5393 KOps/s 3.4419 KOps/s $\color{#35bf28}+2.83\%$
test_flatten_speed 0.1112ms 75.5551μs 13.2354 KOps/s 13.3455 KOps/s $\color{#d91a1a}-0.82\%$
test_unflatten_speed 0.3828ms 0.3201ms 3.1241 KOps/s 3.0904 KOps/s $\color{#35bf28}+1.09\%$
test_common_ops 1.5448ms 0.5900ms 1.6948 KOps/s 1.5169 KOps/s $\textbf{\color{#35bf28}+11.73\%}$
test_creation 0.1125ms 1.7429μs 573.7458 KOps/s 573.1776 KOps/s $\color{#35bf28}+0.10\%$
test_creation_empty 32.3810μs 6.9719μs 143.4329 KOps/s 97.9392 KOps/s $\textbf{\color{#35bf28}+46.45\%}$
test_creation_nested_1 40.4410μs 8.5656μs 116.7463 KOps/s 84.1334 KOps/s $\textbf{\color{#35bf28}+38.76\%}$
test_creation_nested_2 45.9310μs 11.3412μs 88.1742 KOps/s 68.8724 KOps/s $\textbf{\color{#35bf28}+28.03\%}$
test_clone 98.6320μs 10.7496μs 93.0266 KOps/s 88.5847 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_getitem[int] 1.3809ms 10.8698μs 91.9984 KOps/s 88.3334 KOps/s $\color{#35bf28}+4.15\%$
test_getitem[slice_int] 0.1163ms 21.1742μs 47.2273 KOps/s 45.8847 KOps/s $\color{#35bf28}+2.93\%$
test_getitem[range] 0.1662ms 37.5908μs 26.6023 KOps/s 25.8476 KOps/s $\color{#35bf28}+2.92\%$
test_getitem[tuple] 0.1083ms 18.2409μs 54.8218 KOps/s 53.3317 KOps/s $\color{#35bf28}+2.79\%$
test_getitem[list] 0.3338ms 34.1131μs 29.3142 KOps/s 28.7900 KOps/s $\color{#35bf28}+1.82\%$
test_setitem_dim[int] 41.9510μs 19.3234μs 51.7508 KOps/s 49.7707 KOps/s $\color{#35bf28}+3.98\%$
test_setitem_dim[slice_int] 63.0010μs 39.3914μs 25.3862 KOps/s 26.3200 KOps/s $\color{#d91a1a}-3.55\%$
test_setitem_dim[range] 80.7420μs 53.9271μs 18.5435 KOps/s 18.6337 KOps/s $\color{#d91a1a}-0.48\%$
test_setitem_dim[tuple] 55.7620μs 32.8155μs 30.4734 KOps/s 30.3260 KOps/s $\color{#35bf28}+0.49\%$
test_setitem 0.1120ms 14.3498μs 69.6875 KOps/s 59.5905 KOps/s $\textbf{\color{#35bf28}+16.94\%}$
test_set 0.1052ms 13.9320μs 71.7771 KOps/s 60.2722 KOps/s $\textbf{\color{#35bf28}+19.09\%}$
test_set_shared 1.6281ms 0.1537ms 6.5061 KOps/s 6.5235 KOps/s $\color{#d91a1a}-0.27\%$
test_update 0.3200ms 16.3776μs 61.0592 KOps/s 50.0020 KOps/s $\textbf{\color{#35bf28}+22.11\%}$
test_update_nested 0.1041ms 21.4289μs 46.6659 KOps/s 37.9598 KOps/s $\textbf{\color{#35bf28}+22.94\%}$
test_update__nested 0.6004ms 25.7824μs 38.7861 KOps/s 37.7256 KOps/s $\color{#35bf28}+2.81\%$
test_set_nested 0.1052ms 15.2590μs 65.5352 KOps/s 56.7069 KOps/s $\textbf{\color{#35bf28}+15.57\%}$
test_set_nested_new 0.1094ms 17.3387μs 57.6746 KOps/s 50.0871 KOps/s $\textbf{\color{#35bf28}+15.15\%}$
test_select 0.1145ms 29.4270μs 33.9824 KOps/s 31.6969 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_select_nested 0.1428ms 43.1794μs 23.1592 KOps/s 22.9248 KOps/s $\color{#35bf28}+1.02\%$
test_exclude_nested 91.4520μs 62.8634μs 15.9075 KOps/s 15.9699 KOps/s $\color{#d91a1a}-0.39\%$
test_empty[True] 0.3667ms 0.2895ms 3.4545 KOps/s 3.4600 KOps/s $\color{#d91a1a}-0.16\%$
test_empty[False] 3.3531μs 0.8330μs 1.2005 MOps/s 1.1985 MOps/s $\color{#35bf28}+0.17\%$
test_to 88.4820μs 56.7011μs 17.6363 KOps/s 17.4906 KOps/s $\color{#35bf28}+0.83\%$
test_to_nonblocking 97.3020μs 48.3255μs 20.6930 KOps/s 20.5311 KOps/s $\color{#35bf28}+0.79\%$
test_unbind_speed 0.2638ms 0.2346ms 4.2634 KOps/s 4.1603 KOps/s $\color{#35bf28}+2.48\%$
test_unbind_speed_stack0 0.3433ms 0.2356ms 4.2438 KOps/s 4.1368 KOps/s $\color{#35bf28}+2.59\%$
test_unbind_speed_stack1 92.8019ms 0.6620ms 1.5106 KOps/s 1.4951 KOps/s $\color{#35bf28}+1.04\%$
test_split 93.4185ms 1.6050ms 623.0350 Ops/s 620.4442 Ops/s $\color{#35bf28}+0.42\%$
test_chunk 93.5184ms 1.6020ms 624.2021 Ops/s 615.4158 Ops/s $\color{#35bf28}+1.43\%$
test_consolidate[False-None] 96.2327ms 2.9254ms 341.8379 Ops/s 337.4372 Ops/s $\color{#35bf28}+1.30\%$
test_consolidate[default-None] 1.8709ms 1.7050ms 586.4959 Ops/s 575.1674 Ops/s $\color{#35bf28}+1.97\%$
test_consolidate[reduce-overhead-None] 1.8751ms 1.6962ms 589.5517 Ops/s 565.7526 Ops/s $\color{#35bf28}+4.21\%$
test_consolidate_njt[False-None] 6.7364ms 6.3987ms 156.2816 Ops/s 151.7089 Ops/s $\color{#35bf28}+3.01\%$
test_to[False-False-None] 2.1711ms 1.7739ms 563.7423 Ops/s 568.2313 Ops/s $\color{#d91a1a}-0.79\%$
test_to[True-False-None] 1.7241ms 1.3233ms 755.6777 Ops/s 740.3055 Ops/s $\color{#35bf28}+2.08\%$
test_to[within-False-None] 4.4493ms 4.0606ms 246.2703 Ops/s 240.6818 Ops/s $\color{#35bf28}+2.32\%$
test_to[True-default-None] 5.6240ms 5.2266ms 191.3295 Ops/s 186.0363 Ops/s $\color{#35bf28}+2.85\%$
test_to_njt[False-False-None] 7.3805ms 6.9233ms 144.4397 Ops/s 144.6801 Ops/s $\color{#d91a1a}-0.17\%$
test_to_njt[True-False-None] 5.7382ms 5.4026ms 185.0977 Ops/s 183.1095 Ops/s $\color{#35bf28}+1.09\%$
test_to_njt[within-False-None] 12.4462ms 12.0345ms 83.0944 Ops/s 83.8068 Ops/s $\color{#d91a1a}-0.85\%$
test_creation[device0] 0.6362ms 81.5402μs 12.2639 KOps/s 12.2686 KOps/s $\color{#d91a1a}-0.04\%$
test_creation_from_tensor 0.5068ms 84.6081μs 11.8192 KOps/s 11.7170 KOps/s $\color{#35bf28}+0.87\%$
test_add_one[memmap_tensor0] 0.3273ms 6.9208μs 144.4910 KOps/s 141.3887 KOps/s $\color{#35bf28}+2.19\%$
test_contiguous[memmap_tensor0] 2.4336μs 0.4211μs 2.3749 MOps/s 2.3851 MOps/s $\color{#d91a1a}-0.43\%$
test_stack[memmap_tensor0] 23.0000μs 4.3580μs 229.4615 KOps/s 215.2638 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_memmaptd_index 2.0759ms 0.2540ms 3.9370 KOps/s 3.8290 KOps/s $\color{#35bf28}+2.82\%$
test_memmaptd_index_astensor 0.8913ms 0.3170ms 3.1543 KOps/s 3.1137 KOps/s $\color{#35bf28}+1.31\%$
test_memmaptd_index_op 1.0150ms 0.5731ms 1.7448 KOps/s 1.5716 KOps/s $\textbf{\color{#35bf28}+11.02\%}$
test_serialize_model 0.1313s 0.1304s 7.6711 Ops/s 7.6455 Ops/s $\color{#35bf28}+0.34\%$
test_serialize_model_pickle 1.3504s 1.2139s 0.8238 Ops/s 0.8247 Ops/s $\color{#d91a1a}-0.10\%$
test_serialize_weights 0.1304s 0.1294s 7.7271 Ops/s 7.6886 Ops/s $\color{#35bf28}+0.50\%$
test_serialize_weights_returnearly 0.3369s 63.3585ms 15.7832 Ops/s 14.1798 Ops/s $\textbf{\color{#35bf28}+11.31\%}$
test_serialize_weights_pickle 1.3775s 1.2182s 0.8209 Ops/s 0.8220 Ops/s $\color{#d91a1a}-0.13\%$
test_reshape_pytree 72.1420μs 21.9548μs 45.5480 KOps/s 43.4350 KOps/s $\color{#35bf28}+4.86\%$
test_reshape_td 83.7610μs 26.4380μs 37.8244 KOps/s 34.7809 KOps/s $\textbf{\color{#35bf28}+8.75\%}$
test_view_pytree 86.9020μs 21.3936μs 46.7429 KOps/s 42.8736 KOps/s $\textbf{\color{#35bf28}+9.02\%}$
test_view_td 65.4210μs 29.6624μs 33.7127 KOps/s 29.2103 KOps/s $\textbf{\color{#35bf28}+15.41\%}$
test_unbind_pytree 59.6010μs 28.0015μs 35.7124 KOps/s 33.7267 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_unbind_td 0.7209ms 35.7980μs 27.9346 KOps/s 25.6658 KOps/s $\textbf{\color{#35bf28}+8.84\%}$
test_split_pytree 61.1410μs 29.0936μs 34.3718 KOps/s 31.0306 KOps/s $\textbf{\color{#35bf28}+10.77\%}$
test_split_td 0.9203ms 37.7995μs 26.4553 KOps/s 24.1782 KOps/s $\textbf{\color{#35bf28}+9.42\%}$
test_add_pytree 77.6420μs 34.4303μs 29.0442 KOps/s 27.3777 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_add_td 98.9720μs 47.1307μs 21.2176 KOps/s 17.5366 KOps/s $\textbf{\color{#35bf28}+20.99\%}$
test_compile_add_one_nested[tensordict-compile] 0.1827ms 0.1179ms 8.4834 KOps/s 8.0563 KOps/s $\textbf{\color{#35bf28}+5.30\%}$
test_compile_add_one_nested[tensordict-eager] 0.2632ms 0.1281ms 7.8053 KOps/s 7.5159 KOps/s $\color{#35bf28}+3.85\%$
test_compile_add_one_nested[pytree-compile] 0.1462ms 94.3539μs 10.5984 KOps/s 10.0341 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_compile_add_one_nested[pytree-eager] 0.3258ms 0.1507ms 6.6335 KOps/s 6.4517 KOps/s $\color{#35bf28}+2.82\%$
test_compile_copy_nested[tensordict-compile] 93.3120μs 22.2595μs 44.9246 KOps/s 42.5976 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_compile_copy_nested[tensordict-eager] 68.6910μs 29.3211μs 34.1052 KOps/s 33.4911 KOps/s $\color{#35bf28}+1.83\%$
test_compile_copy_nested[pytree-compile] 0.4743ms 64.7040μs 15.4550 KOps/s 15.2251 KOps/s $\color{#35bf28}+1.51\%$
test_compile_copy_nested[pytree-eager] 90.8820μs 49.1614μs 20.3412 KOps/s 20.1244 KOps/s $\color{#35bf28}+1.08\%$
test_compile_add_one_flat[tensordict-compile] 0.1893ms 0.1382ms 7.2374 KOps/s 7.0009 KOps/s $\color{#35bf28}+3.38\%$
test_compile_add_one_flat[tensordict-eager] 0.3311ms 0.2158ms 4.6339 KOps/s 4.6333 KOps/s $\color{#35bf28}+0.01\%$
test_compile_add_one_flat[tensorclass-compile] 0.1492ms 95.5356μs 10.4673 KOps/s 10.1196 KOps/s $\color{#35bf28}+3.44\%$
test_compile_add_one_flat[tensorclass-eager] 0.1676ms 54.2442μs 18.4351 KOps/s 18.2232 KOps/s $\color{#35bf28}+1.16\%$
test_compile_add_one_flat[pytree-compile] 0.1714ms 0.1314ms 7.6122 KOps/s 7.3157 KOps/s $\color{#35bf28}+4.05\%$
test_compile_add_one_flat[pytree-eager] 0.5398ms 0.4963ms 2.0148 KOps/s 1.8766 KOps/s $\textbf{\color{#35bf28}+7.36\%}$
test_compile_add_self_flat[tensordict-eager] 0.3731ms 0.2568ms 3.8945 KOps/s 3.8445 KOps/s $\color{#35bf28}+1.30\%$
test_compile_add_self_flat[tensordict-compile] 0.1798ms 0.1388ms 7.2062 KOps/s 7.0580 KOps/s $\color{#35bf28}+2.10\%$
test_compile_add_self_flat[tensorclass-eager] 0.1680ms 64.4012μs 15.5277 KOps/s 14.6257 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1366ms 96.6066μs 10.3513 KOps/s 10.2451 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_self_flat[pytree-eager] 0.4722ms 0.4164ms 2.4016 KOps/s 2.2472 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_compile_add_self_flat[pytree-compile] 0.1701ms 0.1327ms 7.5360 KOps/s 7.1332 KOps/s $\textbf{\color{#35bf28}+5.65\%}$
test_compile_copy_flat[tensordict-compile] 0.1310ms 18.1222μs 55.1808 KOps/s 56.7885 KOps/s $\color{#d91a1a}-2.83\%$
test_compile_copy_flat[tensordict-eager] 64.1920μs 31.7606μs 31.4856 KOps/s 32.0638 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_copy_flat[pytree-compile] 0.1094ms 69.8479μs 14.3168 KOps/s 14.1754 KOps/s $\color{#35bf28}+1.00\%$
test_compile_copy_flat[pytree-eager] 80.3320μs 51.7027μs 19.3413 KOps/s 19.4081 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_assign_and_add[tensordict-compile] 1.6595ms 0.3976ms 2.5154 KOps/s 2.2264 KOps/s $\textbf{\color{#35bf28}+12.98\%}$
test_compile_assign_and_add[tensordict-eager] 2.9115ms 2.6535ms 376.8631 Ops/s 359.9594 Ops/s $\color{#35bf28}+4.70\%$
test_compile_assign_and_add[pytree-compile] 1.6040ms 0.4333ms 2.3077 KOps/s 2.2004 KOps/s $\color{#35bf28}+4.88\%$
test_compile_assign_and_add[pytree-eager] 2.7937ms 2.6971ms 370.7701 Ops/s 356.0138 Ops/s $\color{#35bf28}+4.14\%$
test_compile_indexing[tensor-tensordict-compile] 0.6223ms 0.1156ms 8.6484 KOps/s 8.3225 KOps/s $\color{#35bf28}+3.92\%$
test_compile_indexing[tensor-tensordict-eager] 0.5764ms 81.4616μs 12.2757 KOps/s 11.7429 KOps/s $\color{#35bf28}+4.54\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5217ms 0.1119ms 8.9346 KOps/s 9.0681 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2003ms 70.5298μs 14.1784 KOps/s 13.8163 KOps/s $\color{#35bf28}+2.62\%$
test_compile_indexing[tensor-pytree-compile] 0.1652ms 0.1158ms 8.6365 KOps/s 8.8227 KOps/s $\color{#d91a1a}-2.11\%$
test_compile_indexing[tensor-pytree-eager] 0.1135ms 69.6358μs 14.3604 KOps/s 13.5475 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_compile_indexing[slice-tensordict-compile] 0.1594ms 0.1053ms 9.4991 KOps/s 9.8420 KOps/s $\color{#d91a1a}-3.48\%$
test_compile_indexing[slice-tensordict-eager] 0.1432ms 16.8935μs 59.1944 KOps/s 54.6191 KOps/s $\textbf{\color{#35bf28}+8.38\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1714ms 0.1011ms 9.8911 KOps/s 10.1704 KOps/s $\color{#d91a1a}-2.75\%$
test_compile_indexing[slice-tensorclass-eager] 66.3810μs 15.4616μs 64.6765 KOps/s 60.3301 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_compile_indexing[slice-pytree-compile] 0.1521ms 0.1013ms 9.8675 KOps/s 10.1058 KOps/s $\color{#d91a1a}-2.36\%$
test_compile_indexing[slice-pytree-eager] 79.7520μs 15.5561μs 64.2836 KOps/s 60.7713 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_compile_indexing[int-tensordict-compile] 0.1550ms 0.1069ms 9.3529 KOps/s 9.7444 KOps/s $\color{#d91a1a}-4.02\%$
test_compile_indexing[int-tensordict-eager] 0.5754ms 16.7895μs 59.5610 KOps/s 55.8740 KOps/s $\textbf{\color{#35bf28}+6.60\%}$
test_compile_indexing[int-tensorclass-compile] 0.1647ms 97.2282μs 10.2851 KOps/s 10.1195 KOps/s $\color{#35bf28}+1.64\%$
test_compile_indexing[int-tensorclass-eager] 0.1163ms 15.3653μs 65.0819 KOps/s 61.4494 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_compile_indexing[int-pytree-compile] 0.2238ms 0.1014ms 9.8619 KOps/s 10.1071 KOps/s $\color{#d91a1a}-2.43\%$
test_compile_indexing[int-pytree-eager] 0.4109ms 15.6861μs 63.7506 KOps/s 61.1754 KOps/s $\color{#35bf28}+4.21\%$
test_mod_add[eager] 83.9910μs 37.0493μs 26.9911 KOps/s 25.3344 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_mod_add[compile] 0.2242ms 78.1661μs 12.7933 KOps/s 12.3967 KOps/s $\color{#35bf28}+3.20\%$
test_mod_add[compile-overhead] 0.3241ms 0.1658ms 6.0331 KOps/s 5.6811 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_mod_wrap[eager] 0.3971ms 0.2529ms 3.9534 KOps/s 3.6574 KOps/s $\textbf{\color{#35bf28}+8.09\%}$
test_mod_wrap[compile] 1.1139ms 0.2839ms 3.5221 KOps/s 3.4423 KOps/s $\color{#35bf28}+2.32\%$
test_mod_wrap[compile-overhead] 6.5189ms 3.6029ms 277.5564 Ops/s 272.9629 Ops/s $\color{#35bf28}+1.68\%$
test_mod_wrap_and_backward[eager] 1.6530ms 1.3804ms 724.4530 Ops/s 671.6130 Ops/s $\textbf{\color{#35bf28}+7.87\%}$
test_mod_wrap_and_backward[compile] 1.3820ms 1.2757ms 783.8707 Ops/s 713.1238 Ops/s $\textbf{\color{#35bf28}+9.92\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3724ms 0.9285ms 1.0770 KOps/s 963.1642 Ops/s $\textbf{\color{#35bf28}+11.82\%}$
test_seq_add[eager] 0.2070ms 0.1127ms 8.8698 KOps/s 8.0184 KOps/s $\textbf{\color{#35bf28}+10.62\%}$
test_seq_add[compile] 0.1471ms 90.2877μs 11.0757 KOps/s 10.5502 KOps/s $\color{#35bf28}+4.98\%$
test_seq_add[compile-overhead] 0.1730ms 0.1287ms 7.7685 KOps/s 7.3468 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_seq_wrap[eager] 0.4913ms 0.4177ms 2.3939 KOps/s 2.1773 KOps/s $\textbf{\color{#35bf28}+9.95\%}$
test_seq_wrap[compile] 0.3707ms 0.2995ms 3.3385 KOps/s 3.2447 KOps/s $\color{#35bf28}+2.89\%$
test_seq_wrap[compile-overhead] 0.2709ms 0.2226ms 4.4916 KOps/s 4.3729 KOps/s $\color{#35bf28}+2.71\%$
test_func_call_runtime[False-eager] 0.8818ms 0.7991ms 1.2515 KOps/s 1.3166 KOps/s $\color{#d91a1a}-4.95\%$
test_func_call_runtime[False-compile] 0.8570ms 0.7512ms 1.3313 KOps/s 1.2662 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_func_call_runtime[False-compile-overhead] 0.4131ms 0.3622ms 2.7608 KOps/s 2.7226 KOps/s $\color{#35bf28}+1.41\%$
test_func_call_runtime[True-eager] 0.9862ms 0.9164ms 1.0913 KOps/s 1.0060 KOps/s $\textbf{\color{#35bf28}+8.47\%}$
test_func_call_runtime[True-compile] 0.8698ms 0.7765ms 1.2878 KOps/s 1.2714 KOps/s $\color{#35bf28}+1.29\%$
test_func_call_runtime[True-compile-overhead] 0.4420ms 0.3859ms 2.5911 KOps/s 2.5843 KOps/s $\color{#35bf28}+0.26\%$
test_func_call_cm_runtime[False-eager] 0.8417ms 0.7775ms 1.2862 KOps/s 1.2753 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_cm_runtime[False-compile] 0.8289ms 0.7483ms 1.3364 KOps/s 1.3070 KOps/s $\color{#35bf28}+2.25\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4114ms 0.3654ms 2.7369 KOps/s 2.6994 KOps/s $\color{#35bf28}+1.39\%$
test_func_call_cm_runtime[True-eager] 1.1112ms 1.0153ms 984.9118 Ops/s 978.6642 Ops/s $\color{#35bf28}+0.64\%$
test_func_call_cm_runtime[True-compile] 0.8816ms 0.7969ms 1.2549 KOps/s 1.2429 KOps/s $\color{#35bf28}+0.96\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4637ms 0.4112ms 2.4320 KOps/s 2.4019 KOps/s $\color{#35bf28}+1.25\%$
test_vmap_func_call_cm_runtime[eager] 2.6004ms 2.1072ms 474.5730 Ops/s 469.3354 Ops/s $\color{#35bf28}+1.12\%$
test_vmap_func_call_cm_runtime[compile] 0.9176ms 0.8238ms 1.2139 KOps/s 1.2058 KOps/s $\color{#35bf28}+0.67\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4872ms 0.4136ms 2.4180 KOps/s 2.3990 KOps/s $\color{#35bf28}+0.79\%$
test_distributed 2.8214ms 0.3278ms 3.0510 KOps/s 7.7865 KOps/s $\textbf{\color{#d91a1a}-60.82\%}$
test_tdmodule 0.2273ms 19.4313μs 51.4633 KOps/s 47.9775 KOps/s $\textbf{\color{#35bf28}+7.27\%}$
test_tdmodule_dispatch 68.8110μs 34.8563μs 28.6892 KOps/s 26.6671 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_tdseq 31.0110μs 19.7285μs 50.6881 KOps/s 45.7608 KOps/s $\textbf{\color{#35bf28}+10.77\%}$
test_tdseq_dispatch 58.9820μs 36.1273μs 27.6799 KOps/s 24.2722 KOps/s $\textbf{\color{#35bf28}+14.04\%}$
test_instantiation_functorch 1.6729ms 1.5358ms 651.1392 Ops/s 633.8500 Ops/s $\color{#35bf28}+2.73\%$
test_exec_functorch 0.2086ms 0.1464ms 6.8299 KOps/s 6.3499 KOps/s $\textbf{\color{#35bf28}+7.56\%}$
test_exec_functional_call 0.1867ms 0.1401ms 7.1364 KOps/s 6.7198 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_exec_td_decorator 0.3824ms 0.1880ms 5.3201 KOps/s 5.2827 KOps/s $\color{#35bf28}+0.71\%$
test_vmap_mlp_speed_decorator[True-True] 0.7773ms 0.6906ms 1.4480 KOps/s 1.4093 KOps/s $\color{#35bf28}+2.75\%$
test_vmap_mlp_speed_decorator[True-False] 0.8486ms 0.6890ms 1.4514 KOps/s 1.4026 KOps/s $\color{#35bf28}+3.48\%$
test_vmap_mlp_speed_decorator[False-True] 0.7217ms 0.6006ms 1.6649 KOps/s 1.5910 KOps/s $\color{#35bf28}+4.64\%$
test_vmap_mlp_speed_decorator[False-False] 0.7192ms 0.6022ms 1.6605 KOps/s 1.6483 KOps/s $\color{#35bf28}+0.74\%$
test_vmap_transformer_speed_decorator[True-True] 19.6092ms 19.4771ms 51.3424 Ops/s 51.3120 Ops/s $\color{#35bf28}+0.06\%$
test_vmap_transformer_speed_decorator[True-False] 20.2600ms 19.6257ms 50.9535 Ops/s 51.4737 Ops/s $\color{#d91a1a}-1.01\%$
test_vmap_transformer_speed_decorator[False-True] 19.5435ms 19.4431ms 51.4322 Ops/s 51.8253 Ops/s $\color{#d91a1a}-0.76\%$
test_vmap_transformer_speed_decorator[False-False] 19.4893ms 19.4349ms 51.4537 Ops/s 51.6833 Ops/s $\color{#d91a1a}-0.44\%$
test_to_module_speed[True] 1.1115ms 0.9711ms 1.0298 KOps/s 1.0305 KOps/s $\color{#d91a1a}-0.07\%$
test_to_module_speed[False] 1.0727ms 0.9435ms 1.0599 KOps/s 1.0448 KOps/s $\color{#35bf28}+1.45\%$
test_tc_init 68.2320μs 34.6385μs 28.8696 KOps/s 27.0585 KOps/s $\textbf{\color{#35bf28}+6.69\%}$
test_tc_init_nested 0.1141ms 70.7381μs 14.1367 KOps/s 13.3625 KOps/s $\textbf{\color{#35bf28}+5.79\%}$
test_tc_first_layer_tensor 29.7000μs 0.8187μs 1.2214 MOps/s 1.2051 MOps/s $\color{#35bf28}+1.35\%$
test_tc_first_layer_nontensor 28.4010μs 2.2884μs 436.9899 KOps/s 437.4432 KOps/s $\color{#d91a1a}-0.10\%$
test_tc_second_layer_tensor 8.2325μs 1.4146μs 706.8946 KOps/s 689.7571 KOps/s $\color{#35bf28}+2.48\%$
test_tc_second_layer_nontensor 25.8500μs 3.0245μs 330.6293 KOps/s 325.6972 KOps/s $\color{#35bf28}+1.51\%$
test_unbind 7.2682ms 6.9924ms 143.0129 Ops/s 144.6816 Ops/s $\color{#d91a1a}-1.15\%$
test_full_like 9.2198ms 9.1117ms 109.7485 Ops/s 108.5052 Ops/s $\color{#35bf28}+1.15\%$
test_zeros_like 4.8379ms 4.3228ms 231.3333 Ops/s 230.6655 Ops/s $\color{#35bf28}+0.29\%$
test_ones_like 5.3750ms 4.3241ms 231.2608 Ops/s 231.1621 Ops/s $\color{#35bf28}+0.04\%$
test_clone 6.4720ms 6.3586ms 157.2664 Ops/s 156.4724 Ops/s $\color{#35bf28}+0.51\%$
test_squeeze 82.1320μs 9.7365μs 102.7068 KOps/s 106.1526 KOps/s $\color{#d91a1a}-3.25\%$
test_unsqueeze 0.1234ms 73.1868μs 13.6637 KOps/s 14.0405 KOps/s $\color{#d91a1a}-2.68\%$
test_split 0.3311ms 0.1552ms 6.4449 KOps/s 6.2482 KOps/s $\color{#35bf28}+3.15\%$
test_permute 0.3264ms 0.1796ms 5.5681 KOps/s 5.4937 KOps/s $\color{#35bf28}+1.35\%$
test_stack 50.8467ms 50.4588ms 19.8182 Ops/s 20.0407 Ops/s $\color{#d91a1a}-1.11\%$
test_cat 50.7124ms 50.3999ms 19.8413 Ops/s 19.9783 Ops/s $\color{#d91a1a}-0.69\%$

@vmoens vmoens added the bug Something isn't working label Jan 7, 2025
@vmoens vmoens merged commit d6dc148 into gh/vmoens/44/base Jan 7, 2025
51 of 55 checks passed
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 2d117645769890b72f5856f68acbe1b48015cfbb
Pull Request resolved: #1164
@vmoens vmoens deleted the gh/vmoens/44/head branch January 7, 2025 17:54