[Refactor] Make all leaves in tensorclass part of _tensordict, except for NonTensorData #841

Merged
vmoens merged 7 commits into main from nontensordata-as-leaves
Jul 3, 2024

vmoens (Contributor) commented Jun 27, 2024

This is bc-breaking in the following ways:

  1. Non-tensor data are now proper leaves in the tensorclass:
data = MyData(X=X, y=y, z="a string", batch_size=batch_size)
assert "z" in data.keys()  # used to fail
  2. Non-tensor data are now compared when a tensorclass is compared to another tensorclass or TensorDict:
# Previously
z = "a striing!"
tensordict = TensorDict(
    {
        "X": X,
        "y": y,
    },
    batch_size=[3, 4],
)
data = MyData(X=X, y=y, z=z, batch_size=batch_size)
assert (tensordict == data).all()
# Now z needs to be part of the tensordict, as it is no longer ignored during comparison
tensordict = TensorDict(
    {
        "X": X,
        "y": y,
        "z": z,
    },
    batch_size=[3, 4],
)
  3. After a comparison, the non-tensor field of the result is no longer None:
data0 = MyData(X=X, y=y, z="a string", batch_size=batch_size)
data1 = MyData(X=X, y=y, z="another string", batch_size=batch_size)
(data0 == data1).z  # used to be None, now a TD with boolean values
  4. Setting non-tensor values in-place now raises a ValueError, not a RuntimeError.
  5. The following now works, BUT it converts any NonTensorData in data into a NonTensorStack (since values depend on their location in the batch):
data0 = MyData(X=X, y=y, z="a string", batch_size=batch_size)
data1 = MyData(X=X, y=y, z="another string", batch_size=batch_size)
data0[:2] = data1[:2]
data0.z  # used to be a string, because it was ignored by __setitem__; now a list
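
To make item 5 concrete, here is a plain-Python toy model of why a partial in-place assignment forces a single shared non-tensor value to become one value per batch element. It is only a sketch and not tensordict's implementation: the helper name `assign_non_tensor` and the one-dimensional batch are assumptions made for illustration, and no tensordict API is used.

```python
def assign_non_tensor(dest, src, index, batch_len):
    """Toy analogue of `data0[index] = data1[index]` for one non-tensor field.

    `dest` is either a single value shared by the whole batch (the
    NonTensorData-like case) or a list holding one value per batch element
    (the NonTensorStack-like case). `src` holds the value(s) to write.
    """
    # Expand a shared value so that every batch element owns its own entry.
    per_element = list(dest) if isinstance(dest, list) else [dest] * batch_len
    # Resolve the slice into concrete batch positions.
    positions = range(batch_len)[index]
    values = list(src) if isinstance(src, list) else [src] * len(positions)
    for pos, value in zip(positions, values):
        per_element[pos] = value
    return per_element


# Hypothetical batch of 3 elements: the first two are overwritten, the last is not.
z = assign_non_tensor("a string", "another string", slice(0, 2), batch_len=3)
print(z)
```

Once the first two positions hold a different string than the third, no single value can describe the whole batch, which is why the result has to be a per-element container.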

@facebook-github-bot added the CLA Signed label Jun 27, 2024
@vmoens added the Refactor and BC-breaking labels and removed the CLA Signed label Jun 27, 2024
@vmoens linked an issue that may be closed by this pull request Jun 27, 2024

vmoens (Contributor, Author) commented Jun 27, 2024

@maximilianigl
I think this will resolve #717 but the price to pay is some bc-breaking changes - hopefully for the better.

github-actions bot commented Jun 27, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}25$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 45.6360μs 16.8935μs 59.1944 KOps/s 61.1181 KOps/s $\color{#d91a1a}-3.15\%$
test_plain_set_stack_nested 83.7960μs 17.0410μs 58.6821 KOps/s 60.5688 KOps/s $\color{#d91a1a}-3.12\%$
test_plain_set_nested_inplace 49.7830μs 19.3593μs 51.6547 KOps/s 54.0189 KOps/s $\color{#d91a1a}-4.38\%$
test_plain_set_stack_nested_inplace 50.6040μs 19.4057μs 51.5312 KOps/s 54.3718 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_items 23.1830μs 2.5853μs 386.8093 KOps/s 384.1491 KOps/s $\color{#35bf28}+0.69\%$
test_items_nested 1.0495ms 0.2742ms 3.6470 KOps/s 3.6430 KOps/s $\color{#35bf28}+0.11\%$
test_items_nested_locked 0.5926ms 0.2740ms 3.6491 KOps/s 3.6403 KOps/s $\color{#35bf28}+0.24\%$
test_items_nested_leaf 0.1500ms 79.0817μs 12.6451 KOps/s 12.7574 KOps/s $\color{#d91a1a}-0.88\%$
test_items_stack_nested 0.5175ms 0.2766ms 3.6147 KOps/s 3.6195 KOps/s $\color{#d91a1a}-0.13\%$
test_items_stack_nested_leaf 0.1780ms 80.9080μs 12.3597 KOps/s 12.5693 KOps/s $\color{#d91a1a}-1.67\%$
test_items_stack_nested_locked 1.1009ms 0.2757ms 3.6265 KOps/s 3.6289 KOps/s $\color{#d91a1a}-0.07\%$
test_keys 23.8150μs 3.9916μs 250.5238 KOps/s 255.9985 KOps/s $\color{#d91a1a}-2.14\%$
test_keys_nested 0.2400ms 0.1405ms 7.1166 KOps/s 7.2041 KOps/s $\color{#d91a1a}-1.21\%$
test_keys_nested_locked 0.6964ms 0.1456ms 6.8672 KOps/s 6.8671 KOps/s $+0.00\%$
test_keys_nested_leaf 0.1940ms 0.1192ms 8.3903 KOps/s 8.4299 KOps/s $\color{#d91a1a}-0.47\%$
test_keys_stack_nested 0.2827ms 0.1410ms 7.0922 KOps/s 7.2511 KOps/s $\color{#d91a1a}-2.19\%$
test_keys_stack_nested_leaf 0.2555ms 0.1185ms 8.4390 KOps/s 8.2965 KOps/s $\color{#35bf28}+1.72\%$
test_keys_stack_nested_locked 0.2754ms 0.1436ms 6.9647 KOps/s 6.9652 KOps/s $-0.01\%$
test_values 9.4878μs 1.3622μs 734.1001 KOps/s 870.7083 KOps/s $\textbf{\color{#d91a1a}-15.69\%}$
test_values_nested 98.2430μs 50.0145μs 19.9942 KOps/s 19.3721 KOps/s $\color{#35bf28}+3.21\%$
test_values_nested_locked 0.1062ms 50.3982μs 19.8420 KOps/s 19.0337 KOps/s $\color{#35bf28}+4.25\%$
test_values_nested_leaf 88.9460μs 45.4734μs 21.9909 KOps/s 21.7282 KOps/s $\color{#35bf28}+1.21\%$
test_values_stack_nested 0.1372ms 51.9780μs 19.2389 KOps/s 19.2777 KOps/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested_leaf 0.1028ms 45.5084μs 21.9740 KOps/s 21.8951 KOps/s $\color{#35bf28}+0.36\%$
test_values_stack_nested_locked 0.1058ms 51.5804μs 19.3872 KOps/s 19.3715 KOps/s $\color{#35bf28}+0.08\%$
test_membership 30.2380μs 1.3023μs 767.8771 KOps/s 729.0072 KOps/s $\textbf{\color{#35bf28}+5.33\%}$
test_membership_nested 19.3970μs 3.4466μs 290.1395 KOps/s 292.7133 KOps/s $\color{#d91a1a}-0.88\%$
test_membership_nested_leaf 29.6750μs 3.4708μs 288.1194 KOps/s 290.5534 KOps/s $\color{#d91a1a}-0.84\%$
test_membership_stacked_nested 21.4400μs 3.4546μs 289.4730 KOps/s 293.2454 KOps/s $\color{#d91a1a}-1.29\%$
test_membership_stacked_nested_leaf 33.7730μs 3.4890μs 286.6125 KOps/s 289.9980 KOps/s $\color{#d91a1a}-1.17\%$
test_membership_nested_last 35.9970μs 4.2339μs 236.1911 KOps/s 239.0085 KOps/s $\color{#d91a1a}-1.18\%$
test_membership_nested_leaf_last 26.3100μs 4.2409μs 235.7986 KOps/s 239.4739 KOps/s $\color{#d91a1a}-1.53\%$
test_membership_stacked_nested_last 26.4790μs 4.2031μs 237.9190 KOps/s 209.1378 KOps/s $\textbf{\color{#35bf28}+13.76\%}$
test_membership_stacked_nested_leaf_last 45.8150μs 4.2218μs 236.8637 KOps/s 205.4298 KOps/s $\textbf{\color{#35bf28}+15.30\%}$
test_nested_getleaf 59.1310μs 10.6830μs 93.6070 KOps/s 92.3243 KOps/s $\color{#35bf28}+1.39\%$
test_nested_get 43.2810μs 10.1189μs 98.8249 KOps/s 99.2740 KOps/s $\color{#d91a1a}-0.45\%$
test_stacked_getleaf 95.7380μs 10.6647μs 93.7677 KOps/s 91.9675 KOps/s $\color{#35bf28}+1.96\%$
test_stacked_get 27.5220μs 9.9428μs 100.5757 KOps/s 98.0253 KOps/s $\color{#35bf28}+2.60\%$
test_nested_getitemleaf 51.5460μs 11.3099μs 88.4179 KOps/s 87.2602 KOps/s $\color{#35bf28}+1.33\%$
test_nested_getitem 44.7840μs 10.4349μs 95.8323 KOps/s 95.2395 KOps/s $\color{#35bf28}+0.62\%$
test_stacked_getitemleaf 34.0130μs 11.1670μs 89.5497 KOps/s 91.6482 KOps/s $\color{#d91a1a}-2.29\%$
test_stacked_getitem 36.8990μs 10.2357μs 97.6971 KOps/s 96.0772 KOps/s $\color{#35bf28}+1.69\%$
test_lock_nested 0.9288ms 0.3400ms 2.9415 KOps/s 2.9866 KOps/s $\color{#d91a1a}-1.51\%$
test_lock_stack_nested 0.6067ms 0.3041ms 3.2883 KOps/s 3.3463 KOps/s $\color{#d91a1a}-1.73\%$
test_unlock_nested 0.7490ms 0.3444ms 2.9038 KOps/s 2.9096 KOps/s $\color{#d91a1a}-0.20\%$
test_unlock_stack_nested 0.4931ms 0.3131ms 3.1938 KOps/s 3.2527 KOps/s $\color{#d91a1a}-1.81\%$
test_flatten_speed 0.5934ms 98.4019μs 10.1624 KOps/s 10.1982 KOps/s $\color{#d91a1a}-0.35\%$
test_unflatten_speed 0.7085ms 0.4107ms 2.4348 KOps/s 2.4728 KOps/s $\color{#d91a1a}-1.54\%$
test_common_ops 3.0883ms 0.7495ms 1.3342 KOps/s 1.4493 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_creation 75.2210μs 2.1124μs 473.3913 KOps/s 534.2178 KOps/s $\textbf{\color{#d91a1a}-11.39\%}$
test_creation_empty 33.4320μs 11.3941μs 87.7647 KOps/s 105.4938 KOps/s $\textbf{\color{#d91a1a}-16.81\%}$
test_creation_nested_1 51.8160μs 14.1561μs 70.6410 KOps/s 81.9022 KOps/s $\textbf{\color{#d91a1a}-13.75\%}$
test_creation_nested_2 42.1290μs 17.6398μs 56.6901 KOps/s 64.7164 KOps/s $\textbf{\color{#d91a1a}-12.40\%}$
test_clone 93.4440μs 12.8933μs 77.5597 KOps/s 75.8859 KOps/s $\color{#35bf28}+2.21\%$
test_getitem[int] 26.3490μs 11.4108μs 87.6364 KOps/s 89.5399 KOps/s $\color{#d91a1a}-2.13\%$
test_getitem[slice_int] 57.0060μs 22.9643μs 43.5459 KOps/s 44.4743 KOps/s $\color{#d91a1a}-2.09\%$
test_getitem[range] 80.7610μs 57.7378μs 17.3197 KOps/s 16.4877 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_getitem[tuple] 73.3760μs 19.0318μs 52.5435 KOps/s 53.1897 KOps/s $\color{#d91a1a}-1.21\%$
test_getitem[list] 0.1235ms 40.1635μs 24.8982 KOps/s 24.2951 KOps/s $\color{#35bf28}+2.48\%$
test_setitem_dim[int] 84.3970μs 33.2525μs 30.0729 KOps/s 30.2901 KOps/s $\color{#d91a1a}-0.72\%$
test_setitem_dim[slice_int] 0.1082ms 60.7947μs 16.4488 KOps/s 16.8447 KOps/s $\color{#d91a1a}-2.35\%$
test_setitem_dim[range] 0.1155ms 81.5430μs 12.2635 KOps/s 11.9985 KOps/s $\color{#35bf28}+2.21\%$
test_setitem_dim[tuple] 86.5910μs 49.9772μs 20.0091 KOps/s 20.3549 KOps/s $\color{#d91a1a}-1.70\%$
test_setitem 57.2970μs 19.8162μs 50.4637 KOps/s 52.0902 KOps/s $\color{#d91a1a}-3.12\%$
test_set 61.9960μs 19.3842μs 51.5885 KOps/s 53.5790 KOps/s $\color{#d91a1a}-3.71\%$
test_set_shared 1.5016ms 0.1477ms 6.7693 KOps/s 7.1748 KOps/s $\textbf{\color{#d91a1a}-5.65\%}$
test_update 0.1452ms 22.3203μs 44.8023 KOps/s 46.7151 KOps/s $\color{#d91a1a}-4.09\%$
test_update_nested 0.1438ms 31.4715μs 31.7748 KOps/s 33.6567 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_update__nested 96.0490μs 24.8061μs 40.3126 KOps/s 40.1979 KOps/s $\color{#35bf28}+0.29\%$
test_set_nested 70.6210μs 21.4981μs 46.5157 KOps/s 48.0622 KOps/s $\color{#d91a1a}-3.22\%$
test_set_nested_new 76.0120μs 25.4554μs 39.2845 KOps/s 40.2032 KOps/s $\color{#d91a1a}-2.29\%$
test_select 97.8020μs 40.6874μs 24.5776 KOps/s 25.3403 KOps/s $\color{#d91a1a}-3.01\%$
test_select_nested 0.1317ms 58.7094μs 17.0331 KOps/s 17.2388 KOps/s $\color{#d91a1a}-1.19\%$
test_exclude_nested 0.2217ms 0.1194ms 8.3741 KOps/s 8.5322 KOps/s $\color{#d91a1a}-1.85\%$
test_empty[True] 0.5916ms 0.4016ms 2.4898 KOps/s 2.5212 KOps/s $\color{#d91a1a}-1.25\%$
test_empty[False] 5.4160μs 1.0360μs 965.2148 KOps/s 985.9476 KOps/s $\color{#d91a1a}-2.10\%$
test_unbind_speed 0.4517ms 0.2481ms 4.0313 KOps/s 4.0531 KOps/s $\color{#d91a1a}-0.54\%$
test_unbind_speed_stack0 0.4842ms 0.2470ms 4.0493 KOps/s 4.1303 KOps/s $\color{#d91a1a}-1.96\%$
test_unbind_speed_stack1 73.5439ms 0.7277ms 1.3741 KOps/s 1.4302 KOps/s $\color{#d91a1a}-3.92\%$
test_split 73.1228ms 1.6108ms 620.8169 Ops/s 630.5732 Ops/s $\color{#d91a1a}-1.55\%$
test_chunk 73.2610ms 1.6251ms 615.3361 Ops/s 628.4813 Ops/s $\color{#d91a1a}-2.09\%$
test_creation[device0] 0.2602ms 86.1911μs 11.6021 KOps/s 11.5879 KOps/s $\color{#35bf28}+0.12\%$
test_creation_from_tensor 3.4010ms 87.8761μs 11.3797 KOps/s 11.6607 KOps/s $\color{#d91a1a}-2.41\%$
test_add_one[memmap_tensor0] 83.0850μs 5.3408μs 187.2386 KOps/s 175.8883 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_contiguous[memmap_tensor0] 18.1340μs 0.6428μs 1.5557 MOps/s 1.5389 MOps/s $\color{#35bf28}+1.10\%$
test_stack[memmap_tensor0] 29.7960μs 3.5845μs 278.9798 KOps/s 263.5372 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_memmaptd_index 1.1122ms 0.2618ms 3.8203 KOps/s 3.9488 KOps/s $\color{#d91a1a}-3.25\%$
test_memmaptd_index_astensor 0.6034ms 0.3363ms 2.9732 KOps/s 3.0585 KOps/s $\color{#d91a1a}-2.79\%$
test_memmaptd_index_op 1.0490ms 0.6288ms 1.5904 KOps/s 1.6878 KOps/s $\textbf{\color{#d91a1a}-5.77\%}$
test_serialize_model 0.1010s 97.1051ms 10.2981 Ops/s 10.2794 Ops/s $\color{#35bf28}+0.18\%$
test_serialize_model_pickle 0.4482s 0.3809s 2.6253 Ops/s 2.6311 Ops/s $\color{#d91a1a}-0.22\%$
test_serialize_weights 0.1027s 95.1982ms 10.5044 Ops/s 9.6067 Ops/s $\textbf{\color{#35bf28}+9.34\%}$
test_serialize_weights_returnearly 0.1830s 0.1276s 7.8397 Ops/s 7.8795 Ops/s $\color{#d91a1a}-0.50\%$
test_serialize_weights_pickle 0.9455s 0.5424s 1.8435 Ops/s 2.4451 Ops/s $\textbf{\color{#d91a1a}-24.60\%}$
test_serialize_weights_filesystem 98.7438ms 93.8593ms 10.6542 Ops/s 9.7707 Ops/s $\textbf{\color{#35bf28}+9.04\%}$
test_serialize_model_filesystem 0.1683s 0.1023s 9.7794 Ops/s 10.5633 Ops/s $\textbf{\color{#d91a1a}-7.42\%}$
test_reshape_pytree 70.2510μs 26.1429μs 38.2513 KOps/s 38.7818 KOps/s $\color{#d91a1a}-1.37\%$
test_reshape_td 72.1040μs 33.8406μs 29.5503 KOps/s 29.2907 KOps/s $\color{#35bf28}+0.89\%$
test_view_pytree 58.3090μs 25.5680μs 39.1114 KOps/s 39.1761 KOps/s $\color{#d91a1a}-0.17\%$
test_view_td 0.1621ms 39.8385μs 25.1013 KOps/s 25.0641 KOps/s $\color{#35bf28}+0.15\%$
test_unbind_pytree 0.1205ms 29.1407μs 34.3163 KOps/s 33.3814 KOps/s $\color{#35bf28}+2.80\%$
test_unbind_td 0.4025ms 36.3067μs 27.5432 KOps/s 27.1166 KOps/s $\color{#35bf28}+1.57\%$
test_split_pytree 86.5220μs 29.5684μs 33.8198 KOps/s 34.3330 KOps/s $\color{#d91a1a}-1.49\%$
test_split_td 0.1273ms 40.6290μs 24.6130 KOps/s 25.4215 KOps/s $\color{#d91a1a}-3.18\%$
test_add_pytree 0.1039ms 34.9826μs 28.5857 KOps/s 28.6156 KOps/s $\color{#d91a1a}-0.10\%$
test_add_td 0.1344ms 55.2859μs 18.0878 KOps/s 19.1568 KOps/s $\textbf{\color{#d91a1a}-5.58\%}$
test_distributed 0.2444ms 0.1060ms 9.4352 KOps/s 9.6962 KOps/s $\color{#d91a1a}-2.69\%$
test_tdmodule 30.3470μs 17.5056μs 57.1246 KOps/s 58.2021 KOps/s $\color{#d91a1a}-1.85\%$
test_tdmodule_dispatch 78.4670μs 35.7923μs 27.9390 KOps/s 29.6051 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_tdseq 39.3030μs 20.6284μs 48.4768 KOps/s 49.6581 KOps/s $\color{#d91a1a}-2.38\%$
test_tdseq_dispatch 67.5260μs 40.5920μs 24.6354 KOps/s 25.3471 KOps/s $\color{#d91a1a}-2.81\%$
test_instantiation_functorch 1.5917ms 1.3388ms 746.9534 Ops/s 733.5448 Ops/s $\color{#35bf28}+1.83\%$
test_instantiation_td 2.4011ms 1.0634ms 940.3430 Ops/s 985.8151 Ops/s $\color{#d91a1a}-4.61\%$
test_exec_functorch 0.2968ms 0.1632ms 6.1289 KOps/s 6.0583 KOps/s $\color{#35bf28}+1.17\%$
test_exec_functional_call 0.2988ms 0.1492ms 6.7005 KOps/s 6.6710 KOps/s $\color{#35bf28}+0.44\%$
test_exec_td 0.3347ms 0.1463ms 6.8368 KOps/s 6.9272 KOps/s $\color{#d91a1a}-1.30\%$
test_exec_td_decorator 0.8110ms 0.2217ms 4.5097 KOps/s 4.5541 KOps/s $\color{#d91a1a}-0.97\%$
test_vmap_mlp_speed[True-True] 0.8276ms 0.4963ms 2.0149 KOps/s 2.0415 KOps/s $\color{#d91a1a}-1.30\%$
test_vmap_mlp_speed[True-False] 0.6937ms 0.4841ms 2.0655 KOps/s 2.0902 KOps/s $\color{#d91a1a}-1.18\%$
test_vmap_mlp_speed[False-True] 0.6863ms 0.3934ms 2.5416 KOps/s 2.5164 KOps/s $\color{#35bf28}+1.00\%$
test_vmap_mlp_speed[False-False] 0.7647ms 0.3943ms 2.5359 KOps/s 2.5121 KOps/s $\color{#35bf28}+0.95\%$
test_vmap_mlp_speed_decorator[True-True] 1.1784ms 0.5636ms 1.7742 KOps/s 1.6579 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_vmap_mlp_speed_decorator[True-False] 1.1094ms 0.5721ms 1.7480 KOps/s 1.7938 KOps/s $\color{#d91a1a}-2.55\%$
test_vmap_mlp_speed_decorator[False-True] 0.9203ms 0.4560ms 2.1929 KOps/s 2.1667 KOps/s $\color{#35bf28}+1.21\%$
test_vmap_mlp_speed_decorator[False-False] 0.6700ms 0.4542ms 2.2018 KOps/s 2.1662 KOps/s $\color{#35bf28}+1.64\%$
test_to_module_speed[True] 2.5935ms 1.7094ms 584.9997 Ops/s 598.5670 Ops/s $\color{#d91a1a}-2.27\%$
test_to_module_speed[False] 2.4291ms 1.6626ms 601.4599 Ops/s 610.4616 Ops/s $\color{#d91a1a}-1.47\%$
test_tc_init 0.1449ms 61.2629μs 16.3231 KOps/s 37.0836 KOps/s $\textbf{\color{#d91a1a}-55.98\%}$
test_tc_init_nested 0.1936ms 0.1217ms 8.2137 KOps/s 18.5482 KOps/s $\textbf{\color{#d91a1a}-55.72\%}$
test_tc_first_layer_tensor 29.9260μs 8.0422μs 124.3438 KOps/s 1.3980 MOps/s $\textbf{\color{#d91a1a}-91.11\%}$
test_tc_first_layer_nontensor 30.6470μs 8.0614μs 124.0484 KOps/s 1.4495 MOps/s $\textbf{\color{#d91a1a}-91.44\%}$
test_tc_second_layer_tensor 21.0900μs 2.5644μs 389.9566 KOps/s 538.6275 KOps/s $\textbf{\color{#d91a1a}-27.60\%}$
test_tc_second_layer_nontensor 40.3150μs 9.1571μs 109.2044 KOps/s 602.4611 KOps/s $\textbf{\color{#d91a1a}-81.87\%}$
test_unbind 9.5144ms 9.1098ms 109.7724 Ops/s 149.4072 Ops/s $\textbf{\color{#d91a1a}-26.53\%}$
test_full_like 17.0028ms 11.7769ms 84.9117 Ops/s 92.4755 Ops/s $\textbf{\color{#d91a1a}-8.18\%}$
test_zeros_like 11.4449ms 5.8282ms 171.5792 Ops/s 171.4434 Ops/s $\color{#35bf28}+0.08\%$
test_ones_like 12.3536ms 6.4631ms 154.7233 Ops/s 159.2340 Ops/s $\color{#d91a1a}-2.83\%$
test_clone 13.4002ms 8.0000ms 125.0007 Ops/s 128.3148 Ops/s $\color{#d91a1a}-2.58\%$
test_squeeze 62.1760μs 13.7219μs 72.8761 KOps/s 70.3017 KOps/s $\color{#35bf28}+3.66\%$
test_unsqueeze 0.1997ms 98.8538μs 10.1160 KOps/s 16.6661 KOps/s $\textbf{\color{#d91a1a}-39.30\%}$
test_split 0.5246ms 0.2871ms 3.4837 KOps/s 9.1405 KOps/s $\textbf{\color{#d91a1a}-61.89\%}$
test_permute 0.3446ms 0.2244ms 4.4560 KOps/s 7.9751 KOps/s $\textbf{\color{#d91a1a}-44.13\%}$
test_stack 26.3374ms 22.4203ms 44.6025 Ops/s 46.7679 Ops/s $\color{#d91a1a}-4.63\%$
test_cat 25.4964ms 21.9432ms 45.5722 Ops/s 46.8564 Ops/s $\color{#d91a1a}-2.74\%$
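
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, after both are converted to the same unit. The sketch below checks that reading against two rows of the table above; the function names are made up for illustration, and the formula is inferred from the reported numbers rather than taken from the benchmark bot's source.

```python
# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput such as (1.3980, 'MOps/s') to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def percent_change(ops: float, ops_unit: str, head: float, head_unit: str) -> float:
    """Relative throughput change of the PR versus the repo HEAD, in percent."""
    current = to_ops_per_second(ops, ops_unit)
    baseline = to_ops_per_second(head, head_unit)
    return (current - baseline) / baseline * 100.0


# test_tc_init: both columns are in KOps/s.
tc_init = percent_change(16.3231, "KOps/s", 37.0836, "KOps/s")
# test_tc_first_layer_tensor: the two columns use different units.
first_layer = percent_change(124.3438, "KOps/s", 1.3980, "MOps/s")
print(f"{tc_init:+.2f}%  {first_layer:+.2f}%")
```

Because the table shows rounded throughputs, a recomputed value can differ from the reported one in the last digit. Note that rows such as test_tc_first_layer_tensor mix KOps/s and MOps/s, so comparing the raw numbers without the unit conversion would be misleading.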

github-actions bot commented Jun 27, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 83.5940μs 13.3420μs 74.9511 KOps/s 76.6086 KOps/s $\color{#d91a1a}-2.16\%$
test_plain_set_stack_nested 29.7910μs 13.5481μs 73.8108 KOps/s 76.3437 KOps/s $\color{#d91a1a}-3.32\%$
test_plain_set_nested_inplace 40.5720μs 14.7421μs 67.8331 KOps/s 68.6475 KOps/s $\color{#d91a1a}-1.19\%$
test_plain_set_stack_nested_inplace 38.9120μs 14.6557μs 68.2327 KOps/s 69.1319 KOps/s $\color{#d91a1a}-1.30\%$
test_items 26.5220μs 4.6395μs 215.5415 KOps/s 210.3198 KOps/s $\color{#35bf28}+2.48\%$
test_items_nested 0.5023ms 0.3403ms 2.9384 KOps/s 2.9197 KOps/s $\color{#35bf28}+0.64\%$
test_items_nested_locked 0.3632ms 0.3417ms 2.9262 KOps/s 2.8938 KOps/s $\color{#35bf28}+1.12\%$
test_items_nested_leaf 0.1129ms 83.2029μs 12.0188 KOps/s 12.0627 KOps/s $\color{#d91a1a}-0.36\%$
test_items_stack_nested 0.3701ms 0.3423ms 2.9213 KOps/s 2.8144 KOps/s $\color{#35bf28}+3.80\%$
test_items_stack_nested_leaf 0.2420ms 82.6237μs 12.1031 KOps/s 11.7513 KOps/s $\color{#35bf28}+2.99\%$
test_items_stack_nested_locked 0.5270ms 0.3426ms 2.9187 KOps/s 2.8685 KOps/s $\color{#35bf28}+1.75\%$
test_keys 15.8910μs 4.3214μs 231.4082 KOps/s 227.8323 KOps/s $\color{#35bf28}+1.57\%$
test_keys_nested 91.3550μs 69.1184μs 14.4679 KOps/s 14.3932 KOps/s $\color{#35bf28}+0.52\%$
test_keys_nested_locked 2.4809ms 74.9701μs 13.3387 KOps/s 13.2720 KOps/s $\color{#35bf28}+0.50\%$
test_keys_nested_leaf 84.4640μs 60.0927μs 16.6410 KOps/s 16.6038 KOps/s $\color{#35bf28}+0.22\%$
test_keys_stack_nested 91.2050μs 69.3658μs 14.4163 KOps/s 14.4591 KOps/s $\color{#d91a1a}-0.30\%$
test_keys_stack_nested_leaf 1.5924ms 59.2897μs 16.8663 KOps/s 16.5063 KOps/s $\color{#35bf28}+2.18\%$
test_keys_stack_nested_locked 0.2184ms 74.6189μs 13.4014 KOps/s 13.2850 KOps/s $\color{#35bf28}+0.88\%$
test_values 11.7473μs 1.8115μs 552.0181 KOps/s 552.8963 KOps/s $\color{#d91a1a}-0.16\%$
test_values_nested 59.8730μs 35.0154μs 28.5589 KOps/s 27.9751 KOps/s $\color{#35bf28}+2.09\%$
test_values_nested_locked 60.2540μs 36.7913μs 27.1803 KOps/s 26.6161 KOps/s $\color{#35bf28}+2.12\%$
test_values_nested_leaf 56.7530μs 31.1705μs 32.0816 KOps/s 31.4205 KOps/s $\color{#35bf28}+2.10\%$
test_values_stack_nested 1.6310ms 35.7403μs 27.9797 KOps/s 27.2109 KOps/s $\color{#35bf28}+2.83\%$
test_values_stack_nested_leaf 1.0727ms 31.6007μs 31.6448 KOps/s 30.6140 KOps/s $\color{#35bf28}+3.37\%$
test_values_stack_nested_locked 0.2154ms 36.6646μs 27.2743 KOps/s 26.2496 KOps/s $\color{#35bf28}+3.90\%$
test_membership 1.4336μs 0.6980μs 1.4326 MOps/s 1.3946 MOps/s $\color{#35bf28}+2.73\%$
test_membership_nested 18.3510μs 2.5514μs 391.9461 KOps/s 386.4174 KOps/s $\color{#35bf28}+1.43\%$
test_membership_nested_leaf 21.9110μs 2.5776μs 387.9527 KOps/s 388.3064 KOps/s $\color{#d91a1a}-0.09\%$
test_membership_stacked_nested 24.6610μs 2.5926μs 385.7164 KOps/s 386.4290 KOps/s $\color{#d91a1a}-0.18\%$
test_membership_stacked_nested_leaf 0.1879ms 2.5692μs 389.2270 KOps/s 387.4450 KOps/s $\color{#35bf28}+0.46\%$
test_membership_nested_last 0.1839ms 3.1086μs 321.6866 KOps/s 317.3228 KOps/s $\color{#35bf28}+1.38\%$
test_membership_nested_leaf_last 16.7210μs 3.1258μs 319.9190 KOps/s 320.8717 KOps/s $\color{#d91a1a}-0.30\%$
test_membership_stacked_nested_last 30.0920μs 3.0831μs 324.3506 KOps/s 281.9570 KOps/s $\textbf{\color{#35bf28}+15.04\%}$
test_membership_stacked_nested_leaf_last 19.2110μs 3.1306μs 319.4254 KOps/s 276.8614 KOps/s $\textbf{\color{#35bf28}+15.37\%}$
test_nested_getleaf 23.9210μs 8.3276μs 120.0821 KOps/s 119.0976 KOps/s $\color{#35bf28}+0.83\%$
test_nested_get 35.8820μs 7.8534μs 127.3335 KOps/s 127.6502 KOps/s $\color{#d91a1a}-0.25\%$
test_stacked_getleaf 31.2220μs 8.3428μs 119.8636 KOps/s 119.6140 KOps/s $\color{#35bf28}+0.21\%$
test_stacked_get 35.5120μs 7.8575μs 127.2676 KOps/s 127.4493 KOps/s $\color{#d91a1a}-0.14\%$
test_nested_getitemleaf 25.5320μs 8.5472μs 116.9980 KOps/s 116.6315 KOps/s $\color{#35bf28}+0.31\%$
test_nested_getitem 25.7020μs 8.0162μs 124.7473 KOps/s 124.8072 KOps/s $\color{#d91a1a}-0.05\%$
test_stacked_getitemleaf 24.8810μs 8.5101μs 117.5069 KOps/s 116.9474 KOps/s $\color{#35bf28}+0.48\%$
test_stacked_getitem 44.6630μs 8.0544μs 124.1561 KOps/s 125.2819 KOps/s $\color{#d91a1a}-0.90\%$
test_lock_nested 59.2302ms 0.3920ms 2.5510 KOps/s 2.4814 KOps/s $\color{#35bf28}+2.80\%$
test_lock_stack_nested 0.4201ms 0.2912ms 3.4342 KOps/s 3.3644 KOps/s $\color{#35bf28}+2.07\%$
test_unlock_nested 63.6856ms 0.4005ms 2.4967 KOps/s 2.4777 KOps/s $\color{#35bf28}+0.77\%$
test_unlock_stack_nested 0.4748ms 0.3017ms 3.3148 KOps/s 3.2951 KOps/s $\color{#35bf28}+0.60\%$
test_flatten_speed 0.3522ms 0.1017ms 9.8314 KOps/s 9.7816 KOps/s $\color{#35bf28}+0.51\%$
test_unflatten_speed 0.3518ms 0.2924ms 3.4195 KOps/s 3.4300 KOps/s $\color{#d91a1a}-0.31\%$
test_common_ops 1.0714ms 0.5902ms 1.6942 KOps/s 1.6665 KOps/s $\color{#35bf28}+1.66\%$
test_creation 18.5410μs 1.6092μs 621.4240 KOps/s 629.5337 KOps/s $\color{#d91a1a}-1.29\%$
test_creation_empty 26.2520μs 9.6525μs 103.5997 KOps/s 112.3672 KOps/s $\textbf{\color{#d91a1a}-7.80\%}$
test_creation_nested_1 33.0220μs 11.4197μs 87.5681 KOps/s 93.3699 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_creation_nested_2 50.7530μs 13.6839μs 73.0787 KOps/s 77.1406 KOps/s $\textbf{\color{#d91a1a}-5.27\%}$
test_clone 74.1250μs 11.1636μs 89.5771 KOps/s 87.8780 KOps/s $\color{#35bf28}+1.93\%$
test_getitem[int] 24.9310μs 10.5339μs 94.9317 KOps/s 93.6430 KOps/s $\color{#35bf28}+1.38\%$
test_getitem[slice_int] 0.1014ms 20.1061μs 49.7362 KOps/s 49.6674 KOps/s $\color{#35bf28}+0.14\%$
test_getitem[range] 62.1930μs 44.7506μs 22.3461 KOps/s 21.8470 KOps/s $\color{#35bf28}+2.28\%$
test_getitem[tuple] 41.5720μs 18.1193μs 55.1898 KOps/s 54.0100 KOps/s $\color{#35bf28}+2.18\%$
test_getitem[list] 0.1274ms 31.3908μs 31.8565 KOps/s 29.3418 KOps/s $\textbf{\color{#35bf28}+8.57\%}$
test_setitem_dim[int] 45.8930μs 27.0802μs 36.9273 KOps/s 32.1155 KOps/s $\textbf{\color{#35bf28}+14.98\%}$
test_setitem_dim[slice_int] 64.8940μs 47.7336μs 20.9496 KOps/s 18.9180 KOps/s $\textbf{\color{#35bf28}+10.74\%}$
test_setitem_dim[range] 94.7860μs 64.3461μs 15.5410 KOps/s 13.9895 KOps/s $\textbf{\color{#35bf28}+11.09\%}$
test_setitem_dim[tuple] 63.1540μs 41.6031μs 24.0367 KOps/s 22.0126 KOps/s $\textbf{\color{#35bf28}+9.19\%}$
test_setitem 59.3830μs 16.6870μs 59.9269 KOps/s 58.6963 KOps/s $\color{#35bf28}+2.10\%$
test_set 53.5530μs 15.9604μs 62.6550 KOps/s 59.8542 KOps/s $\color{#35bf28}+4.68\%$
test_set_shared 1.5734ms 96.0633μs 10.4098 KOps/s 10.1902 KOps/s $\color{#35bf28}+2.16\%$
test_update 85.7450μs 19.0549μs 52.4798 KOps/s 54.6935 KOps/s $\color{#d91a1a}-4.05\%$
test_update_nested 77.5040μs 24.3187μs 41.1206 KOps/s 41.5992 KOps/s $\color{#d91a1a}-1.15\%$
test_update__nested 51.7730μs 21.2699μs 47.0147 KOps/s 46.6045 KOps/s $\color{#35bf28}+0.88\%$
test_set_nested 80.1150μs 16.7806μs 59.5926 KOps/s 58.9421 KOps/s $\color{#35bf28}+1.10\%$
test_set_nested_new 94.1250μs 19.3417μs 51.7017 KOps/s 51.7346 KOps/s $\color{#d91a1a}-0.06\%$
test_select 0.1170ms 31.9984μs 31.2516 KOps/s 30.2506 KOps/s $\color{#35bf28}+3.31\%$
test_select_nested 0.8826ms 52.7662μs 18.9515 KOps/s 19.3750 KOps/s $\color{#d91a1a}-2.19\%$
test_exclude_nested 0.1846ms 0.1088ms 9.1916 KOps/s 9.2969 KOps/s $\color{#d91a1a}-1.13\%$
test_empty[True] 0.3818ms 0.3399ms 2.9418 KOps/s 2.9217 KOps/s $\color{#35bf28}+0.69\%$
test_empty[False] 2.3962μs 0.8294μs 1.2057 MOps/s 1.2340 MOps/s $\color{#d91a1a}-2.29\%$
test_to 88.5750μs 58.1307μs 17.2026 KOps/s 15.4066 KOps/s $\textbf{\color{#35bf28}+11.66\%}$
test_to_nonblocking 0.1991ms 34.7957μs 28.7392 KOps/s 28.2022 KOps/s $\color{#35bf28}+1.90\%$
test_unbind_speed 0.3910ms 0.2534ms 3.9465 KOps/s 3.8934 KOps/s $\color{#35bf28}+1.36\%$
test_unbind_speed_stack0 0.3766ms 0.2521ms 3.9666 KOps/s 3.8642 KOps/s $\color{#35bf28}+2.65\%$
test_unbind_speed_stack1 77.3432ms 0.8259ms 1.2108 KOps/s 1.2642 KOps/s $\color{#d91a1a}-4.22\%$
test_split 79.3484ms 1.6371ms 610.8465 Ops/s 607.3634 Ops/s $\color{#35bf28}+0.57\%$
test_chunk 1.5707ms 1.5151ms 660.0074 Ops/s 608.5911 Ops/s $\textbf{\color{#35bf28}+8.45\%}$
test_creation[device0] 0.2011ms 56.3537μs 17.7451 KOps/s 17.6017 KOps/s $\color{#35bf28}+0.81\%$
test_creation_from_tensor 0.1951ms 51.8667μs 19.2802 KOps/s 19.0180 KOps/s $\color{#35bf28}+1.38\%$
test_add_one[memmap_tensor0] 0.1241ms 6.7076μs 149.0843 KOps/s 149.0900 KOps/s $-0.00\%$
test_contiguous[memmap_tensor0] 12.5310μs 0.6556μs 1.5253 MOps/s 1.5612 MOps/s $\color{#d91a1a}-2.30\%$
test_stack[memmap_tensor0] 39.6220μs 4.7566μs 210.2330 KOps/s 211.3520 KOps/s $\color{#d91a1a}-0.53\%$
test_memmaptd_index 1.0964ms 0.2645ms 3.7803 KOps/s 3.6795 KOps/s $\color{#35bf28}+2.74\%$
test_memmaptd_index_astensor 0.5904ms 0.3267ms 3.0606 KOps/s 3.0081 KOps/s $\color{#35bf28}+1.75\%$
test_memmaptd_index_op 0.9201ms 0.6359ms 1.5725 KOps/s 1.5893 KOps/s $\color{#d91a1a}-1.06\%$
test_serialize_model 0.1814s 0.1006s 9.9402 Ops/s 10.0370 Ops/s $\color{#d91a1a}-0.96\%$
test_serialize_model_pickle 1.3509s 1.2361s 0.8090 Ops/s 0.8082 Ops/s $\color{#35bf28}+0.10\%$
test_serialize_weights 0.1713s 98.2388ms 10.1793 Ops/s 9.2696 Ops/s $\textbf{\color{#35bf28}+9.81\%}$
test_serialize_weights_returnearly 69.1916ms 61.9161ms 16.1509 Ops/s 11.8837 Ops/s $\textbf{\color{#35bf28}+35.91\%}$
test_serialize_weights_pickle 1.3513s 1.2370s 0.8084 Ops/s 0.8085 Ops/s $-0.00\%$
test_reshape_pytree 0.1594ms 26.2666μs 38.0711 KOps/s 37.6967 KOps/s $\color{#35bf28}+0.99\%$
test_reshape_td 0.2229ms 31.5861μs 31.6595 KOps/s 31.3393 KOps/s $\color{#35bf28}+1.02\%$
test_view_pytree 0.1659ms 26.0587μs 38.3749 KOps/s 38.4750 KOps/s $\color{#d91a1a}-0.26\%$
test_view_td 0.2593ms 37.7481μs 26.4914 KOps/s 26.9158 KOps/s $\color{#d91a1a}-1.58\%$
test_unbind_pytree 0.2503ms 32.9264μs 30.3707 KOps/s 31.0321 KOps/s $\color{#d91a1a}-2.13\%$
test_unbind_td 0.5297ms 40.4303μs 24.7339 KOps/s 25.5934 KOps/s $\color{#d91a1a}-3.36\%$
test_split_pytree 0.1403ms 34.5043μs 28.9819 KOps/s 28.3911 KOps/s $\color{#35bf28}+2.08\%$
test_split_td 0.1642ms 38.0056μs 26.3119 KOps/s 26.2696 KOps/s $\color{#35bf28}+0.16\%$
test_add_pytree 0.2624ms 37.9798μs 26.3298 KOps/s 26.1495 KOps/s $\color{#35bf28}+0.69\%$
test_add_td 0.2105ms 54.0030μs 18.5175 KOps/s 18.2552 KOps/s $\color{#35bf28}+1.44\%$
test_distributed 1.9637ms 71.9506μs 13.8984 KOps/s 14.0026 KOps/s $\color{#d91a1a}-0.74\%$
test_tdmodule 0.1472ms 15.7593μs 63.4547 KOps/s 61.7103 KOps/s $\color{#35bf28}+2.83\%$
test_tdmodule_dispatch 54.7420μs 29.9157μs 33.4273 KOps/s 32.2993 KOps/s $\color{#35bf28}+3.49\%$
test_tdseq 33.6020μs 17.1018μs 58.4732 KOps/s 59.5634 KOps/s $\color{#d91a1a}-1.83\%$
test_tdseq_dispatch 54.4930μs 33.3972μs 29.9427 KOps/s 29.5736 KOps/s $\color{#35bf28}+1.25\%$
test_instantiation_functorch 1.5573ms 1.3794ms 724.9712 Ops/s 712.8947 Ops/s $\color{#35bf28}+1.69\%$
test_instantiation_td 1.4790ms 0.9750ms 1.0256 KOps/s 1.0172 KOps/s $\color{#35bf28}+0.82\%$
test_exec_functorch 0.1875ms 0.1418ms 7.0536 KOps/s 7.0809 KOps/s $\color{#d91a1a}-0.39\%$
test_exec_functional_call 0.2585ms 0.1312ms 7.6205 KOps/s 7.7452 KOps/s $\color{#d91a1a}-1.61\%$
test_exec_td 0.3189ms 0.1273ms 7.8562 KOps/s 7.5436 KOps/s $\color{#35bf28}+4.14\%$
test_exec_td_decorator 0.4498ms 0.1991ms 5.0217 KOps/s 4.9567 KOps/s $\color{#35bf28}+1.31\%$
test_vmap_mlp_speed[True-True] 1.8309ms 0.5671ms 1.7633 KOps/s 1.7815 KOps/s $\color{#d91a1a}-1.02\%$
test_vmap_mlp_speed[True-False] 0.7423ms 0.5814ms 1.7200 KOps/s 1.7808 KOps/s $\color{#d91a1a}-3.42\%$
test_vmap_mlp_speed[False-True] 0.6656ms 0.5086ms 1.9662 KOps/s 2.0194 KOps/s $\color{#d91a1a}-2.64\%$
test_vmap_mlp_speed[False-False] 0.6876ms 0.5057ms 1.9775 KOps/s 1.9283 KOps/s $\color{#35bf28}+2.55\%$
test_vmap_mlp_speed_decorator[True-True] 0.9839ms 0.6265ms 1.5961 KOps/s 1.5829 KOps/s $\color{#35bf28}+0.84\%$
test_vmap_mlp_speed_decorator[True-False] 0.8126ms 0.6252ms 1.5996 KOps/s 1.5730 KOps/s $\color{#35bf28}+1.69\%$
test_vmap_mlp_speed_decorator[False-True] 0.6961ms 0.5457ms 1.8324 KOps/s 1.6876 KOps/s $\textbf{\color{#35bf28}+8.58\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7059ms 0.5481ms 1.8244 KOps/s 1.6945 KOps/s $\textbf{\color{#35bf28}+7.67\%}$
test_vmap_transformer_speed[True-True] 7.8344ms 7.3893ms 135.3316 Ops/s 132.2657 Ops/s $\color{#35bf28}+2.32\%$
test_vmap_transformer_speed[True-False] 7.6326ms 7.3311ms 136.4049 Ops/s 132.4202 Ops/s $\color{#35bf28}+3.01\%$
test_vmap_transformer_speed[False-True] 7.6277ms 7.2903ms 137.1695 Ops/s 134.5431 Ops/s $\color{#35bf28}+1.95\%$
test_vmap_transformer_speed[False-False] 8.0754ms 7.5302ms 132.7986 Ops/s 137.4430 Ops/s $\color{#d91a1a}-3.38\%$
test_vmap_transformer_speed_decorator[True-True] 18.2331ms 17.8807ms 55.9263 Ops/s 56.3909 Ops/s $\color{#d91a1a}-0.82\%$
test_vmap_transformer_speed_decorator[True-False] 18.1972ms 17.8339ms 56.0731 Ops/s 56.4931 Ops/s $\color{#d91a1a}-0.74\%$
test_vmap_transformer_speed_decorator[False-True] 18.6209ms 17.8002ms 56.1792 Ops/s 56.4649 Ops/s $\color{#d91a1a}-0.51\%$
test_vmap_transformer_speed_decorator[False-False] 18.2207ms 17.7779ms 56.2497 Ops/s 56.5496 Ops/s $\color{#d91a1a}-0.53\%$
test_to_module_speed[True] 1.6278ms 1.4918ms 670.3377 Ops/s 681.9274 Ops/s $\color{#d91a1a}-1.70\%$
test_to_module_speed[False] 1.5930ms 1.4763ms 677.3470 Ops/s 692.3180 Ops/s $\color{#d91a1a}-2.16\%$
test_tc_init 0.1791ms 54.0740μs 18.4932 KOps/s 38.6656 KOps/s $\textbf{\color{#d91a1a}-52.17\%}$
test_tc_init_nested 0.2319ms 0.1059ms 9.4422 KOps/s 18.2337 KOps/s $\textbf{\color{#d91a1a}-48.22\%}$
test_tc_first_layer_tensor 24.6220μs 3.6935μs 270.7471 KOps/s 2.7938 MOps/s $\textbf{\color{#d91a1a}-90.31\%}$
test_tc_first_layer_nontensor 17.9310μs 3.7064μs 269.8072 KOps/s 2.6310 MOps/s $\textbf{\color{#d91a1a}-89.75\%}$
test_tc_second_layer_tensor 5.8902μs 1.1778μs 849.0588 KOps/s 1.0264 MOps/s $\textbf{\color{#d91a1a}-17.28\%}$
test_tc_second_layer_nontensor 19.5810μs 4.2161μs 237.1879 KOps/s 1.2070 MOps/s $\textbf{\color{#d91a1a}-80.35\%}$
test_unbind 0.1126s 13.6666ms 73.1710 Ops/s 121.0664 Ops/s $\textbf{\color{#d91a1a}-39.56\%}$
test_full_like 11.4609ms 10.2053ms 97.9887 Ops/s 83.2254 Ops/s $\textbf{\color{#35bf28}+17.74\%}$
test_zeros_like 8.4690ms 8.0499ms 124.2254 Ops/s 124.7202 Ops/s $\color{#d91a1a}-0.40\%$
test_ones_like 8.8898ms 8.1320ms 122.9707 Ops/s 123.5628 Ops/s $\color{#d91a1a}-0.48\%$
test_clone 10.9343ms 10.1465ms 98.5561 Ops/s 101.9487 Ops/s $\color{#d91a1a}-3.33\%$
test_squeeze 0.1882ms 10.9110μs 91.6506 KOps/s 86.8160 KOps/s $\textbf{\color{#35bf28}+5.57\%}$
test_unsqueeze 0.2569ms 87.4934μs 11.4294 KOps/s 18.6366 KOps/s $\textbf{\color{#d91a1a}-38.67\%}$
test_split 0.1043s 3.4961ms 286.0350 Ops/s 10.0738 KOps/s $\textbf{\color{#d91a1a}-97.16\%}$
test_permute 0.3721ms 0.1984ms 5.0392 KOps/s 9.1877 KOps/s $\textbf{\color{#d91a1a}-45.15\%}$
test_stack 30.1227ms 29.2468ms 34.1918 Ops/s 34.6668 Ops/s $\color{#d91a1a}-1.37\%$
test_cat 31.3382ms 28.8500ms 34.6620 Ops/s 34.6503 Ops/s $\color{#35bf28}+0.03\%$
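
In both benchmark tables, the rows shown in bold are exactly those whose change exceeds 5% in magnitude, and their counts match the Improved and Worsened totals in the summary lines. The sketch below applies that rule to a handful of rows from the GPU table above. The 5% threshold is an inference from the data, not documented behaviour of the benchmark bot, and the names `THRESHOLD_PERCENT` and `classify` are hypothetical.

```python
from collections import Counter

# Assumed cutoff, inferred from which rows are bold in the tables.
THRESHOLD_PERCENT = 5.0


def classify(change_percent: float, threshold: float = THRESHOLD_PERCENT) -> str:
    """Label a benchmark by its throughput change relative to the repo HEAD."""
    if change_percent > threshold:
        return "improved"
    if change_percent < -threshold:
        return "worsened"
    return "unchanged"


# A few (name, change in %) pairs copied from the GPU table.
sample_rows = [
    ("test_full_like", 17.74),
    ("test_squeeze", 5.57),
    ("test_zeros_like", -0.40),
    ("test_cat", 0.03),
    ("test_creation_nested_2", -5.27),
    ("test_permute", -45.15),
]

tally = Counter(classify(change) for _, change in sample_rows)
print(dict(tally))
```

Under this rule most of the small red and green entries are treated as noise, and the regressions that count are concentrated in the tensorclass benchmarks (`test_tc_*`, `test_unbind`, `test_unsqueeze`, `test_split`, `test_permute`).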

@facebook-github-bot added the CLA Signed label Jun 27, 2024
@vmoens merged commit 5c34868 into main Jul 3, 2024
41 of 43 checks passed
@vmoens deleted the nontensordata-as-leaves branch October 21, 2024 14:02
Labels
BC-breaking, CLA Signed, Refactor
Development

Successfully merging this pull request may close these issues.

[BUG] Tensorclass.key() doesn't list non-tensor data.
2 participants