[Refactor] Refactor c++ binaries location #860

Merged 12 commits on Jul 9, 2024

Conversation

vmoens (Contributor) commented Jul 8, 2024

No description provided.

facebook-github-bot added the "CLA Signed" label on Jul 8, 2024

github-actions bot commented Jul 8, 2024

Result of GPU Benchmark Tests

Name Max Mean Ops
test_plain_set_nested 32.9400μs 12.4228μs 80.4970 KOps/s
test_plain_set_stack_nested 28.1200μs 12.3762μs 80.8004 KOps/s
test_plain_set_nested_inplace 30.6610μs 13.7955μs 72.4874 KOps/s
test_plain_set_stack_nested_inplace 96.9120μs 13.6496μs 73.2623 KOps/s
test_items 17.2010μs 4.7380μs 211.0580 KOps/s
test_items_nested 0.3690ms 0.3361ms 2.9749 KOps/s
test_items_nested_locked 0.4056ms 0.3377ms 2.9610 KOps/s
test_items_nested_leaf 98.7320μs 83.0426μs 12.0420 KOps/s
test_items_stack_nested 0.3882ms 0.3360ms 2.9764 KOps/s
test_items_stack_nested_leaf 0.1055ms 83.2742μs 12.0085 KOps/s
test_items_stack_nested_locked 0.3912ms 0.3392ms 2.9481 KOps/s
test_keys 17.4100μs 4.3784μs 228.3925 KOps/s
test_keys_nested 85.9320μs 68.7502μs 14.5454 KOps/s
test_keys_nested_locked 0.7589ms 73.9602μs 13.5208 KOps/s
test_keys_nested_leaf 0.2229ms 59.1708μs 16.9002 KOps/s
test_keys_stack_nested 0.2562ms 67.3715μs 14.8431 KOps/s
test_keys_stack_nested_leaf 0.2471ms 59.1823μs 16.8969 KOps/s
test_keys_stack_nested_locked 0.1521ms 74.0975μs 13.4957 KOps/s
test_values 6.5237μs 1.7965μs 556.6490 KOps/s
test_values_nested 57.8810μs 35.0736μs 28.5115 KOps/s
test_values_nested_locked 59.2410μs 36.8426μs 27.1425 KOps/s
test_values_nested_leaf 49.1100μs 31.1584μs 32.0940 KOps/s
test_values_stack_nested 53.0310μs 35.3014μs 28.3275 KOps/s
test_values_stack_nested_leaf 52.9310μs 31.4054μs 31.8416 KOps/s
test_values_stack_nested_locked 60.7910μs 37.2931μs 26.8146 KOps/s
test_membership 24.8347μs 0.7169μs 1.3950 MOps/s
test_membership_nested 15.3700μs 2.5519μs 391.8607 KOps/s
test_membership_nested_leaf 0.2018ms 2.5731μs 388.6415 KOps/s
test_membership_stacked_nested 24.3410μs 2.5978μs 384.9353 KOps/s
test_membership_stacked_nested_leaf 0.1662ms 2.6148μs 382.4445 KOps/s
test_membership_nested_last 17.8200μs 3.0834μs 324.3139 KOps/s
test_membership_nested_leaf_last 0.1911ms 3.1013μs 322.4491 KOps/s
test_membership_stacked_nested_last 25.6510μs 3.0737μs 325.3430 KOps/s
test_membership_stacked_nested_leaf_last 0.1403ms 3.0811μs 324.5598 KOps/s
test_nested_getleaf 25.1300μs 8.3889μs 119.2045 KOps/s
test_nested_get 23.3400μs 7.8592μs 127.2399 KOps/s
test_stacked_getleaf 27.0300μs 8.3511μs 119.7445 KOps/s
test_stacked_get 24.8500μs 7.8545μs 127.3155 KOps/s
test_nested_getitemleaf 30.2610μs 8.5550μs 116.8905 KOps/s
test_nested_getitem 24.1510μs 8.0200μs 124.6880 KOps/s
test_stacked_getitemleaf 26.9600μs 8.5682μs 116.7108 KOps/s
test_stacked_getitem 27.1610μs 7.9591μs 125.6428 KOps/s
test_lock_nested 60.1804ms 0.4026ms 2.4841 KOps/s
test_lock_stack_nested 0.4366ms 0.2950ms 3.3901 KOps/s
test_unlock_nested 63.2105ms 0.4008ms 2.4952 KOps/s
test_unlock_stack_nested 0.3800ms 0.3021ms 3.3106 KOps/s
test_flatten_speed 0.4934ms 0.1018ms 9.8199 KOps/s
test_unflatten_speed 0.3144ms 0.2884ms 3.4671 KOps/s
test_common_ops 1.0400ms 0.5745ms 1.7406 KOps/s
test_creation 18.4100μs 1.5834μs 631.5461 KOps/s
test_creation_empty 23.7510μs 7.7941μs 128.3028 KOps/s
test_creation_nested_1 24.4610μs 9.5926μs 104.2470 KOps/s
test_creation_nested_2 90.8620μs 11.8631μs 84.2952 KOps/s
test_clone 0.1967ms 11.7549μs 85.0707 KOps/s
test_getitem[int] 47.3710μs 10.3583μs 96.5409 KOps/s
test_getitem[slice_int] 0.1488ms 20.3136μs 49.2280 KOps/s
test_getitem[range] 67.7910μs 47.6750μs 20.9754 KOps/s
test_getitem[tuple] 45.6700μs 18.3308μs 54.5531 KOps/s
test_getitem[list] 0.2023ms 36.5513μs 27.3588 KOps/s
test_setitem_dim[int] 55.1710μs 27.7850μs 35.9907 KOps/s
test_setitem_dim[slice_int] 0.1716ms 50.4463μs 19.8230 KOps/s
test_setitem_dim[range] 0.2141ms 70.1227μs 14.2607 KOps/s
test_setitem_dim[tuple] 66.4210μs 43.8529μs 22.8035 KOps/s
test_setitem 0.1873ms 16.3312μs 61.2324 KOps/s
test_set 0.1626ms 16.4850μs 60.6612 KOps/s
test_set_shared 1.8120ms 0.1008ms 9.9237 KOps/s
test_update 65.8310μs 18.0992μs 55.2511 KOps/s
test_update_nested 0.2222ms 23.4843μs 42.5817 KOps/s
test_update__nested 0.1370ms 22.6891μs 44.0739 KOps/s
test_set_nested 60.5610μs 16.5338μs 60.4821 KOps/s
test_set_nested_new 0.2173ms 19.7630μs 50.5995 KOps/s
test_select 0.2170ms 33.9359μs 29.4673 KOps/s
test_select_nested 0.2122ms 51.2794μs 19.5010 KOps/s
test_exclude_nested 0.1701ms 0.1059ms 9.4389 KOps/s
test_empty[True] 0.4117ms 0.3459ms 2.8910 KOps/s
test_empty[False] 2.1580μs 0.7982μs 1.2528 MOps/s
test_to 93.0920μs 62.8300μs 15.9160 KOps/s
test_to_nonblocking 0.1909ms 38.5891μs 25.9141 KOps/s
test_unbind_speed 1.7243ms 0.2563ms 3.9014 KOps/s
test_unbind_speed_stack0 0.4108ms 0.2564ms 3.9007 KOps/s
test_unbind_speed_stack1 78.2706ms 0.7791ms 1.2836 KOps/s
test_split 78.9491ms 1.6360ms 611.2368 Ops/s
test_chunk 80.0125ms 1.6346ms 611.7662 Ops/s
test_creation[device0] 0.1983ms 57.9689μs 17.2506 KOps/s
test_creation_from_tensor 0.2510ms 58.4845μs 17.0985 KOps/s
test_add_one[memmap_tensor0] 77.8420μs 7.7243μs 129.4615 KOps/s
test_contiguous[memmap_tensor0] 14.6110μs 0.6769μs 1.4774 MOps/s
test_stack[memmap_tensor0] 0.2007ms 4.8975μs 204.1873 KOps/s
test_memmaptd_index 1.0050ms 0.2704ms 3.6977 KOps/s
test_memmaptd_index_astensor 0.7034ms 0.3351ms 2.9844 KOps/s
test_memmaptd_index_op 0.9197ms 0.6369ms 1.5701 KOps/s
test_serialize_model 0.1015s 96.4216ms 10.3711 Ops/s
test_serialize_model_pickle 1.6704s 1.3567s 0.7371 Ops/s
test_serialize_weights 0.1778s 0.1052s 9.5047 Ops/s
test_serialize_weights_returnearly 0.2614s 74.6163ms 13.4019 Ops/s
test_serialize_weights_pickle 1.4180s 1.2480s 0.8013 Ops/s
test_reshape_pytree 0.1561ms 26.1727μs 38.2077 KOps/s
test_reshape_td 0.2189ms 31.1906μs 32.0609 KOps/s
test_view_pytree 0.2381ms 26.0499μs 38.3878 KOps/s
test_view_td 0.1747ms 35.8734μs 27.8758 KOps/s
test_unbind_pytree 0.2373ms 31.4751μs 31.7711 KOps/s
test_unbind_td 0.3964ms 38.4140μs 26.0322 KOps/s
test_split_pytree 0.1508ms 33.7103μs 29.6645 KOps/s
test_split_td 0.4009ms 37.3172μs 26.7973 KOps/s
test_add_pytree 0.1462ms 39.5784μs 25.2663 KOps/s
test_add_td 0.2707ms 49.8967μs 20.0414 KOps/s
test_distributed 23.8753ms 96.1761μs 10.3976 KOps/s
test_tdmodule 51.0910μs 14.1408μs 70.7176 KOps/s
test_tdmodule_dispatch 0.1187ms 27.6095μs 36.2194 KOps/s
test_tdseq 31.4510μs 15.5202μs 64.4322 KOps/s
test_tdseq_dispatch 47.6010μs 30.3740μs 32.9229 KOps/s
test_instantiation_functorch 1.5523ms 1.3976ms 715.5352 Ops/s
test_instantiation_td 82.6952ms 1.0570ms 946.0741 Ops/s
test_exec_functorch 0.2592ms 0.1479ms 6.7613 KOps/s
test_exec_functional_call 0.2848ms 0.1381ms 7.2388 KOps/s
test_exec_td 0.2804ms 0.1350ms 7.4083 KOps/s
test_exec_td_decorator 0.3241ms 0.2067ms 4.8372 KOps/s
test_vmap_mlp_speed[True-True] 0.6737ms 0.6002ms 1.6661 KOps/s
test_vmap_mlp_speed[True-False] 0.7738ms 0.5752ms 1.7384 KOps/s
test_vmap_mlp_speed[False-True] 0.6798ms 0.5060ms 1.9763 KOps/s
test_vmap_mlp_speed[False-False] 0.6915ms 0.5066ms 1.9741 KOps/s
test_vmap_mlp_speed_decorator[True-True] 1.0507ms 0.6350ms 1.5747 KOps/s
test_vmap_mlp_speed_decorator[True-False] 0.9125ms 0.6365ms 1.5712 KOps/s
test_vmap_mlp_speed_decorator[False-True] 0.7359ms 0.5648ms 1.7706 KOps/s
test_vmap_mlp_speed_decorator[False-False] 0.7350ms 0.5652ms 1.7693 KOps/s
test_vmap_transformer_speed[True-True] 7.8928ms 7.6422ms 130.8526 Ops/s
test_vmap_transformer_speed[True-False] 7.7938ms 7.6158ms 131.3066 Ops/s
test_vmap_transformer_speed[False-True] 7.7514ms 7.5233ms 132.9200 Ops/s
test_vmap_transformer_speed[False-False] 7.7632ms 7.5369ms 132.6798 Ops/s
test_vmap_transformer_speed_decorator[True-True] 18.9672ms 18.5207ms 53.9936 Ops/s
test_vmap_transformer_speed_decorator[True-False] 18.6126ms 18.4498ms 54.2011 Ops/s
test_vmap_transformer_speed_decorator[False-True] 18.9429ms 18.3335ms 54.5449 Ops/s
test_vmap_transformer_speed_decorator[False-False] 18.4730ms 18.3057ms 54.6279 Ops/s
test_to_module_speed[True] 1.6097ms 1.4742ms 678.3446 Ops/s
test_to_module_speed[False] 1.5886ms 1.4513ms 689.0526 Ops/s
test_tc_init 0.1648ms 52.1732μs 19.1669 KOps/s
test_tc_init_nested 0.3043ms 0.1009ms 9.9068 KOps/s
test_tc_first_layer_tensor 17.2010μs 3.7062μs 269.8153 KOps/s
test_tc_first_layer_nontensor 18.4900μs 3.7348μs 267.7524 KOps/s
test_tc_second_layer_tensor 7.4075μs 1.1988μs 834.1517 KOps/s
test_tc_second_layer_nontensor 19.9100μs 4.3045μs 232.3150 KOps/s
test_unbind 0.1148s 14.2394ms 70.2276 Ops/s
test_full_like 14.7193ms 14.0535ms 71.1568 Ops/s
test_zeros_like 8.4960ms 8.0352ms 124.4519 Ops/s
test_ones_like 8.7433ms 8.1762ms 122.3057 Ops/s
test_clone 10.5307ms 10.0531ms 99.4715 Ops/s
test_squeeze 64.7410μs 10.3069μs 97.0224 KOps/s
test_unsqueeze 0.2122ms 83.3472μs 11.9980 KOps/s
test_split 3.3571ms 3.0757ms 325.1319 Ops/s
test_permute 0.2553ms 0.1975ms 5.0628 KOps/s
test_stack 29.7789ms 28.8299ms 34.6862 Ops/s
test_cat 31.8259ms 28.7951ms 34.7281 Ops/s
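
A note on reading the table: the Ops column appears to be the reciprocal of the Mean column (for instance, a mean of 12.4228μs sits next to 80.4970 KOps/s). The sketch below shows that conversion. It is a minimal illustration, not part of the benchmark tooling; `ops_per_second` is a hypothetical helper name, and it assumes only the three unit suffixes seen in the table (s, ms, μs).

```python
import re

# Seconds per unit, for the unit suffixes that appear in the table above.
# Both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted.
_UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6}

# A number followed by a non-digit unit suffix, e.g. "0.3361ms".
_TIME_RE = re.compile(r"^([0-9]*\.?[0-9]+)\s*(\D+)$")


def ops_per_second(mean: str) -> float:
    """Hypothetical helper: convert a Mean cell such as '0.3361ms' to ops/s."""
    match = _TIME_RE.match(mean.strip())
    if match is None or match.group(2) not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized time value: {mean!r}")
    seconds = float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    return 1.0 / seconds
```

Applied to the first row, `ops_per_second("12.4228μs")` gives roughly 80.5 thousand operations per second, in line with the reported value. Small differences in the last digits are expected because the Mean column is itself rounded.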

vmoens added the "Refactor" label on Jul 9, 2024
vmoens merged commit 612fbbc into main on Jul 9, 2024
39 of 45 checks passed
vmoens deleted the fix-imports branch on July 9, 2024 12:56
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor: Refactoring code - not a new feature