
Implement some missing element wise Add/Sub/Mul/Div/Neg operations for CPU and CUDA EPs #23090

Open · wants to merge 1 commit into base: main

Conversation

@Zyrin commented Dec 12, 2024

Description

  • [CPU EP] Implement Add/Sub/Mul/Div element-wise operations for (u)int8, (u)int16, uint32, and uint64.
  • [CPU EP] Implement the Neg unary operation for int16.
  • [CUDA EP] Implement Add/Sub/Mul/Div element-wise operations for (u)int8 and (u)int16.

Motivation and Context

This solves #23051
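
For context, the operations in question are ordinary element-wise integer arithmetic over the newly supported fixed-width dtypes. A minimal numpy sketch (numpy is only an illustration here, standing in for the ORT kernels; note that Div on integer tensors is integer division):

```python
import numpy as np

# Element-wise Add/Sub/Mul/Div on uint8 tensors, one of the dtypes
# this PR adds kernels for (values chosen to avoid overflow).
a = np.array([10, 20, 30], dtype=np.uint8)
b = np.array([3, 4, 5], dtype=np.uint8)

add = a + b    # [13, 24, 35]
sub = a - b    # [7, 16, 25]
mul = a * b    # [30, 80, 150]
div = a // b   # integer division: [3, 5, 6]

# Neg is a unary op; the PR adds it for int16 on the CPU EP.
n = np.negative(np.array([1, -2, 3], dtype=np.int16))  # [-1, 2, -3]
```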

@tianleiwu (Contributor)

This will increase binary size. Are the missing types used in any real model?

@Zyrin (Author) commented Dec 12, 2024

@microsoft-github-policy-service agree company="Cellumation"

@Zyrin (Author) commented Dec 12, 2024

I do not know whether any "real" model uses these types for these operations.
I tried to use uint8 operations in my own model and found that onnxruntime did not support them, even though the ONNX documentation lists them as supported. So I went ahead and implemented all of the missing types.
The binary size grows by <0.6% for libonnxruntime.so and <0.3% for libonnxruntime_providers_cuda.so.
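
The quoted growth figures can be reproduced from the shared-library sizes before and after the change; a trivial sketch (the byte counts below are made-up placeholders, not measurements from this PR):

```python
import os

def size_growth_percent(before: int, after: int) -> float:
    """Percentage growth of a binary from `before` to `after` bytes."""
    return (after - before) / before * 100.0

# Hypothetical: a 20 MB libonnxruntime.so growing by ~100 KB.
# Real sizes would come from os.path.getsize() on the built libraries.
before = 20_000_000
after = 20_100_000
print(f"{size_growth_percent(before, after):.2f}%")  # 0.50%
```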

@xadupre (Member) commented Dec 17, 2024

This may increase the binary size. +@scottmckay

@tianleiwu (Contributor) commented Dec 17, 2024

@Zyrin, please follow https://github.com/microsoft/onnxruntime/blob/main/docs/Coding_Conventions_and_Standards.md#linting to format the code.

You will also need to update the documentation (you can find the updated documents in the artifacts of the Windows GPU Doc Gen CI Pipeline under Checks).

@tianleiwu (Contributor)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu (Contributor)

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline

@tianleiwu (Contributor)

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

Azure Pipelines successfully started running 6 pipeline(s).

Azure Pipelines successfully started running 9 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

@Zyrin force-pushed the main branch 2 times, most recently from 7fcfd3b to 92d0502 on December 18, 2024 08:58
@Zyrin (Author) commented Dec 18, 2024

I applied the linting fixes. @tianleiwu could you restart the pipelines?

@tianleiwu (Contributor)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@tianleiwu (Contributor)

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

@tianleiwu (Contributor)

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

Azure Pipelines successfully started running 7 pipeline(s).

Azure Pipelines successfully started running 8 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

@tianleiwu (Contributor)

@Zyrin, some build pipelines failed. You need to update the unit tests to run on the CUDA and CPU providers only; see the existing examples in the same test file.

You will also need to update the operator documentation (you can get it from the artifacts of the Windows GPU Doc Gen CI Pipeline).

@Zyrin (Author) commented Jan 8, 2025

@tianleiwu I assume you want me to run the tests only on the CPU and CUDA EPs, as in the following code snippet from element_wise_ops_test.cc:1837:

if (nullptr != DefaultCpuExecutionProvider()) {
  std::vector<std::unique_ptr<IExecutionProvider>> execution_providers;
  execution_providers.push_back(DefaultCpuExecutionProvider());
  test.Run(OpTester::ExpectResult::kExpectSuccess, "", {}, nullptr, &execution_providers);
}
if (nullptr != DefaultCudaExecutionProvider()) {
  std::vector<std::unique_ptr<IExecutionProvider>> execution_providers;
  execution_providers.push_back(DefaultCudaExecutionProvider());
  test.Run(OpTester::ExpectResult::kExpectSuccess, "", {}, nullptr, &execution_providers);
}

Alternatively, I could exclude the TensorRT and DNNL EPs, but I do not know whether there are EPs that are not tested here and would therefore fail for someone else.

On that note, should I change only the failing tests or all of the tests I added?

@tianleiwu (Contributor)

> @tianleiwu I assume you want me to run the tests only on the CPU and CUDA EPs, as in the following code snippet from element_wise_ops_test.cc:1837:

Right. You can follow that code snippet to fix the failing tests introduced by this change.

…(u)int8, (u)int16, uint32 and uint64 as well as Neg unary operation for int16 on CPU EP and implement Add/Sub/Mul/Div element wise operations for (u)int8 and (u)int16 on CUDA EP
@Zyrin (Author) commented Jan 9, 2025

I fixed the tests. Is there a way for me to generate the docs locally, or is the easiest way to trigger the Windows GPU Doc Gen CI Pipeline?
