[FEA]: Expose a fast modulo division algorithm for device code #3442

Open
1 task done
fbusato opened this issue Jan 18, 2025 · 0 comments
Labels
feature request New feature or request.

Comments

Contributor

fbusato commented Jan 18, 2025

Is this a duplicate?

Area

libcu++

Is your feature request related to a problem? Please describe.

Division and modulo by runtime values are very common operations. Because of the nature of these operations and of GPU architectures, they are inefficient in device code.
On the other hand, when the divisor is known on the host, the divisor-dependent part of the computation can easily be precomputed there once. Device code can then perform division and modulo efficiently.

Describe the solution you'd like

Expose a fast_int class that supports all common integer operations and is optimized for modulo and division.
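
To make the request concrete, here is a minimal sketch of the kind of type this could be, limited to unsigned 32-bit values. The name `fast_divisor_u32` and its members are hypothetical and not an existing libcu++ API. It uses one well-known multiply-and-shift scheme: precompute `ceil(2^64 / d)` once, then take the high 64 bits of the product with the dividend. Other schemes exist, and an actual implementation may choose differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration only: not an existing libcu++ type.
// Precomputes a magic constant for a runtime divisor so that later
// divisions and modulos need only multiplies, adds and shifts.
struct fast_divisor_u32 {
  std::uint32_t divisor;
  std::uint64_t magic;  // ceil(2^64 / divisor); unused when divisor == 1

  // The expensive 64-bit division happens once, here (e.g. on the host).
  explicit fast_divisor_u32(std::uint32_t d)
      : divisor(d), magic(d > 1 ? UINT64_MAX / d + 1 : 0) {
    assert(d != 0);  // division by zero is not supported
  }

  // Quotient n / divisor: the high 64 bits of magic * n, assembled from
  // two 32x32 -> 64 multiplies so no 128-bit type is required.
  std::uint32_t div(std::uint32_t n) const {
    if (divisor == 1) return n;  // ceil(2^64 / 1) does not fit in 64 bits
    std::uint64_t hi = (magic >> 32) * n;
    std::uint64_t lo = (magic & 0xFFFFFFFFu) * n;
    return static_cast<std::uint32_t>((hi + (lo >> 32)) >> 32);
  }

  // Remainder n % divisor, derived from the quotient.
  std::uint32_t mod(std::uint32_t n) const { return n - div(n) * divisor; }
};

// Operator overloads so the type reads like ordinary integer arithmetic.
inline std::uint32_t operator/(std::uint32_t n, const fast_divisor_u32& d) {
  return d.div(n);
}
inline std::uint32_t operator%(std::uint32_t n, const fast_divisor_u32& d) {
  return d.mod(n);
}
```

The object would be constructed on the host and passed to the kernel by value, where `n / d` and `n % d` then avoid the runtime division. In device code, the high-half product could presumably be computed with the `__umul64hi` intrinsic instead of the two partial multiplies.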

The following references should be considered:

Describe alternatives you've considered

No response

Additional context

@sleeepyjack for visibility (cuCollection)

@fbusato fbusato added the feature request New feature or request. label Jan 18, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Jan 18, 2025