Is this a duplicate?
Area
libcu++
Is your feature request related to a problem? Please describe.
Division and modulo by runtime values are very common operations. Because of the nature of these operations and of GPU architectures, they are inefficient in device code.
On the other hand, when the divisor is known on the host, the constants needed for division and modulo can easily be precomputed there. This allows efficient device code.
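To make the idea concrete, here is a minimal sketch of host-side precomputation for unsigned 32-bit division by an invariant divisor, using the well-known multiply-and-shift scheme of Granlund and Montgomery. The name `fast_divisor_u32`, its members, the `FAST_DIV_HD` macro, and the `split_index` kernel are all hypothetical and only illustrate the technique. They are not the proposed `fast_int` interface or any existing libcu++ API.

```cpp
#include <cstdint>

#if defined(__CUDACC__)
#define FAST_DIV_HD __host__ __device__
#else
#define FAST_DIV_HD
#endif

// Hypothetical helper (not part of libcu++): stores constants derived from a
// runtime divisor so that later divisions need only a multiply and shifts.
struct fast_divisor_u32 {
  std::uint32_t divisor;
  std::uint32_t magic;   // low 32 bits of the 33-bit multiplier
  std::uint32_t shift1;  // min(l, 1), where l = ceil(log2(divisor))
  std::uint32_t shift2;  // max(l - 1, 0)

  // Precomputation, intended to run once on the host. d must be nonzero.
  FAST_DIV_HD explicit fast_divisor_u32(std::uint32_t d) : divisor(d) {
    std::uint32_t l = 0;
    while ((std::uint64_t{1} << l) < d) ++l;  // l = ceil(log2(d))
    magic = static_cast<std::uint32_t>(
        ((((std::uint64_t{1} << l) - d) << 32) / d) + 1);
    shift1 = l < 1 ? l : 1;
    shift2 = l > 0 ? l - 1 : 0;
  }

  // n / divisor without a hardware divide.
  FAST_DIV_HD std::uint32_t div(std::uint32_t n) const {
    // High 32 bits of magic * n; in device code this is what __umulhi computes.
    std::uint32_t t =
        static_cast<std::uint32_t>((std::uint64_t{magic} * n) >> 32);
    return (t + ((n - t) >> shift1)) >> shift2;
  }

  // n % divisor, derived from the quotient.
  FAST_DIV_HD std::uint32_t mod(std::uint32_t n) const {
    return n - div(n) * divisor;
  }
};

#if defined(__CUDACC__)
// Example use: split a linear index into (row, col) for a runtime row width.
__global__ void split_index(fast_divisor_u32 width, std::uint32_t n,
                            std::uint32_t* row, std::uint32_t* col) {
  std::uint32_t i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    row[i] = width.div(i);
    col[i] = width.mod(i);
  }
}
#endif
```

The object is constructed once on the host and passed to the kernel by value, so each thread pays for one multiply, one subtraction, one addition, and two shifts per division. A real implementation would also need to cover signed and 64-bit types, which this sketch leaves out.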
Describe the solution you'd like
Expose a `fast_int` class that supports all common integer operations and is optimized for modulo and division.
The following references should be considered:
Describe alternatives you've considered
No response
Additional context
@sleeepyjack for visibility (cuCollection)