NVIDIA GPUs can have vectorized instructions of length 2 or 4. This is different from SIMT parallelism: each such instruction operates on 2 (or 4) elements per thread, i.e. 2 (or 4) x 32 elements across the 32 threads of a warp.
It looks like some old versions of CCE interpret simd as simt, so the directive is required there in order to use all threads in a warp.
That is not the case for new versions of CCE. At the moment most compilers seem to ignore the simd directive on the GPU.
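For concreteness, here is a minimal sketch of the kind of offloaded loop this question is about. The function name `saxpy_simd` and its signature are made up for illustration and are not taken from any test suite. The combined construct only tells the compiler that the iterations are independent and may be vectorized. Whether `simd` maps to vector instructions, to the threads of a warp, or to nothing at all is left to the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] on the device.
 * The simd clause marks the loop as safe to vectorize; how (or whether)
 * a GPU compiler uses that hint is implementation-defined.
 * Without OpenMP support the pragma is ignored and the loop runs serially. */
void saxpy_simd(int n, float a, const float *x, float *y)
{
    /* Basic precondition check on the host side. */
    assert(n >= 0 && x != NULL && y != NULL);

    #pragma omp target teams distribute parallel for simd \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the generated device code with and without the `simd` clause (for instance by inspecting the PTX or SASS) should show whether a given compiler does anything with it.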
What does #pragma omp simd actually do on the GPU?