Look into using nvrtc for just-in-time cuda compilation #169
ddemidov added a commit that referenced this issue Mar 24, 2015
see #169

This mostly works, but fails some tests.

* It looks like the compiler backend (or compiler settings) that is used in nvrtc is a bit different from the one in nvcc. Some kernels that nvcc compiles just fine are not accepted by nvrtc. The notable example is the use of anonymous structs in shared union (used in sort algorithms):

      union Shared {
          struct { int keys0[3072]; };
          struct { float vals0[2816]; };
      };
      __shared__ union Shared shared;

  This results in the following compilation error when compiled with nvrtc:

      warning: declaration does not declare anything
      error: union "Shared" has no member "keys0"

* The compilation of FFT tests just seems to hang.
* And the caching does not work either with nvrtc, meaning that subsequent compiles take as much time as the first one.
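A note on the first point: anonymous structs are not part of standard C++ and are accepted by compilers only as an extension, which would be consistent with the "declaration does not declare anything" warning. One possible workaround, sketched below as plain host C++ and not verified against nvrtc, is to give each struct a member name. The names `keys`, `vals`, `shared_bytes` and `store_and_load_key` are hypothetical and do not come from the project.

```cpp
#include <cstddef>

// Same storage as the union in the commit message, but each struct gets a
// member name (keys, vals: hypothetical names), so the code no longer
// depends on the anonymous-struct extension.
union Shared {
    struct { int   keys0[3072]; } keys;
    struct { float vals0[2816]; } vals;
};

// A union is at least as large as its largest member.
constexpr std::size_t shared_bytes = sizeof(Shared);

// Accesses now go through the member name: shared.keys.keys0[i].
// In a kernel the object would be declared as `__shared__ Shared shared;`.
inline int store_and_load_key(Shared &shared, std::size_t i, int value) {
    shared.keys.keys0[i] = value;
    return shared.keys.keys0[i];
}
```

The cost of this approach is that every use site changes from `shared.keys0[i]` to `shared.keys.keys0[i]`, so any code that generates these kernels would need the same adjustment.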
Curious whether these compilation errors still exist with CUDA 9 or even CUDA 10?
Yes, I am still getting the errors described in 35a9f30 with CUDA 9.1:
To test this: checkout branch nvrtc, do
NVRTC is a runtime compilation library for CUDA C++. It accepts CUDA C++ source code in character string form and creates handles that can be used to obtain the PTX:
http://docs.nvidia.com/cuda/nvrtc/index.html