very simple kernel fails to compile #2448

Open
fwyzard opened this issue Jan 15, 2025 · 9 comments

@fwyzard
Contributor

fwyzard commented Jan 15, 2025

This embarrassingly simple program

#include <alpaka/alpaka.hpp>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

class Kernel {
public:
  ALPAKA_FN_ACC void operator()(alpaka::AccGpuCudaRt<Dim1D, Idx> const &acc) const {
    alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
  }
};

fails to compile with clang 18.1, alpaka 1.2.0, boost 1.80 and cuda 12.4, when doing a host-only compilation while targeting the CUDA backend:

clang++ -std=c++20 -c \
  -Ipath/to/alpaka/include -Ipath/to/boost/include -Ipath/to/cuda/include \
  -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_ACC_GPU_CUDA_ONLY_MODE -DALPAKA_HOST_ONLY \
  -fdiagnostics-show-option -Wfatal-errors \
  test.cc

gives

In file included from test.cc:1:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/alpaka.hpp:13:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/acc/AccCpuOmp2Blocks.hpp:16:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/bt/IdxBtZero.hpp:10:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/vec/Vec.hpp:13:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/DimIntegralConst.hpp:7:
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/Traits.hpp:19:5: fatal error: implicit instantiation of undefined template 'alpaka::trait::DimType<alpaka::gb::IdxGbUniformCudaHipBuiltIn<std::integral_constant<unsigned long, 1>, unsigned int>>'
   19 |     using Dim = typename trait::DimType<T>::type;
      |     ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:48:24: note: in instantiation of template type alias 'Dim' requested here
   48 |                 -> Vec<Dim<ImplementationBase>, Idx<ImplementationBase>>
      |                        ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:25:23: note: in instantiation of template class 'alpaka::trait::GetIdx<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>, alpaka::origin::Grid, alpaka::unit::Blocks>' requested here
   25 |         return trait::GetIdx<TIdx, TOrigin, TUnit>::getIdx(idx, workDiv);
      |                       ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:79:32: note: in instantiation of function template specialization 'alpaka::getIdx<alpaka::origin::Grid, alpaka::unit::Blocks, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>>' requested here
   79 |                 return alpaka::getIdx<origin::Grid, unit::Blocks>(idx, workDiv)
      |                                ^
test.cc:9:13: note: in instantiation of function template specialization 'alpaka::getIdx<alpaka::origin::Grid, alpaka::unit::Threads, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>>' requested here
    9 |     alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
      |             ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/Traits.hpp:14:16: note: template is declared here
   14 |         struct DimType;
      |                ^
1 error generated.
@fwyzard
Contributor Author

fwyzard commented Jan 15, 2025

Any suggestions on how to fix the alpaka 1.2.0 code to make it work?

@mehmetyusufoglu
Contributor

mehmetyusufoglu commented Jan 16, 2025

I copied the code into an example and it compiles with different compilers when the alpaka_ACC_GPU_CUDA_ENABLE and alpaka_ACC_GPU_CUDA_ONLY cmake vars are ON, so the issue is related to the HOST_ONLY setting.

@psychocoderHPC
Member

The problem is most likely a missing include due to the usage of HOST_ONLY.
We can find all missing includes by using the header check of alpaka with the HOST_ONLY flag set.
We should run this in the CI too.

@psychocoderHPC
Member

OK, the header check is not helping. The root of the problem is that ALPAKA_HOST_ONLY disables a lot of code that is necessary on the host side.
IMO the example above never worked with ALPAKA_HOST_ONLY.
I will have a look to see if the issue is fixable.

@psychocoderHPC
Member

OK IMO this issue is not a bug and the user code must be fixed.

According to PR #1567, which @fwyzard implemented in the past, the definition of ALPAKA_HOST_ONLY is:

If ALPAKA_HOST_ONLY is defined, a CUDA or HIP compiler is required only for compiling device code or kernel launches.
The rest of the CUDA or HIP host API (device queries, memory operations, etc.) can be used with a standard compiler, as long as the required libraries are available.

Following this, the example above must be changed to:

#include <alpaka/alpaka.hpp>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

#ifndef ALPAKA_HOST_ONLY
class Kernel {
public:
  ALPAKA_FN_ACC void operator()(alpaka::AccGpuCudaRt<Dim1D, Idx> const &acc) const {
    alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
  }
};
#endif

Using concepts or, as in the following example, std::enable_if should work too, but looks extremely boring:

#include <alpaka/alpaka.hpp>

#include <type_traits>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

class Kernel
{
public:
    template<typename T_Acc>
    ALPAKA_FN_ACC auto operator()(T_Acc const& acc) const -> std::enable_if_t<
        std::is_same_v<alpaka::AccToTag<T_Acc>, alpaka::TagGpuCudaRt> && std::is_same_v<alpaka::Dim<T_Acc>, Dim1D>
        && std::is_same_v<alpaka::Idx<T_Acc>, Idx>>
    {
        alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
    }
};

@fwyzard
Contributor Author

fwyzard commented Jan 20, 2025

@psychocoderHPC thanks for looking into this, I think you are right about the example being wrong.

I will check if the problem is in my small reproducer or in the original code.

@fwyzard
Contributor Author

fwyzard commented Jan 21, 2025

@psychocoderHPC it is indeed our original code that is incorrect: the alpaka kernels are included in a file that should be "host-only", so the failure is indeed expected.

@fwyzard
Contributor Author

fwyzard commented Jan 21, 2025

Do you think there may be a way to detect when this happens and issue a more explanatory error message?

For example, if

  • ALPAKA_ACC_GPU_CUDA_ENABLED is defined
  • ALPAKA_ACC_GPU_CUDA_ONLY_MODE is defined (?)
  • ALPAKA_HOST_ONLY is defined
  • the compiler does not support CUDA device code (no nvcc and no clang++ -x cu)

then ALPAKA_FN_ACC could expand to something that prints a clear error?
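
As a rough sketch of how the conditions listed above might be detected, and not a proposal for the actual alpaka implementation: all `SKETCH_*` macros and `sketch_*` names are hypothetical. It assumes that `__CUDACC__` is defined by nvcc and by clang in CUDA mode, and that clang in CUDA mode also defines `__CUDA__`. The uncertain ALPAKA_ACC_GPU_CUDA_ONLY_MODE condition is left out. Instead of changing ALPAKA_FN_ACC itself, the sketch uses a dependent `static_assert`, which only fires when the checked template is instantiated.

```cpp
// 1 if the current compiler invocation can compile CUDA device code.
#if defined(__CUDACC__) || (defined(__clang__) && defined(__CUDA__))
#  define SKETCH_CUDA_DEVICE_COMPILER 1
#else
#  define SKETCH_CUDA_DEVICE_COMPILER 0
#endif

// 1 if the CUDA backend is enabled in host-only mode and the compiler
// cannot compile CUDA device code.
#if defined(ALPAKA_ACC_GPU_CUDA_ENABLED) && defined(ALPAKA_HOST_ONLY) \
    && !SKETCH_CUDA_DEVICE_COMPILER
#  define SKETCH_HOST_ONLY_WITHOUT_DEVICE_COMPILER 1
#else
#  define SKETCH_HOST_ONLY_WITHOUT_DEVICE_COMPILER 0
#endif

// The value depends on T only formally, so that a static_assert using it
// is checked at instantiation time rather than at definition time.
template <typename T>
inline constexpr bool sketch_device_code_allowed
    = (SKETCH_HOST_ONLY_WITHOUT_DEVICE_COMPILER == 0);

// Hypothetical hook that device-side functions could call for their
// accelerator type to produce a readable diagnostic.
template <typename TAcc>
constexpr bool sketch_check_device_code() {
    static_assert(sketch_device_code_allowed<TAcc>,
                  "CUDA device code is being instantiated in a translation "
                  "unit compiled with ALPAKA_HOST_ONLY and without a "
                  "CUDA-capable compiler");
    return true;
}
```

In a regular build without the alpaka macros the check is a no-op. Whether a hook like this can be wired into ALPAKA_FN_ACC or `getIdx` without breaking legitimate host-only includes is an open question.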

@fwyzard
Contributor Author

fwyzard commented Jan 21, 2025

P.S. I have no idea why compiling with g++ instead of clang++ does not give any error messages.
