very simple kernel fails to compile #2448

fwyzard · 2025-01-15T18:39:32Z

This embarrassingly simple program

#include <alpaka/alpaka.hpp>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

class Kernel {
public:
  ALPAKA_FN_ACC void operator()(alpaka::AccGpuCudaRt<Dim1D, Idx> const &acc) const {
    alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
  }
};

fails to compile with clang 18.1, alpaka 1.2.0, boost 1.80 and cuda 12.4, when doing a host-only compilation while targeting the CUDA backend:

clang++ -std=c++20 -c \
  -Ipath/to/alpaka/include -Ipath/to/boost/include -Ipath/to/cuda/include \
  -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_ACC_GPU_CUDA_ONLY_MODE -DALPAKA_HOST_ONLY \
  -fdiagnostics-show-option -Wfatal-errors test.cc

gives

In file included from test.cc:1:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/alpaka.hpp:13:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/acc/AccCpuOmp2Blocks.hpp:16:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/bt/IdxBtZero.hpp:10:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/vec/Vec.hpp:13:
In file included from /data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/DimIntegralConst.hpp:7:
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/Traits.hpp:19:5: fatal error: implicit instantiation of undefined template 'alpaka::trait::DimType<alpaka::gb::IdxGbUniformCudaHipBuiltIn<std::integral_constant<unsigned long, 1>, unsigned int>>'
   19 |     using Dim = typename trait::DimType<T>::type;
      |     ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:48:24: note: in instantiation of template type alias 'Dim' requested here
   48 |                 -> Vec<Dim<ImplementationBase>, Idx<ImplementationBase>>
      |                        ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:25:23: note: in instantiation of template class 'alpaka::trait::GetIdx<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>, alpaka::origin::Grid, alpaka::unit::Blocks>' requested here
   25 |         return trait::GetIdx<TIdx, TOrigin, TUnit>::getIdx(idx, workDiv);
      |                       ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/idx/Accessors.hpp:79:32: note: in instantiation of function template specialization 'alpaka::getIdx<alpaka::origin::Grid, alpaka::unit::Blocks, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>>' requested here
   79 |                 return alpaka::getIdx<origin::Grid, unit::Blocks>(idx, workDiv)
      |                                ^
test.cc:9:13: note: in instantiation of function template specialization 'alpaka::getIdx<alpaka::origin::Grid, alpaka::unit::Threads, alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1>, unsigned int>>' requested here
    9 |     alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
      |             ^
/data/user/fwyzard/pr46916/CMSSW_15_0_X_2025-01-15-1100/alpaka/1.2.0/include/alpaka/dim/Traits.hpp:14:16: note: template is declared here
   14 |         struct DimType;
      |                ^
1 error generated.


fwyzard · 2025-01-15T18:41:06Z

Any suggestions on how to fix the alpaka 1.2.0 code to make it work?

mehmetyusufoglu · 2025-01-16T13:53:56Z

I copied the code into an example and it compiles with different compilers when the alpaka_ACC_GPU_CUDA_ENABLE and alpaka_ACC_GPU_CUDA_ONLY CMake variables are ON, so the issue is related to the HOST_ONLY setting.

psychocoderHPC · 2025-01-17T15:36:07Z

The problem is most likely a missing include due to the usage of HOST_ONLY.
We can find all missing includes by using the header check of alpaka with the HOST_ONLY flag set.
We should run this in the CI too.

psychocoderHPC · 2025-01-20T13:12:35Z

OK the header check is not helping. The root of the problem is that ALPAKA_HOST_ONLY is disabling a lot of code that is necessary on the host side.
IMO the example above never worked with ALPAKA_HOST_ONLY.
I will have a look to see if the issue is fixable.

psychocoderHPC · 2025-01-20T13:35:45Z

OK IMO this issue is not a bug and the user code must be fixed.

According to PR #1567, which @fwyzard implemented in the past, the definition of ALPAKA_HOST_ONLY is:

If ALPAKA_HOST_ONLY is defined, a CUDA or HIP compiler is required only for compiling device code or kernel launches.
The rest of the CUDA or HIP host API (device queries, memory operations, etc.) can be used with a standard compiler, as long as the required libraries are available.

Following this, the example above must be changed to:

#include <alpaka/alpaka.hpp>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

#ifndef ALPAKA_HOST_ONLY
class Kernel {
public:
  ALPAKA_FN_ACC void operator()(alpaka::AccGpuCudaRt<Dim1D, Idx> const &acc) const {
    alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
  }
};
#endif

Using concepts or, as in the following example, std::enable_if should work too, but it looks extremely boring:

#include <alpaka/alpaka.hpp>

#include <type_traits>

using Idx = uint32_t;
using Dim1D = alpaka::DimInt<1u>;

class Kernel
{
public:
    template<typename T_Acc>
    ALPAKA_FN_ACC auto operator()(T_Acc const& acc) const -> std::enable_if_t<
        std::is_same_v<alpaka::AccToTag<T_Acc>, alpaka::TagGpuCudaRt> && std::is_same_v<alpaka::Dim<T_Acc>, Dim1D>
        && std::is_same_v<alpaka::Idx<T_Acc>, Idx>>
    {
        alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
    }
};
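
A likely reason the templated variant gets past a host-only compiler is that the body of a member function template is only checked against a concrete accelerator type when it is instantiated, for example by a kernel launch. The sketch below illustrates that mechanism with stand-in types only: the namespace demo, GpuAcc, GetIdx, getIdx and TemplatedKernel are hypothetical names that mimic the shape of the error log above and are not alpaka's actual implementation.

```cpp
#include <type_traits>

namespace demo // hypothetical stand-ins, NOT alpaka code
{
    // Declared but never defined, mimicking a trait whose specialization
    // is hidden when the device-side code is disabled.
    template<typename T>
    struct DimType;

    // Stand-in for an accelerator type that is only usable in device code.
    struct GpuAcc
    {
    };

    // Instantiating GetIdx<T> needs DimType<T> to be a complete type.
    template<typename T>
    struct GetIdx
    {
        static auto get(T const&) -> typename DimType<T>::type;
    };

    template<typename T>
    auto getIdx(T const& acc)
    {
        return GetIdx<T>::get(acc);
    }

    // The call depends on TAcc, so it is not resolved until operator() is
    // instantiated. A translation unit that never calls the kernel compiles.
    struct TemplatedKernel
    {
        template<typename TAcc>
        void operator()(TAcc const& acc) const
        {
            getIdx(acc);
        }
    };

    // By contrast, a non-template kernel is checked where it is defined.
    // Uncommenting it makes this file fail to compile, because
    // DimType<GpuAcc> is undefined:
    //
    // struct ConcreteKernel
    // {
    //     void operator()(GpuAcc const& acc) const { getIdx(acc); }
    // };
} // namespace demo
```

Under these assumptions, a host-only file can include the templated kernel as long as nothing in that file instantiates operator() with the GPU accelerator type.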

fwyzard · 2025-01-20T15:43:36Z

@psychocoderHPC thanks for looking into this, I think you are right about the example being wrong.

I will check if the problem is in my small reproducer or in the original code.

fwyzard · 2025-01-21T06:05:57Z

@psychocoderHPC it is indeed our original code that is incorrect: the alpaka kernels are included in a file that should be "host-only", so the failure is indeed expected.

fwyzard · 2025-01-21T06:10:00Z

Do you think there may be a way to detect when this happens and issue a more explanatory error message?

For example, if

ALPAKA_ACC_GPU_CUDA_ENABLED is defined
ALPAKA_ACC_GPU_CUDA_ONLY_MODE is defined (?)
ALPAKA_HOST_ONLY is defined
the compiler does not support CUDA device code (no nvcc and no clang++ -x cu)

then ALPAKA_FN_ACC could expand to something that prints a clear error?
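
One way the idea above could look is sketched below. It is only an illustration under several assumptions: DEMO_FN_ACC and DemoKernel are hypothetical names (alpaka's real ALPAKA_FN_ACC is defined differently), __CUDACC__ is assumed to be the macro that marks a CUDA compilation with nvcc or clang++ -x cu, and the GCC error pragma is assumed to be accepted by the compiler at the point where the macro is used.

```cpp
// Hypothetical macro, NOT alpaka's ALPAKA_FN_ACC.
// The condition follows the list above: CUDA backend enabled, host-only
// mode requested, and no CUDA-capable compilation in progress.
#if defined(ALPAKA_ACC_GPU_CUDA_ENABLED) && defined(ALPAKA_HOST_ONLY) && !defined(__CUDACC__)
// Expands to a pragma that emits a readable diagnostic at the use site.
#    define DEMO_FN_ACC \
        _Pragma("GCC error \"device function declared in an ALPAKA_HOST_ONLY translation unit\"")
#else
// In every other configuration the macro is a no-op in this sketch.
#    define DEMO_FN_ACC
#endif

// Stand-in kernel: compiles normally unless the condition above holds.
struct DemoKernel
{
    DEMO_FN_ACC constexpr int operator()(int x) const
    {
        return x + 1;
    }
};
```

Without the three conditions the macro expands to nothing and DemoKernel behaves like an ordinary functor. A hard error at every use would also fire for any function carrying the macro in a header shared with host-only code, so a real implementation would probably need a narrower trigger.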

fwyzard · 2025-01-21T06:11:44Z

P.S. I have no idea why compiling with g++ instead of clang++ does not give any error messages.

psychocoderHPC added Type:Bug Backend:CUDA labels Jan 20, 2025
