-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introducing 'mix' versions #65
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks and apologies for the extremely long review time.
This is very cool and conceptually ready to merge, but I left a few suggestions how to improve particularly the build system integration.
In addition, I would like to ask that
- this is added to the Github build config for NVHPC, at least with the CUDA flavour, to make sure we test the build at least;
- these variants are described in the README
bundle.yml
Outdated
ENABLE_CLOUDSC_MIX=ON | ||
|
||
- cloudsc-gpu-offload : | ||
help : [OMP|ACC] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break ecbundle because pipe has a meaning in yaml, we need to make this valid:
help : [OMP|ACC] | |
help : "Data offload model for GPU variants. Available options: OMP, ACC" |
bundle.yml
Outdated
cmake : CLOUDSC_GPU_OFFLOAD={{value}} | ||
|
||
- cloudsc-gpu-lang : | ||
help : [CUDA|HIP|SYCL] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
help : [CUDA|HIP|SYCL] | |
help : "Kernel language for low-level GPU kernel implementations. Available options: CUDA, HIP, SYCL" |
src/common/CMakeLists.txt
Outdated
if ( ENABLE_CLOUDSC_MIX ) | ||
# if (CLOUDSC_GPU_OFFLOAD STREQUAL "ACC") | ||
# HACK: seems like nordc only necessary for NVIDIA machines but not AMD machines | ||
if (CLOUDSC_GPU_LANG STREQUAL "CUDA" OR CLOUDSC_GPU_LANG STREQUAL "SYCL") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion for a shorter check
if (CLOUDSC_GPU_LANG STREQUAL "CUDA" OR CLOUDSC_GPU_LANG STREQUAL "SYCL") | |
if (CLOUDSC_GPU_LANG MATCHES "CUDA|SYCL") |
Also, that compiler option is NVHPC-specific, no? So, we might want to add CMAKE_Fortran_COMPILER_ID MATCHES "PGI|NVHPC"
src/cloudsc_mix/CMakeLists.txt
Outdated
# GPU data offload (default: ACC = 1) | ||
## ACC: 1 | ||
## OMP: 2 | ||
if (CLOUDSC_GPU_OFFLOAD STREQUAL "ACC") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of named constants in the source is nice, it really helps the readability. But it isn't as clean here in the CMake layer, which I think we can improve with a little trick that requires you to also change only a single place when you ever want to change the numbering.
Conceptually it should look like this:
# Define integer IDs corresponding to every language choice
set(CUDA_LANG "1")
set(HIP_LANG "2")
set(SYCL_LANG "3")
set(ACC_OFFLOAD "4")
set(OMP_OFFLOAD "5")
# Select offload model
if( NOT DEFINED CLOUDSC_GPU_OFFLOAD )
set(CLOUDSC_GPU_OFFLOAD "ACC")
endif()
set(GPU_OFFLOAD ${${CLOUDSC_GPU_OFFLOAD}_OFFLOAD})
# Optional: check for valid options
if( NOT ${GPU_OFFLOAD} MATCHES "ACC|OMP" )
error(...)
endif()
# Select kernel language
if( NOT DEFINED CLOUDSC_GPU_LANG )
set(CLOUDSC_GPU_LANG "CUDA")
endif()
set(GPU_LANG ${${CLOUDSC_GPU_LANG}_LANG})
# Language and offload-specific overrides
if( CLOUDSC_GPU_LANG STREQUAL "CUDA" )
...
elseif(...)
...
endif()
[... Define library etc...]
# Provide definitions to target
foreach(_def CUDA_LANG HIP_LANG SYCL_LANG ACC_OFFLOAD OMP_OFFLOAD)
target_compile_definitions(dwarf-cloudsc-gpu-lib PUBLIC ${_def}=${${_def}})
endforeach()
This is untested but simplifies the logic in my opinion
src/cloudsc_mix/CMakeLists.txt
Outdated
${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES} | ||
) | ||
if (NOT DEFINED CMAKE_CUDA_ARCHITECTURES) | ||
target_compile_options(dwarf-cloudsc-gpu-lib PRIVATE $<$<COMPILE_LANGUAGE:CUDA>>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of the double-nested generator expression? $<$<...>>
#if GPU_LANG == CUDA_LANG | ||
#include "cuda.h" | ||
#elif GPU_LANG == HIP_LANG | ||
#include "hip/hip_runtime.h" | ||
#elif GPU_LANG == SYCL_LANG | ||
#include <CL/sycl.hpp> | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily required but just to point out that we could use here the same pattern that ectrans is using to switch between HIP and CUDA implementations via header files that rewrite relevant API calls on demand:
https://github.com/ecmwf-ifs/ectrans/blob/main/src/trans/gpu/algor/hicblas.h
With specific overloads for HIP and CUDA
src/common/CMakeLists.txt
Outdated
@@ -98,6 +108,8 @@ ecbuild_add_library( TARGET cloudsc-common-lib | |||
$<${HAVE_HDF5}:hdf5::hdf5_fortran> | |||
$<${HAVE_SERIALBOX}:Serialbox::Serialbox_Fortran> | |||
$<${HAVE_FIELD_API}:field_api_${prec}> | |||
DEFINITIONS | |||
ENABLE_CLOUDSC_MIX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally always active, or should this actually be something like $<${ENABLE_CLOUDSC_MIX}:ENABLE_CLOUDSC_MIX>
? (Just curious, because it would render the CPP guards in yoethf.F90, yomcst.F90 redundant)
… CUDA/HIP/SYCL kernel implementation
de38016
to
669eef4
Compare
Fortran driver + OpenMP/OpenACC offload + CUDA/HIP/SYCL kernel implementation
For now only SCC-K-CACHING.
./cloudsc-bundle build --with-mix --cloudsc-gpu-offload ACC|OMP --cloudsc-gpu-lang CUDA|HIP|SYCL
e.g.:
./cloudsc-bundle build --arch=arch/ecmwf/hpc2020/nvhpc/22.11 --with-mix --cloudsc-gpu-offload ACC --cloudsc-gpu-lang CUDA