-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transpile: Split C-kernel generation from ISO-C wrapper generation #464
Conversation
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/464/index.html |
a37fed2
to
f75c1d0
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #464 +/- ##
==========================================
+ Coverage 93.29% 93.35% +0.06%
==========================================
Files 221 223 +2
Lines 41360 41428 +68
==========================================
+ Hits 38587 38676 +89
+ Misses 2773 2752 -21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
f75c1d0
to
dd2c558
Compare
This allows us to separate the header generation pass from the actual kernel transformation, as these might have different traversal mechanics in the scheduler.
dd2c558
to
ab071e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thank you.
Two small changes requested otherwise happy with this PR.
Also tested with/for CLOUDSC using the branch nams-integrate-loki-cuda
and the merged this branch/PR into the Loki branch nams-integrate-cloudsc-loki-cuda
to test the C and CUDA transpilation with CLOUDSC.
@@ -96,8 +94,7 @@ def visit_CallStatement(self, o, **kwargs): | |||
|
|||
class FortranCTransformation(Transformation): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both use_c_ptr
and path
should be removed from the doctstring.
scripts/loki_transform.py
Outdated
elif mode in ['cuda_parametrise', 'cuda_hoist']: | ||
f2c_transformation = FortranCTransformation(path=build, language='cuda', use_c_ptr=True) | ||
f2c_transformation = FortranCTransformation(language='cuda', use_c_ptr=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f2c_transformation = FortranCTransformation(language='cuda', use_c_ptr=True) | |
f2c_transformation = FortranCTransformation(language='cuda') |
There is no use_c_ptr
argument for FortranCTransformation
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great and is an important stepping stone towards following through with some of the deprecations.
I left a few observation comments but none of these are strictly required.
def file_suffix(self): | ||
if self.language == 'cpp': | ||
return '.cpp' | ||
return '.c' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks like we only check the cppcodegen
in the transpile tests. Now that the wrapper generation is separated this gets exposed. Admittedly, I think we'll need to rethink how these wrappers are generally done some time soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think the wrapper transformation doesn't write any c/cpp files at all, so I meant this entire method is redundant here. The same exists in the actual FortranCTransformation, where this is indeed used to write the transpiled code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh, right, yes that's an oversight then. I've removed the redundant one now. Good catch!
for name, td in module.typedef_map.items(): | ||
c_structs[name.lower()] = c_struct_typedef(td, use_c_ptr=self.use_c_ptr) | ||
|
||
if role == 'header': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nitpicking here, but this could come first to bail-out early (e.g. save the c_struct_typedef calls)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, these are even redundant, as they don't get propagated. I've fix this here and in the module entry point below now.
def c_intrinsic_kind(_type, scope): | ||
if _type.dtype == BasicType.LOGICAL: | ||
return sym.Variable(name='int', scope=scope) | ||
if _type.dtype == BasicType.INTEGER: | ||
return sym.Variable(name='int', scope=scope) | ||
if _type.dtype == BasicType.REAL: | ||
kind = str(_type.kind) | ||
if kind.lower() in ('real32', 'c_float'): | ||
return sym.Variable(name='float', scope=scope) | ||
if kind.lower() in ('real64', 'jprb', 'selected_real_kind(13, 300)', 'c_double'): | ||
return sym.Variable(name='double', scope=scope) | ||
return None | ||
|
||
|
||
def iso_c_intrinsic_import(scope, use_c_ptr=False): | ||
import_symbols = ['c_int', 'c_double', 'c_float'] | ||
if use_c_ptr: | ||
import_symbols += ['c_ptr', 'c_loc'] | ||
symbols = as_tuple(sym.Variable(name=name, scope=scope) for name in import_symbols) | ||
isoc_import = ir.Import(module='iso_c_binding', symbols=symbols) | ||
return isoc_import | ||
|
||
|
||
def iso_c_intrinsic_kind(_type, scope, is_array=False, use_c_ptr=False): | ||
if _type.dtype == BasicType.INTEGER: | ||
return sym.Variable(name='c_int', scope=scope) | ||
|
||
if _type.dtype == BasicType.REAL: | ||
kind = str(_type.kind) | ||
if kind.lower() in ('real32', 'c_float'): | ||
return sym.Variable(name='c_float', scope=scope) | ||
if kind.lower() in ('real64', 'jprb', 'selected_real_kind(13, 300)', 'c_double', 'c_ptr'): | ||
if use_c_ptr and is_array: | ||
return sym.Variable(name='c_ptr', scope=scope) | ||
return sym.Variable(name='c_double', scope=scope) | ||
|
||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pointing out that these three routines don't have docstrings 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to retro-fit these, please have a look.
for fct in module.subroutines: | ||
if fct.is_function: | ||
intf_fct = fct.clone(bind=f'{fct.name.lower()}') | ||
intf_fct.body = ir.Section(body=()) | ||
|
||
intf_args = [] | ||
for arg in intf_fct.arguments: | ||
# Only scalar, intent(in) arguments are pass by value | ||
# Pass by reference for array types | ||
value = isinstance(arg, sym.Scalar) and arg.type.intent and arg.type.intent.lower() == 'in' | ||
kind = iso_c_intrinsic_kind(arg.type, intf_fct, use_c_ptr=use_c_ptr) | ||
ctype = SymbolAttributes(arg.type.dtype, value=value, kind=kind) | ||
dimensions = arg.dimensions if isinstance(arg, sym.Array) else None | ||
var = sym.Variable(name=arg.name, dimensions=dimensions, type=ctype, scope=intf_fct) | ||
intf_args += (var,) | ||
intf_fct.arguments = intf_args | ||
sanitise_imports(intf_fct) | ||
intfs.append(intf_fct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[no action required] Codecov points out that this is unused. We should probably add a test for module routine transpilation in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this one is puzzling me a bit. I've tried triggering this, but I think this is some leftover from the previous way we dealt with stmt funcs. Not quite sure what to do about it, but I think we want to carefully rethink some of these wrappers anyway, so we can mop this all up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had to try this myself. The relevant code path is hit with a test like this:
@pytest.mark.parametrize('frontend', available_frontends())
@pytest.mark.parametrize('language', ['c', 'cpp'])
def test_transpile_module_function(tmp_path, builder, frontend, language):
fcode_mod = """
module my_mod
use iso_fortran_env, only: real64
implicit none
contains
function scalar_func(a, b)
real(kind=real64), intent(in) :: a, b
real(kind=real64) :: scalar_func
scalar_func = a + b
end function scalar_func
function array_func(a, b, n)
real(kind=real64), intent(in) :: a(n), b(n)
integer, intent(in) :: n
real(kind=real64) :: array_func(n)
array_func(:) = a(:) + b(:)
end function array_func
end module my_mod
""".strip()
fcode = """
subroutine driver(a, b, c, d, n)
use my_mod, only: scalar_func, array_func
use iso_fortran_env, only: real64
implicit none
real(kind=real64), intent(in) :: a(n), b(n)
real(kind=real64), intent(inout) :: c(n), d(n)
integer, intent(in) :: n
integer :: j
do j=1,n
c(j) = scalar_func(a(j), b(j))
enddo
d = array_func(a, b, n)
end subroutine driver
""".strip()
module = Module.from_source(fcode_mod, xmods=[tmp_path], frontend=frontend)
routine = Subroutine.from_source(fcode, xmods=[tmp_path], frontend=frontend, definitions=[module])
refname = f'ref_{routine.name}_{frontend}'
reference = jit_compile_lib([module, routine], path=tmp_path, name=refname, builder=builder)
n = 10
a = np.ones(shape=(n,), order='F', dtype=np.float64)
b = np.ones(shape=(n,), order='F', dtype=np.float64)
c = np.zeros(shape=(n,), order='F', dtype=np.float64)
d = np.zeros(shape=(n,), order='F', dtype=np.float64)
cd_ref = a + b
reference.driver(a=a, b=b, c=c, d=d, n=n)
assert np.all(c == cd_ref)
assert np.all(d == cd_ref)
f2cwrap = FortranISOCWrapperTransformation(language=language)
f2c = FortranCTransformation()
for trafo in [f2c, f2cwrap]:
for source in [routine, module, *module.subroutines]:
trafo.apply(source=source, path=tmp_path, role='kernel')
f2cwrap.apply(source=module, path=tmp_path, role='header')
But I don't quite know how to complete the test here (i.e., what to check for). Should we put this in an issue for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, as discussed offline, this can then probably be removed. I'm fairly confident that the use case for this has been removed from CLOUDSC, and the approach is really quite hacky. If CLOUDSC regression is fine after the removal I think this is safe.
9b3e114
to
a8da7d6
Compare
a8da7d6
to
9d07cfc
Compare
Very nice, many thanks! |
Note: Apologies for the size of this PR; it's mainly housekeeping. If this is too much, please let me know, as it can be split into more digestible chunks. I promise I've not changed any meaningful logic beyond what's mentioned below.
This PR primarily aims at separating the two stages of C-style language transpilation:
Subroutine
Subroutine
and usedModule
, including the generation of associated C-style header files.The key change to achieve this is the creation of a new sub-package
loki.transforamtion.transpile.fortran_iso_c_wrapper
with an additional transformationFortranISOCWrapperTransformation
. This can now be used separately in scheduler pipelines and may have a different iteration space, as it may needs to includeModule
objects for type information.In addition, both Transformations can now use the
build_args
handed over by the scheduler, if no explicitpath
argument is given. I've kept the explicit override, as it is used extensively throughout the test base. This will eventually allow us to properly express C-transpilation as config-based pipelines. In future changes, we can hopefully further unpick the C-transpilation to build specific C/CUDA/HIP pipelines that can re-use the ISO-C wrapper generation.In some more detail:
build_args
for C-transpile transformations, when used with the scheduler.fortran_iso_c_wrapper
sub-package for transpile, including the respective utilities for dering with ISO-C types, derived-types and structsFortranISOCWrapperTransformation
to execute wrapper and header generation for subroutines and modules