Releases: ecmwf-ifs/loki

Version 0.3.0

20 Jan 08:45

This Loki release introduces a number of breaking changes by removing features that have previously been marked as deprecated.

Note that version tags from 0.3.0 onwards drop the v prefix to align with the convention in other ECMWF software projects, i.e., this version tag is 0.3.0 while the previous release was v0.2.10.

Most importantly, this extends to the custom entry points in the loki-transform.py command line interface: these have been removed, and a TOML config file with the transformation configuration and pipeline definition is now mandatory. CLOUDSC's cloudsc_loki.config is a fully-featured example that specifies file-specific frontend overrides, several transformation options, and multiple bespoke transformation pipelines. The CMake functions providing the build-system integration have been adapted accordingly by removing previously supported options; you should see failures during the CMake configuration step of your project if you were still relying on any of these features (see #465).

Due to its incomplete, slow, and partially defective parsing of Fortran files, the previously deprecated and no longer tested frontend Open Fortran Parser (OFP) was removed. Please use the Fparser2 frontend (recommended) or alternatively the recently updated OMNI frontend (see #469).

This release also removes the CLAW transformations. CLAW is no longer maintained and Loki-internal transformations for single-column code produce superior performance on GPUs (see #477).

Finally, the switch to a purely config-based, and therefore pipeline-driven, batch-processing command-line interface paved the way for improvements to the plan mode used for build-system integration. Plan mode derives the lists of files to add to and remove from a build target. The previous heuristic, based purely on the initial dependency graph, is now replaced by a dry-run of the transformation pipeline. Transformations that modify the dependency graph by adding or removing nodes can now encode this behaviour in a new plan_* entry point that is invoked during the dry-run (see #462).
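
The dry-run idea can be sketched with a toy model. None of the classes below are Loki's: Transformation, PlanData, DuplicateKernel, RemoveKernel and run_plan are hypothetical stand-ins, and only the notion of a plan_* method that records file-list changes instead of transforming code is taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class PlanData:
    """Files the build system should add to or remove from a target."""
    to_append: list = field(default_factory=list)
    to_remove: list = field(default_factory=list)


class Transformation:
    """Toy base class: apply() dispatches to plan_file() in dry-run mode."""

    def apply(self, item, plan_mode=False, **kwargs):
        if plan_mode:
            self.plan_file(item, **kwargs)
        else:
            self.transform_file(item, **kwargs)

    def transform_file(self, item, **kwargs):
        pass  # the real code transformation would happen here

    def plan_file(self, item, **kwargs):
        pass  # by default a transformation does not change the file lists


class DuplicateKernel(Transformation):
    """Hypothetical transformation that adds a renamed copy of some files."""

    def __init__(self, names, suffix):
        self.names = set(names)
        self.suffix = suffix

    def plan_file(self, item, plan, **kwargs):
        if item in self.names:
            stem, ext = item.rsplit(".", 1)
            plan.to_append.append(f"{stem}{self.suffix}.{ext}")


class RemoveKernel(Transformation):
    """Hypothetical transformation that drops some files from the build."""

    def __init__(self, names):
        self.names = set(names)

    def plan_file(self, item, plan, **kwargs):
        if item in self.names:
            plan.to_remove.append(item)


def run_plan(pipeline, items):
    """Dry-run every transformation over every item and collect the plan."""
    plan = PlanData()
    for transformation in pipeline:
        for item in items:
            transformation.apply(item, plan_mode=True, plan=plan)
    return plan


pipeline = [DuplicateKernel(["kernel.F90"], "_gpu"), RemoveKernel(["legacy.F90"])]
plan = run_plan(pipeline, ["kernel.F90", "legacy.F90"])
print(plan.to_append, plan.to_remove)
```

Because the plan is produced by the same pipeline that later performs the transformation, the file lists follow the pipeline definition instead of relying on a separate heuristic.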

What's Changed

  • Inline: Skip explicit intrinsics when inlining stmt functions by @mlange05 in #461
  • Expression: Remove logical literals in and/or operations during simplify by @mlange05 in #467
  • Inline: Avoid Import duplication from multiple callees by @mlange05 in #468
  • Transpile: Split C-kernel generation from ISO-C wrapper generation by @mlange05 in #464
  • Pipeline-plan duplicate/remove transformation changing dependencies by @MichaelSt98 in #462
  • Frontend: Final removal of OFP frontend by @mlange05 in #469
  • Sanitise: Fix rescoping of symbols for nested associates by @mlange05 in #470
  • Fix dependency on header files in loki_transform_target by @reuterbal in #471
  • DataOffload: Field offload of driver regions by @mlange05 in #457
  • Transformations: Fix downcasing of nested InlineCall symbols by @mlange05 in #472
  • Loki-transform: Don't trigger a full parse for plan mode by @mlange05 in #473
  • IR: Better tuple autocasting via pydantic field_validators by @mlange05 in #476
  • Loki-transform: Remove custom entry point options by @mlange05 in #465
  • FieldOffload: Allow and skip non-enriched calls in offload region by @mlange05 in #478
  • Loki: Remove the CLAW and its associated transformations by @mlange05 in #477

Full Changelog: v0.2.10...0.3.0

v0.2.10

16 Dec 15:07
1e4d3e4

This is mostly a maintenance release with a set of minor bugfixes and improvements.

Included is a conceptual change to the CMake plan mode in preparation for future improvements: a new plan mode has been added to Transformation objects, which allows encoding dependency graph changes and other build-system-relevant transformation steps in a dry-run mode. Instead of deriving the CMake plan heuristically from the Scheduler graph, plan mode now performs a dry-run of the entire pipeline.
As a consequence, plan mode is no longer supported without a pipeline definition in the config file. Usage without a pipeline definition in a config file has been deprecated since v0.2.8.

Full Changelog: v0.2.9...v0.2.10

v0.2.9

25 Nov 20:32

What's new

  • A new FieldOffloadTransformation was added to inject FIELD API boilerplate code into driver routines (#437)
  • do_unroll_loop supports negative loop bounds (#443)
  • A number of bugfixes

All changes

  • Loki: Pin pymbolic version to 2022.2 due to upstream incompatibility by @mlange05 in #434
  • Sanitise: New transformation sub-package and some refactoring by @mlange05 in #433
  • Transformations: "Parallel" sub-module with driver-level parallelisation utilities by @mlange05 in #415
  • JoinableStringList: Do not break lines within quoted strings by @reuterbal in #440
  • Sanitise: Update scope on unchanged expressions in remove_associates by @mlange05 in #439
  • Continued: F2C/CUDA transpilation by @MichaelSt98 in #424
  • DataOffload: Add new FieldOffloadTransformation for FIELD API boilerplate injection by @wertysas in #437
  • Expressions: Handle intrinsic function calls by @mlange05 in #416
  • Inline: Fix rescoping of intrinsic procedure symbols in elementals by @mlange05 in #445
  • DataOffload: fix generation of offload pragmas by @awnawab in #442
  • Unroll negative loop bounds and retain pragmas inside unrolled loop body by @awnawab in #443
  • Offload: Refactor loki.transformations.data_offload into separate sub-package by @mlange05 in #446
  • Loki: Turn test sub-directories into sub-packages by @mlange05 in #447
  • Module: Fix enrichment of type info via Module imports by @mlange05 in #448
  • CMake plan and enrichment bugfix by @reuterbal in #441
  • Fix Linter warning by @reuterbal in #451
  • [F2C transpilation] (driver level) convert interface to import by @MichaelSt98 in #422

Full Changelog: v0.2.8...v0.2.9

v0.2.8

11 Nov 14:10

This release consists of fixes, refactoring, and additions, and it deprecates a number of outdated or redundant features and APIs.

Deprecations

  • The Open Fortran Parser frontend is no longer supported by Loki. It is still available in this release but its use will print deprecation warnings and it is no longer tested in the CI. OFP will be removed from Loki in the next release (see #411 and #406)
  • The CLAW compiler is no longer maintained. In order to use the OMNI frontend, the recommended procedure is no longer to install CLAW but to use the latest OMNI compiler frontend directly, e.g., via the --with-omni flag in the install script (see #408 and #406)
  • The ability to use loki-transform.py without a config file is deprecated and will be removed in the next release. The config file is far superior when parameterising transformation pipelines and the config file can easily be versioned together with the code it is meant to transform. See the CLOUDSC config file for an example (#429)
  • The Maxeler transpilation module has been removed (#405)

What's new

  • The handling of typed symbols for expressions moves closer to their corresponding scopes, with a new convenience API introduced in #375
  • The CLOUDSC2 mini-app, a simplified cloud microphysics scheme with tangent-linear and adjoint code paths, is now part of the regression test suite (#230)
  • A new vertical loop fusion transformation has been added that is guided via in-source annotations (#374)
  • f90wrap has been updated to 0.2.15+, which restores compatibility with Numpy 2.0+ (#407)
  • Pragma-guided high-order loop transformations, such as fusion, fission, interchange or unrolling, can now be triggered via a single transformation (#430)

All changes

  • IR: Move expr_visitors to loki.ir by @mlange05 in #372
  • Improve on multiconditionals/switch/select case by @MichaelSt98 in #384
  • Transpilation: optional arguments by @MichaelSt98 in #385
  • Fix edge case for vector section mapping by @MichaelSt98 in #382
  • IR: Symbol management on scoped nodes by @mlange05 in #375
  • Extend 'resolve_vector_notation' to look for available and appropriate loops by @MichaelSt98 in #386
  • Fix representation of array return type in OMNI frontend by @reuterbal in #391
  • SingleColumn: Fix vectorisation of nested else-if bodies by @mlange05 in #392
  • Inline functions (including multi-line and non-elemental functions) by @MichaelSt98 in #378
  • handle modulo operator/function for c-like-backends by @MichaelSt98 in #383
  • Transformations: ResolveAssociateTransformer re-write to in-place substitution by @mlange05 in #387
  • Transformations: Remove routine pragmas when inlining functions by @mlange05 in #395
  • Utility to remove duplicate arguments for calls and callees by @MichaelSt98 in #367
  • Pytest CLI option for log-level by @reuterbal in #396
  • Logging: Small log-level sanitisation and CLI flags by @mlange05 in #394
  • Transformations: Re-organise inline and extract sub-packages by @mlange05 in #376
  • Vertical loop fusion and demotion of temporaries by @MichaelSt98 in #374
  • Skip privatization of arrays with existing data declarations by @awnawab in #389
  • Update f90wrap to 0.2.15 as minimum to ensure compatibility with numpy 2.0+ by @reuterbal in #407
  • Remove Maxeler transpilation module by @reuterbal in #405
  • Fix Scheduler instantiation without config (fix #373) by @reuterbal in #403
  • Fix SccAnnotate when existing acc pragmas declare a copy category more than once by @reuterbal in #409
  • Improve representation of procedure pointers (fix #393) by @reuterbal in #399
  • Add option to install "plain OMNI" to install script and upgrade Github actions runners by @reuterbal in #408
  • Fix function inlining when only interface is available (fixes #397) by @reuterbal in #402
  • Regression test for CLOUDSC2 by @reuterbal in #230
  • Remove duplicate declarations for external statements (fix #57) by @reuterbal in #404
  • Utilities to merge associate blocks and restrict depth of associate resolution by @mlange05 in #388
  • Prevent superfluous clone of loki in ecwam regression test by @awnawab in #410
  • Inline elemental functions: skip calls with args being array (slices) by @MichaelSt98 in #401
  • Frontend: Deprecate OFP and purge from test base by @mlange05 in #411
  • CMake/python_venv: Do not request COMPONENT Development by @reuterbal in #413
  • Dimension: Support stepping, implicit aliases and remove constraints by @mlange05 in #414
  • Handle Loki dimension pragmas for modules (and not only routines) for FP by @MichaelSt98 in #417
  • Allow for optional case-sensitive 'recursive_expression_map_update' by @MichaelSt98 in #418
  • extend 'remove_explicit_array_dimensions' by @MichaelSt98 in #421
  • C-like-backends: skip/don't write Fortran interfaces by @MichaelSt98 in #423
  • Make builddir a runtime argument of FileWriteTransformation by @awnawab in #425
  • Loki-transform: Add deprecation message about custom entry points by @mlange05 in #429
  • Expression: Expression cloning and mapper tests by @mlange05 in #419
  • Extract: Improved region-outlining for complex procedures by @mlange05 in #412
  • IR: Fix false "end" matches in pragma_regions_attached utility by @mlange05 in #431
  • Transformation to call loop transform utilities by @awnawab in #430
  • Bump version number to 0.2.8 by @reuterbal in #432

Full Changelog: v0.2.7...v0.2.8

v0.2.7

04 Oct 13:07
477c56d

What's New

  • Experimental Fortran-to-CUDA transpilation demonstrated on CLOUDSC (#328)
  • A new SplitReadWriteTransformation that allows user-guided GPU optimisation to make loads independent from stores (#329)
  • A new LowerConstantArrayIndices transformation to pass full arrays instead of constant slices in kernel calls (#348)
  • New transformation utilities to introduce loop blocking for driver loops (#362)
  • A new string-based substitution mechanism for expressions (#366)
  • Refactoring of SCC tests (#353) and transformation utilities (#354)
  • And many small improvements and bug fixes (see below)

All Changes

  • IR: Automatic sanitisation of tuples in IR constructors by @mlange05 in #350
  • Run pytest on macos in GH actions by @reuterbal in #262
  • SCC test reshuffle by @mlange05 in #353
  • Transformations: Move common SCC utility routines to utilities by @mlange05 in #354
  • Transformations: Test and fix corner case in get_local_arrays by @mlange05 in #355
  • Tools: Disable timeout utility test on MacOS due to sporadic failures by @mlange05 in #356
  • Fixed logical evaluation of PRESENT intrinsics on Array variables by @JoeffreyLegaux in #341
  • ecWAM regression tests: switch to develop-1.3 branch by @awnawab in #358
  • Split reads and writes for certain accumulation patterns by @awnawab in #329
  • fix for 'resolve_vector_notation' utility by @MichaelSt98 in #361
  • Transformations: Internalise IdemTransformation by @mlange05 in #360
  • New transformation 'LowerConstantArrayIndices' to allow to … by @MichaelSt98 in #348
  • OMNI: Fix dimension range-indexing in frontend by @mlange05 in #363
  • Loki-transform: Pass cuf option to FilewriteTrafo by @mlange05 in #364
  • Filter out globals in get_local_arrays by @awnawab in #370
  • extend hoist variables functionality by @MichaelSt98 in #357
  • Change/fix pipeline for mode 'scc-raw-stack' by @MichaelSt98 in #371
  • Minimal padding in pool allocator by @awnawab in #365
  • CLOUDSC low-level GPU (transpilation) via Loki (CUF/CUDA) by @MichaelSt98 in #328
  • Loop splitting/blocking of block loops by @wertysas in #362
  • String-based expression substitution and moar expression tests! by @mlange05 in #366
  • SCC: Add vectorisation annotations in SCCRevector and translate in SCCAnnotate by @mlange05 in #359
  • Update VERSION to 0.2.7 by @reuterbal in #381

Full Changelog: v0.2.6...v0.2.7

v0.2.6

26 Jul 13:09
190bdfa

This is a minor release with a number of housekeeping changes and some new features.

What's new

  • Until now, Loki depended on the Pydantic 1.x releases; this release adds support for Pydantic 2. The next release will require Pydantic 2. (#349)
  • The InlineTransformation can now inline statement functions (#345)
  • A new LoopUnrollTransformation explicitly unrolls pragma-annotated loops (#347)
  • The Loki IR now supports the FORALL statement and construct. However, this feature is only fully supported with the Fparser2 frontend (#210)
  • Cray pointers are now represented in the Loki IR as Intrinsic nodes (#342)
  • Python package installation now also works correctly from tarballs and other installation sources that are not under Git version control (#344)
  • The test base has been cleaned up: all regression tests now use publicly available source branches, and all tests should now create temporary files in test-local temporary directories to avoid littering the source tree (#335, #343)

Full Changelog: v0.2.5...v0.2.6

v0.2.5

24 Jun 08:15
a39336b

A minor release adding new transformations and fixing issues in the frontends, handling of derived types, dataflow analysis and transformations.

What's New

  • A general BlockIndexInjectTransformation that injects the block-index into all array subscripts that have a local rank one less than their declared rank (#303)
  • A corresponding, IFS-specific BlockViewToFieldViewTransformation to replace per-block view pointers with full field pointers (#303)
  • A new SCCRawStackPipeline that uses a pool-allocator variant where each use of temporaries is replaced with fixed offsets into a pre-allocated scratch memory (#314, incorporating #201 by @rolfhm)
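
The rank rule behind the block-index injection can be illustrated with a small stand-alone function. This is a toy on plain tuples of index names and not Loki's IR-based implementation; the function name inject_block_index and the index name ibl are hypothetical.

```python
def inject_block_index(subscripts, declared_rank, block_index="ibl"):
    """Append the block index if the access has one subscript fewer than declared.

    subscripts    -- index names used at the array access, e.g. ("jl", "jk")
    declared_rank -- number of dimensions in the array declaration
    block_index   -- hypothetical name of the block-loop index variable
    """
    if len(subscripts) == declared_rank - 1:
        return (*subscripts, block_index)
    # Accesses that already match the declared rank (or differ by more
    # than one) are left untouched.
    return tuple(subscripts)


# A rank-3 array accessed with two subscripts gains the block index.
print(inject_block_index(("jl", "jk"), declared_rank=3))
# An access that already carries all three subscripts is unchanged.
print(inject_block_index(("jl", "jk", "ibl"), declared_rank=3))
```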

All Changes

  • Block-index injection transformations by @awnawab in #303
  • Fix parse failures with REGEX frontend due to white space in declarations by @reuterbal in #323
  • DataFlowAnalysis bug fixes by @awnawab in #320
  • Fix derived type inheritance when parent type is not available (#330) by @reuterbal in #331
  • InlineTransformation: Update Scheduler SGraph if marked_inline is activated by @awnawab in #322
  • HoistVariablesAnalysis: remove unused explicit interfaces after inlining by @awnawab in #319
  • Fix Linter warnings for inline calls with interface block imported from header with func.h suffix by @reuterbal in #332
  • Add transformation generated imports to driver or after inlining by @awnawab in #321
  • Fix wrong classification as StatementFunction in translation to Loki IR by @reuterbal in #327
  • get_pragma_parameters: Fix parsing clauses without parentheses in the tail string by @reuterbal in #324
  • ProgramUnit.resolve_typebound_var: raise error if top-level parent is not declared by @reuterbal in #325
  • Transformations: SCCRawStackPipeline and SCC config-from-file by @mlange05 in #314

Full Changelog: v0.2.4...v0.2.5

v0.2.4

28 May 12:35
f3e7d90

This is a minor maintenance release matching the declaration of Hybrid 2024 Milestone 1.

What's Changed

  • Repo reorganisation: Moving transformations by @mlange05 in #296
  • Fix: import of private symbols affects the type inference by @quepas in #308
  • JIT compilation updates and compatibility with f90wrap v0.2.14 by @reuterbal in #315
  • IR: Fix get_pragma_params for multiline pragmas by @mlange05 in #313
  • Transformations: Remap declaration symbols and adjust imports when inlining by @mlange05 in #311
  • Docs: Update to links from static doc pages by @mlange05 in #312

Full Changelog: v0.2.3...v0.2.4

v0.2.3

30 Apr 15:48
102dafd

This is a minor bugfix/maintenance release to resolve some issues around the Loki installation and version number discovery, particularly when installing from a code version that is not under Git version control.

Full Changelog: v0.2.2...v0.2.3

v0.2.2

26 Apr 07:02
5673795

This is a feature and bugfix release, which adds new functionality and resolves a number of problems.

What's New

  • Loki supports a new, streamlined way of composing transformation pipelines from individual Transformation classes. Transformation arguments are shared among transformations, ensuring consistency, e.g., for Dimension parameters. Pipelines and transformation arguments can even be constructed purely from the config file, which will become the default for the loki-transform.py convert command in the future. See #217 for more details on how this works.
  • The pool allocator transformation has a new option to improve compatibility with Cray Compiler Environment 16 on AMD platforms. For that, the pointer arithmetic is removed and LOC calls are used directly in the kernel to determine the offset of a temporary in the scratch allocation. See #231 for more details.
  • A new RemoveCodeTransformation has been added, replacing the RemoveCallsTransformation and incorporating the dead code removal. Additionally, it provides a new feature to remove pragma-annotated code sections via !$loki remove / !$loki end remove (#276).
  • Loki's JIT functionality, which is used to build and run tests, now honours environment variables and no longer depends exclusively on gfortran. The environment variables CC, FC, F90, and LD are inspected to determine the compile commands, and CFLAGS, FCFLAGS, F90FLAGS, and LDFLAGS can be used to set the corresponding flags. Default values are provided for GNU and NVHPC compilers. With this, the test suite can now also run on macOS after installing gcc and gfortran (e.g., via Homebrew) and setting the environment variables accordingly. Note that NumPy's F2PY, which is used to call Fortran routines from the Python test base, also works with non-GNU compilers (e.g., NVHPC) but requires gcc to compile the C interface routines. Also, not all tests are compatible with NVHPC; these test failures are a known issue that will be resolved in the future (#301). See #294 for more details.
  • The parse_expr utility has been expanded to support derived types and now underpins the get_pragma_parameters utility, providing vastly expanded functionality for expressions in pragma annotations (#292).

What's Changed

  • [CMake] Expose GLOBAL_VAR_OFFLOAD and INCLUDES in loki_transform_target by @awnawab in #264
  • Preserve imported statement functions by @awnawab in #251
  • Fix codecov by adding CODECOV_TOKEN by @reuterbal in #278
  • cgen: multiconditional/switch/select case statement by @MichaelSt98 in #267
  • Introducing the Pipeline class by @mlange05 in #260
  • Alternative stack/pool allocator implementation based on Cray pointers compatible with Cray+AMD stack by @MichaelSt98 in #231
  • improved replace_intrinsics and added rename_variables by @MichaelSt98 in #266
  • Revert "DEPENDENCY TRAFO: statement functions included via c-style imports preserved" (#251) by @reuterbal in #282
  • cgen: return type and var for function(s) by @MichaelSt98 in #269
  • Pipeline configuration from file by @mlange05 in #271
  • Fixing nested associate scope-parentage tracking after inlining by @mlange05 in #281
  • F2C: DeReferenceTrafo by @MichaelSt98 in #273
  • REGEX frontend: white space and nesting bugfix by @reuterbal in #274
  • Preserve import statement functions - take II by @awnawab in #283
  • Skip driver routine in GlobalVariableAnalysis by @awnawab in #265
  • MaskedTransformer: Fix in-place rebuilding of scoped nodes by @mlange05 in #284
  • Avoid variable_map in TypedSymbol.get_derived_type_member and verify type information is derived correctly by @reuterbal in #285
  • SCCHoist: hoist inline call temporaries and don't hoist statically declared arrays by @awnawab in #268
  • Pool allocator: correctly resolve derived type member as block dimension and ignore pointer/allocatable arrays by @awnawab in #249
  • Marked region removal and general code removal transformation by @mlange05 in #276
  • SCC: make vertical dimension optional by @awnawab in #270
  • SCCBaseTransformation.get_integer_variable now also checks module imports by @awnawab in #279
  • Improve performance of pragma-region attach/detach by using transformers by @mlange05 in #286
  • Reorganising test directories by @mlange05 in #287
  • [Bugfix] available_frontends: Import pytest locally to make dependency optional by @reuterbal in #290
  • DataflowAnalysis bugfix: preserve body nesting in visit_MaskedStatement by @awnawab in #288
  • Loki expression parser based on pymbolic parser by @MichaelSt98 in #272
  • F2C: optional case-sensitivity for variables/symbols by @MichaelSt98 in #277
  • Transformation to hoist temporaries in kernel language transpilation by @MichaelSt98 in #291
  • fix scoping for global var hoisting by @MichaelSt98 in #293
  • SCC: Support for bounds aliases and derived type members as bounds by @awnawab in #250
  • Consistent, environment-configurable use of Compiler class in JIT compilation by @reuterbal in #294
  • Derived-type inheritance by @awnawab in #295
  • Improve parse_expr and use in process_dimension_pragmas by @MichaelSt98 in #292

Full Changelog: v0.2.1...v0.2.2