ML Model Extension in FAIRiCUBE #21

cozzolinoac11 · 2024-12-10T17:22:07Z

In the FAIRiCUBE project, we are using the ML Model Extension to describe the metadata of analysis and processing (a/p) resources, in particular those concerning machine learning and deep learning.
Through STAC properties, we have added fields to better describe the resource. We want to share our work with the community to get feedback, as well as for interoperability purposes (in case someone has similar documentation needs).

In particular:

Platform - Platform hosting the resource. A combination of values is possible (e.g., EOX and AWS).
Framework - Reusable code written by others that the resource relies on. This covers both frameworks (program scaffolds that supply the blueprint of a product) and libraries (collections of predefined methods and classes). Note that the same processing can be done using multiple libraries.
Algorithm - Name of the algorithm.
Model configuration - Configuration/initialisation data, i.e. how the model has been parameterized.
Performance - Result description and explanation, including a detailed description of the hyperparameters used, the run times, the metrics used for evaluation, and the respective scores and performance.
UseConstraints - Possible constraints related to the use of the resource (e.g., the resource works only for certain input data, or the resource needs a specific way of providing computational power).
Validation - Link to a validation report
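
To make the list above concrete, here is a minimal sketch of how these fields could sit in the properties of a STAC Item. The JSON key names, the value formats (e.g. a list for Platform) and all values are assumptions for illustration only; they are not necessarily the exact keys used in the FAIRiCUBE Catalog.

```python
import json

# Hypothetical keys and placeholder values: the list above names the fields,
# not their JSON spelling or their value types.
fairicube_properties = {
    "datetime": "2024-12-10T00:00:00Z",  # required by STAC; placeholder value
    "platform": ["EOX", "AWS"],          # a combination of values is allowed
    "framework": ["scikit-learn"],
    "algorithm": "Random forest",
    "model_configuration": {"n_estimators": 100, "max_depth": 10},
    "performance": "Placeholder: hyperparameters, run times, metrics and scores.",
    "use_constraints": "Placeholder: e.g. works only for certain input data.",
    "validation": "https://example.com/validation-report",
}


def build_item(item_id: str, properties: dict) -> dict:
    """Wrap the given properties in a minimal STAC Item skeleton."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": None,  # no spatial footprint in this sketch
        "properties": properties,
        "links": [],
        "assets": {},
    }


item = build_item("example-model", fairicube_properties)
print(json.dumps(item, indent=2))
```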

The following fields are implemented as assets and asset properties:

Input data used - Link to data (or related metadata) to which the a/p resource has been applied. This information is required for a better understanding of the context and domain of the a/p resource.
Characteristics of input data - Textual description of the main characteristics of each input dataset of the resource.
Biases and ethical aspects - This field may contain observations on the data and/or any biases found (e.g., class imbalances).
Output data obtained - Link to output data (or related metadata) produced by the execution of the a/p resource. This information is required for a better understanding of the a/p resource.
Characteristics of output data - Textual description of the output data from the resource.
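
As a rough sketch of the asset-based fields above, input and output data could be expressed as STAC assets that carry the extra descriptions as asset properties. The asset keys, the property name `biases_and_ethical_aspects`, the helper `make_asset` and the example.com URLs are hypothetical, chosen only for illustration.

```python
def make_asset(href: str, title: str, description: str, **extra) -> dict:
    """Build a STAC-style asset dict; extra keyword arguments become asset properties."""
    asset = {"href": href, "title": title, "description": description}
    asset.update(extra)
    return asset


# Hypothetical asset keys and placeholder values.
assets = {
    "input_data": make_asset(
        href="https://example.com/input-data-metadata",
        title="Input data used",
        description="Placeholder: main characteristics of the input data.",
        biases_and_ethical_aspects="Placeholder: e.g. class imbalances.",
    ),
    "output_data": make_asset(
        href="https://example.com/output-data-metadata",
        title="Output data obtained",
        description="Placeholder: characteristics of the output data.",
    ),
}

# Every asset needs a link to the data or to its metadata.
missing_href = [key for key, asset in assets.items() if not asset.get("href")]
print(missing_href)
```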

A complete metadata example is available in the FAIRiCUBE Catalog, for instance: https://catalog.eoxhub.fairicube.eu/collections/ML%20collection/items/8BLIAOAZJS

fmigneault · 2024-12-10T18:17:22Z

@cozzolinoac11

I recommend that the FAIRiCUBE community consider using https://github.com/stac-extensions/mlm instead.

It is intended as the updated and extended definition of ml-model, with many more properties to describe the model inputs/outputs, the framework/platform/runtime constraints, model hyperparameters, and related data sources. Most of what I am seeing in your suggestions seems to be covered by mlm, which was developed after many users raised similar concerns that ml-model lacked those details. A separate mlm extension was deemed necessary (rather than updating ml-model) to incorporate important refactors needed to address more recent ML/AI concerns.
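
As a rough, unverified illustration of how such a comparison could be started, the sketch below pairs some of the field names from the first post with candidate mlm fields. The pairings are assumptions, not an official mapping, and fields left unmapped here are simply ones for which no guess was made; everything should be checked against the mlm specification.

```python
# Candidate correspondences only; the pairing itself is a guess to be verified.
CANDIDATE_MLM_FIELDS = {
    "Framework": "mlm:framework",
    "Algorithm": "mlm:architecture",
    "Model configuration": "mlm:hyperparameters",
    "Characteristics of input data": "mlm:input",
    "Characteristics of output data": "mlm:output",
}

# Field names as listed in the first post.
FAIRICUBE_FIELDS = [
    "Platform",
    "Framework",
    "Algorithm",
    "Model configuration",
    "Performance",
    "UseConstraints",
    "Validation",
    "Input data used",
    "Characteristics of input data",
    "Biases and ethical aspects",
    "Output data obtained",
    "Characteristics of output data",
]


def split_fields(fields, candidates):
    """Separate fields that have a candidate mlm counterpart from those still to review."""
    mapped = {name: candidates[name] for name in fields if name in candidates}
    to_review = [name for name in fields if name not in candidates]
    return mapped, to_review


mapped, to_review = split_fields(FAIRICUBE_FIELDS, CANDIDATE_MLM_FIELDS)
for name, mlm_field in mapped.items():
    print(f"{name} -> {mlm_field}")
print("Still to review:", ", ".join(to_review))
```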

Warning

The ml-model extension is not expected to receive further updates, whereas mlm is in active development with many participants. Let us know if mlm works for your use case; if anything still seems to be missing, we can expedite its definition.

Note

Full disclosure:
I am a maintainer of both STAC extensions (and others). I joined the ml-model maintainers due to the lack of response from the original maintainers, while trying to revive the project. Instead, discussions and efforts with the community led to its replacement by mlm. For more detail: https://github.com/orgs/stac-utils/discussions/4
