Workflow-Api for Workflow-Assets (Code2Data, Code2Compute) #3219
reisman234 started this conversation in Show and tell
Through the Gaia-X4KI project, I have been working for some time on making compute resources available in a data space. A data space participant can thus provide its own (private) infrastructure for use cases; in this way, a kind of Code2Compute or Code2Data is realised in the data space. In the first case, "Code2Compute", only the computing infrastructure is used to execute an application; the infrastructure may contain special hardware needed for the execution, for example to run a simulation application. In the second case, "Code2Data", the participant offers, in addition to the infrastructure, data that cannot easily be shared directly because it is too large or is not allowed to leave the company.
Based on the experience gained with the EDC Connector and the discussions held to further refine an initial concept, the following describes a first result of this solution.
The EDC connector serves as a gateway to the data space. A so-called WorkflowAsset is registered on the provider side. This special asset is configured with a provisioner and therefore runs an additional provision process when a transfer request is made. The WorkflowProvisioner, which implements an HTTP provisioner from EDC, deploys a WorkflowApi specifically for this consumer based on the request and registers the requested asset with it. The required information, such as the Consumer_ID (which links the consumer to a WorkflowApi) and the Asset_ID, can be obtained from the provider connector for the current provision process.
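To make the provisioning step more concrete, here is a minimal sketch of what the provider-side WorkflowProvisioner webhook could look like. The endpoint path, payload fields, and callback contract shown here are assumptions for illustration; the actual shapes are defined by EDC's HTTP provisioner extension and by this project.

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/provision", methods=["POST"])
def provision():
    """Called by the provider connector's HTTP provisioner for a WorkflowAsset."""
    req = request.get_json()
    asset_id = req["assetId"]               # the requested workflow asset (assumed field name)
    consumer_id = req["transferProcessId"]  # links the consumer to its WorkflowApi (assumed)

    # Deploy a WorkflowApi dedicated to this consumer and register
    # the requested asset with it (deployment details omitted).
    api_endpoint = deploy_workflow_api(consumer_id, asset_id)

    # Report the provisioned resource back to the connector so the transfer
    # can complete. The exact callback payload is defined by EDC and will
    # differ from this simplified version.
    requests.post(req["callbackAddress"], json={
        "assetId": asset_id,
        "resourceDefinitionId": req.get("resourceDefinitionId"),
        "contentDataAddress": {"type": "HttpData", "baseUrl": api_endpoint},
    })
    return jsonify({"status": "provisioning started"}), 200


def deploy_workflow_api(consumer_id: str, asset_id: str) -> str:
    # Placeholder: the real implementation would deploy a WorkflowApi
    # instance (e.g. in Kubernetes) scoped to this consumer.
    return f"https://provider.example.com/workflow-api/{consumer_id}"
```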
After provisioning and once the transfer completes, the consumer receives the information required to reach their WorkflowApi.
With the provisioned WorkflowApi, the consumer can interact with registered workflow assets via an API. A workflow asset defines the application to be executed (container image) and the hardware resources it requires. It also defines, in the form of input and output resources, which input data the application expects and which results it generates.
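As an illustration, a workflow asset along the lines described above could be modelled like this. Every field name here is a hypothetical assumption for the sketch, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceSpec:
    name: str  # logical name of the resource, e.g. "mesh" or "results"
    path: str  # mount path inside the worker container


@dataclass
class WorkflowAsset:
    asset_id: str
    image: str         # container image of the application to execute
    cpu: str = "1"     # required hardware resources
    memory: str = "2Gi"
    gpu: int = 0       # e.g. for a simulation workload needing special hardware
    inputs: List[ResourceSpec] = field(default_factory=list)   # expected input data
    outputs: List[ResourceSpec] = field(default_factory=list)  # results the app generates


# Example: a GPU-backed simulation workflow.
simulation = WorkflowAsset(
    asset_id="sim-workflow-1",
    image="registry.example.com/simulations/solver:1.0",
    gpu=1,
    inputs=[ResourceSpec("mesh", "/data/in")],
    outputs=[ResourceSpec("results", "/data/out")],
)
```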
Via the API, a consumer can upload data for a workflow asset, execute the actual application, view its status and logs if necessary, and terminate it again. When the workflow is executed, the corresponding workflow backend (currently only Kubernetes) performs a deployment in the cluster: the Kubernetes workflow backend prepares the input resources based on the workflow asset and creates the pod manifest (worker image, hardware requirements). If the application terminates without errors, the result data is staged in the user's storage, from where the user can download it.
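A hedged sketch of how a consumer might drive a provisioned WorkflowApi, following the steps just described. The endpoint paths, parameters, and status values are assumptions for illustration; the actual API may differ.

```python
import time

import requests

BASE = "https://provider.example.com/workflow-api/<consumer-id>"  # from the finished transfer
ASSET = "sim-workflow-1"

# 1. Upload input data for the workflow asset.
with open("mesh.dat", "rb") as f:
    requests.put(f"{BASE}/assets/{ASSET}/inputs/mesh", data=f).raise_for_status()

# 2. Start the workflow; the backend (currently Kubernetes) deploys it in the cluster.
requests.post(f"{BASE}/assets/{ASSET}/execute").raise_for_status()

# 3. Poll the status and, if necessary, inspect the logs.
while True:
    status = requests.get(f"{BASE}/assets/{ASSET}/status").json()["phase"]
    if status in ("Succeeded", "Failed"):
        break
    time.sleep(10)
print(requests.get(f"{BASE}/assets/{ASSET}/logs").text)

# 4. Download the results staged by the backend after a clean exit.
if status == "Succeeded":
    with open("results.zip", "wb") as f:
        f.write(requests.get(f"{BASE}/assets/{ASSET}/outputs/results").content)
```

On the provider side, the Kubernetes backend could turn a workflow asset into a pod manifest roughly as follows, reusing the hypothetical `WorkflowAsset` sketch from above. The manifest fields come from the Kubernetes API; the surrounding backend logic is assumed.

```python
def build_pod_manifest(asset: "WorkflowAsset") -> dict:
    # Translate the asset's hardware requirements into K8s resource limits.
    resources = {"cpu": asset.cpu, "memory": asset.memory}
    if asset.gpu:
        resources["nvidia.com/gpu"] = str(asset.gpu)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"workflow-{asset.asset_id}"},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "worker",
                "image": asset.image,  # the worker image from the workflow asset
                "resources": {"limits": resources},
                # Input and output resources would be mounted as volumes here.
            }],
        },
    }
```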
I will provide a video demonstrating the whole process as soon as possible.