AI runtime configuration scheme #373
Replies: 4 comments 3 replies
In terms of resolution priority, we should do (from higher priority to lower):
To streamline this config management part, let's try to leverage Pydantic Settings as much as possible, as it already covers many of the pain points.
After taking into account the proposed ideas, here is an updated implementation plan to support AI runtime configuration:
Changes for
Changes for
This was implemented with docling v2.12, closing.
Objective
Introduce logic that resolves the input parameters and decides the AI runtime configuration based on:
Proposal
A. Introduce input parameters in the `docling` package:
- […] `PipelineOptions` to introduce the fields: `device`, `num_threads`.
- The `device` parameter can be an `Enum` with values: `[AUTO, CUDA, CPU, MPS]`. The default value is `AUTO`.
- The `num_threads` parameter is an integer. The default value is 4.
- […] `[DOCLING_DEVICE, DOCLING_NUM_THREADS]` or `[DOCLING_DEVICE, OMP_NUM_THREADS]`.
- […] `device` and `num_threads`.

B. Introduce configuration resolution logic as a utility function in the `docling-ibm-models` package:
- […] `[CUDA, MPS, CPU]`.
  - 5a. If `device == AUTO`, use the first available device according to the precedence.
  - 5b. If `device` is explicitly set by the user but is not available in the system, replace it with the next available device.

C. Usage:
- […] `pipeline_options`.
- […] `docling-ibm-models` is used to resolve the runtime configuration.
- […] `docling-ibm-models` package or by the "models" inside the `docling` package.
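
The device resolution rules in part B could be sketched roughly as follows. This is only an illustration, not the code that shipped: the names `Device`, `PRECEDENCE` and `resolve_device` are hypothetical, the lowercase enum values are an assumption, and the set of available devices is passed in explicitly rather than probed from the system. "Next available device" is read here as the next available entry after the requested one in the precedence list, with `CPU` assumed as the final fallback.

```python
from enum import Enum
from typing import Iterable


class Device(str, Enum):
    # Values listed in the proposal; the lowercase strings are an assumption.
    AUTO = "auto"
    CUDA = "cuda"
    CPU = "cpu"
    MPS = "mps"


# Device precedence from the proposal: [CUDA, MPS, CPU].
PRECEDENCE = [Device.CUDA, Device.MPS, Device.CPU]


def resolve_device(requested: Device, available: Iterable[Device]) -> Device:
    """Pick the device to run on (hypothetical helper).

    AUTO: the first available device in precedence order (rule 5a).
    Explicit device: use it if available, otherwise fall through to the
    next available device after it in the precedence list (rule 5b,
    under the interpretation stated above).
    """
    available = set(available)
    if requested is Device.AUTO:
        candidates = PRECEDENCE
    else:
        # Start at the requested device and walk down the precedence list.
        candidates = PRECEDENCE[PRECEDENCE.index(requested):]
    for device in candidates:
        if device in available:
            return device
    # Assumption: CPU is always usable as a last resort.
    return Device.CPU
```

For example, `resolve_device(Device.AUTO, {Device.MPS, Device.CPU})` selects `Device.MPS`, and requesting `Device.CUDA` on a machine where only the CPU is available resolves to `Device.CPU`. Passing the available devices in as an argument keeps the rule easy to test. A real implementation would presumably query the ML framework for them.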