Example Morpheus pipeline using Triton Inference server and Morpheus.
Environment | Supported | Notes |
---|---|---|
Conda | ✔ | |
Morpheus Docker Container | ✔ | Requires launching Triton on the host |
Morpheus Release Container | ✔ | Requires launching Triton on the host |
Dev Container | ✔ | Requires using the dev-triton-start script. If using the run.py script this requires adding the --server_url=triton:8000 flag. If using the CLI example this requires replacing --server_url=localhost:8000 with --server_url=triton:8000 |
Pull the Morpheus Triton models Docker image from NGC.
Example:
docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:25.02
From the Morpheus repo root directory, run the following to launch Triton and load the log-parsing-onnx
model:
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:25.02 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model log-parsing-onnx
Once Triton server finishes starting up, it will display the status of all loaded models. Successful deployment of the model will show the following:
+------------------+---------+--------+
| Model | Version | Status |
+------------------+---------+--------+
| log-parsing-onnx | 1 | READY |
+------------------+---------+--------+
Note: If this is not present in the output, check the Triton log for any error messages related to loading the model.
Run the following from the root of the Morpheus repo to start the log parsing pipeline:
python examples/log_parsing/run.py \
--input_file=./examples/data/log-parsing-validation-data-input.csv \
--model_vocab_hash_file=data/bert-base-cased-hash.txt \
--model_vocab_file=data/bert-base-cased-vocab.txt \
--model_name log-parsing-onnx \
--model_config_file=./examples/data/log-parsing-config-20220418.json
Use --help
to display information about the command line options:
python run.py --help
Options:
--num_threads INTEGER RANGE Number of internal pipeline threads to use
[x>=1]
--pipeline_batch_size INTEGER RANGE
Internal batch size for the pipeline. Can be
much larger than the model batch size. Also
used for Kafka consumers [x>=1]
--model_max_batch_size INTEGER RANGE
Max batch size to use for the model [x>=1]
--input_file PATH Input filepath [required]
--output_file TEXT The path to the file where the inference
output will be saved.
--model_vocab_hash_file FILE Model vocab hash file to use for pre-
processing [required]
--model_vocab_file FILE Model vocab file to use for post-processing
[required]
--model_seq_length INTEGER RANGE
Sequence length to use for the model [x>=1]
--model_name TEXT The name of the model that is deployed on
Triton server [required]
--model_config_file TEXT Model config file [required]
--server_url TEXT Tritonserver url [required]
--help Show this message and exit.
The above example is illustrative of using the Python API to build a custom Morpheus pipeline. Alternately, the Morpheus command line could have been used to accomplish the same goal. To do this we must ensure the examples/log_parsing
directory is available in the PYTHONPATH
and each of the custom stages are registered as plugins.
From the root of the Morpheus repo, run:
PYTHONPATH="examples/log_parsing" \
morpheus --log_level INFO \
--plugin "inference" \
--plugin "postprocessing" \
run --pipeline_batch_size 1024 --model_max_batch_size 32 \
pipeline-nlp \
from-file --filename ./models/datasets/validation-data/log-parsing-validation-data-input.csv \
deserialize \
preprocess --vocab_hash_file data/bert-base-cased-hash.txt --stride 64 --column=raw \
monitor --description "Preprocessing rate" \
inf-logparsing --model_name log-parsing-onnx --server_url localhost:8001 --force_convert_inputs=True \
monitor --description "Inference rate" --unit inf \
log-postprocess --vocab_path ./models/training-tuning-scripts/sid-models/resources/bert-base-cased-vocab.txt \
--model_config_path=./models/log-parsing-models/log-parsing-config-20220418.json \
to-file --filename .tmp/output/log-parsing-cli-output.jsonlines --overwrite \
monitor --description "Postprocessing rate"