diff --git a/Cargo.toml b/Cargo.toml index 76cc2d6..db46199 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -39,9 +39,9 @@ fast_image_resize = { version = "4.2.1", features = ["image"]} natord = "1.0.9" video-rs = { version = "0.10.0", features = ["ndarray"], optional = true } minifb = { version = "0.27.0", optional = true } -argh = "0.1.13" [dev-dependencies] +argh = "0.1.13" env_logger = { version = "0.11.5" } tracing-subscriber = { version = "0.3.18" } tracing = { version = "0.1.40", features = ["log"] } diff --git a/README.md b/README.md index 2b07e7e..fb5a183 100644 --- a/README.md +++ b/README.md @@ -41,58 +41,58 @@ - **OCR Models**: [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947), [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159), [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html), [TrOCR](https://huggingface.co/microsoft/trocr-base-printed), [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
-More Supported Models - -| Model | Task / Description | Example | CPU | CoreML | CUDA
FP32 | CUDA
FP16 | TensorRT
FP32 | TensorRT
FP16 | -| -------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------- | --- | ------ | -------------- | -------------- | ------------------ | ------------------ | -| [BEiT](https://github.com/microsoft/unilm/tree/master/beit) | Image Classification | [demo](examples/beit) | ✅ | ✅ | | | | | -| [ConvNeXt](https://github.com/facebookresearch/ConvNeXt) | Image Classification | [demo](examples/convnext) | ✅ | ✅ | | | | | -| [FastViT](https://github.com/apple/ml-fastvit) | Image Classification | [demo](examples/fastvit) | ✅ | ✅ | | | | | -| [MobileOne](https://github.com/apple/ml-mobileone) | Image Classification | [demo](examples/mobileone) | ✅ | ✅ | | | | | -| [DeiT](https://github.com/facebookresearch/deit) | Image Classification | [demo](examples/deit) | ✅ | ✅ | | | | | -| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision Embedding | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv5](https://github.com/ultralytics/yolov5) | Image Classification
Object Detection
Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection
Instance Segmentation
Image Classification
Oriented Object Detection
Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLOv11](https://github.com/ultralytics/ultralytics) | Object Detection
Instance Segmentation
Image Classification
Oriented Object Detection
Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [RT-DETR](https://github.com/lyuwenyu/RT-DETR) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | | | | | -| [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/picodet) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | | | | | -| [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | | | | | -| [D-FINE](https://github.com/manhbd-22022602/D-FINE) | Object Detection | [demo](examples/d-fine) | ✅ | ✅ | | | | | -| [DEIM](https://github.com/ShihuaHuang95/DEIM) | Object Detection | [demo](examples/deim) | ✅ | ✅ | | | | | -| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | -| [SAM](https://github.com/facebookresearch/segment-anything) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | ✅ | | | -| [SAM2](https://github.com/facebookresearch/segment-anything-2) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | ✅ | | | -| [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | ✅ | | | -| [EdgeSAM](https://github.com/chongzhou96/EdgeSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | ✅ | | | -| [SAM-HQ](https://github.com/SysCV/sam-hq) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | ✅ | | | -| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Open-Set Detection With Language | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) | Open-Set Detection With Language | [demo](examples/grounding-dino) | ✅ | ✅ | ✅ | ✅ | | | -| [CLIP](https://github.com/openai/CLIP) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | -| [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | -| [BLIP](https://github.com/salesforce/BLIP) | Image Captioning | [demo](examples/blip) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | -| [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html) | Tabel Recognition | [demo](examples/slanet) | ✅ | ✅ | | | | | -| [TrOCR](https://huggingface.co/microsoft/trocr-base-printed) | Text Recognition | [demo](examples/trocr) | ✅ | ✅ | | | | | -| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [DepthAnything v1
DepthAnything v2](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | -| [DepthPro](https://github.com/apple/ml-depth-pro) | Monocular Depth Estimation | [demo](examples/depth-pro) | ✅ | ✅ | ✅ | ✅ | | | -| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| [Sapiens](https://github.com/facebookresearch/sapiens/tree/main) | Foundation for Human Vision Models | [demo](examples/sapiens) | ✅ | ✅ | ✅ | ✅ | | | -| [Florence2](https://arxiv.org/abs/2311.06242) | a Variety of Vision Tasks | [demo](examples/florence2) | ✅ | ✅ | ✅ | ✅ | | | +👉 More Supported Models + +| Model | Task / Description | Example | CoreML | CUDA
FP32 | CUDA
FP16 | TensorRT
FP32 | TensorRT
FP16 | +| -------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- | ---------------------------- | ------ | -------------- | -------------- | ------------------ | ------------------ | +| [BEiT](https://github.com/microsoft/unilm/tree/master/beit) | Image Classification | [demo](examples/beit) | ✅ | ✅ | ✅ | | | +| [ConvNeXt](https://github.com/facebookresearch/ConvNeXt) | Image Classification | [demo](examples/convnext) | ✅ | ✅ | ✅ | | | +| [FastViT](https://github.com/apple/ml-fastvit) | Image Classification | [demo](examples/fastvit) | ✅ | ✅ | ✅ | | | +| [MobileOne](https://github.com/apple/ml-mobileone) | Image Classification | [demo](examples/mobileone) | ✅ | ✅ | ✅ | | | +| [DeiT](https://github.com/facebookresearch/deit) | Image Classification | [demo](examples/deit) | ✅ | ✅ | ✅ | | | +| [DINOv2](https://github.com/facebookresearch/dinov2) | Vision Embedding | [demo](examples/dinov2) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv5](https://github.com/ultralytics/yolov5) | Image Classification
Object Detection
Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv6](https://github.com/meituan/YOLOv6) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv7](https://github.com/WongKinYiu/yolov7) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv8](https://github.com/ultralytics/ultralytics) | Object Detection
Instance Segmentation
Image Classification
Oriented Object Detection
Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv10](https://github.com/THU-MIG/yolov10) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv9](https://github.com/WongKinYiu/yolov9) | Object Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLOv11](https://github.com/ultralytics/ultralytics) | Object Detection
Instance Segmentation
Image Classification
Oriented Object Detection
Keypoint Detection | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [RT-DETR](https://github.com/lyuwenyu/RT-DETR) | Object Detection | [demo](examples/rtdetr) | ✅ | ✅ | ✅ | | | +| [PP-PicoDet](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/picodet) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | ✅ | | | +| [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) | Object Detection | [demo](examples/picodet-layout) | ✅ | ✅ | ✅ | | | +| [D-FINE](https://github.com/manhbd-22022602/D-FINE) | Object Detection | [demo](examples/d-fine) | ✅ | ✅ | ✅ | | | +| [DEIM](https://github.com/ShihuaHuang95/DEIM) | Object Detection | [demo](examples/deim) | ✅ | ✅ | ✅ | | | +| [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo) | Keypoint Detection | [demo](examples/rtmo) | ✅ | ✅ | ✅ | ❌ | ❌ | +| [SAM](https://github.com/facebookresearch/segment-anything) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | | | +| [SAM2](https://github.com/facebookresearch/segment-anything-2) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | | | +| [MobileSAM](https://github.com/ChaoningZhang/MobileSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | | | +| [EdgeSAM](https://github.com/chongzhou96/EdgeSAM) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | | | +| [SAM-HQ](https://github.com/SysCV/sam-hq) | Segment Anything | [demo](examples/sam) | ✅ | ✅ | ✅ | | | +| [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) | Instance Segmentation | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [YOLO-World](https://github.com/AILab-CVC/YOLO-World) | Open-Set Detection With Language | [demo](examples/yolo) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO) | Open-Set Detection With Language | [demo](examples/grounding-dino) | ✅ | ✅ | ✅ | | | +| [CLIP](https://github.com/openai/CLIP) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ❌ | ❌ | +| [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1) | Vision-Language Embedding | [demo](examples/clip) | ✅ | ✅ | ✅ | ❌ | ❌ | +| [BLIP](https://github.com/salesforce/BLIP) | Image Captioning | [demo](examples/blip) | ✅ | ✅ | ✅ | ❌ | ❌ | +| [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947) | Text Detection | [demo](examples/db) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159) | Text Recognition | [demo](examples/svtr) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html) | Table Recognition | [demo](examples/slanet) | ✅ | ✅ | ✅ | | | +| [TrOCR](https://huggingface.co/microsoft/trocr-base-printed) | Text Recognition | [demo](examples/trocr) | ✅ | ✅ | ✅ | | | +| [YOLOPv2](https://arxiv.org/abs/2208.11434) | Panoptic Driving Perception | [demo](examples/yolop) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [DepthAnything v1
DepthAnything v2](https://github.com/LiheYoung/Depth-Anything) | Monocular Depth Estimation | [demo](examples/depth-anything) | ✅ | ✅ | ✅ | ❌ | ❌ | +| [DepthPro](https://github.com/apple/ml-depth-pro) | Monocular Depth Estimation | [demo](examples/depth-pro) | ✅ | ✅ | ✅ | | | +| [MODNet](https://github.com/ZHKKKe/MODNet) | Image Matting | [demo](examples/modnet) | ✅ | ✅ | ✅ | ✅ | ✅ | +| [Sapiens](https://github.com/facebookresearch/sapiens/tree/main) | Foundation for Human Vision Models | [demo](examples/sapiens) | ✅ | ✅ | ✅ | | | +| [Florence2](https://arxiv.org/abs/2311.06242) | A Variety of Vision Tasks | [demo](examples/florence2) | ✅ | ✅ | ✅ | | |
## ⛳️ Cargo Features -By default, none of the following features are enabled. You can enable them as needed: +By default, **none of the following features are enabled**. You can enable them as needed: -- `auto`: Automatically downloads prebuilt ONNXRuntime binaries from Pyke’s CDN for supported platforms. +- **`auto`**: Automatically downloads prebuilt ONNXRuntime binaries from Pyke’s CDN for supported platforms. - If disabled, you'll need to [compile `ONNXRuntime` from source](https://github.com/microsoft/onnxruntime) or [download a precompiled package](https://github.com/microsoft/onnxruntime/releases), and then [link it manually](https://ort.pyke.io/setup/linking). @@ -106,18 +106,35 @@ By default, none of the following features are enabled. You can enable them as n ``` -- `ffmpeg`: Adds support for video streams, real-time frame visualization, and video export. +- **`ffmpeg`**: Adds support for video streams, real-time frame visualization, and video export. - Powered by [video-rs](https://github.com/oddity-ai/video-rs) and [minifb](https://github.com/emoon/rust_minifb). For any issues related to `ffmpeg` features, please refer to the issues of these two crates. -- `cuda`: Enables the NVIDIA TensorRT provider. -- `trt`: Enables the NVIDIA TensorRT provider. -- `mps`: Enables the Apple CoreML provider. +- **`cuda`**: Enables the NVIDIA CUDA provider. +- **`trt`**: Enables the NVIDIA TensorRT provider. +- **`mps`**: Enables the Apple CoreML provider. -## 🎈 Demo +## 🎈 Example -```Shell -cargo run -r -F cuda --example svtr -- --device cuda -``` +* **Using `CUDA`** + + ```Shell + cargo run -r -F cuda --example yolo -- --device cuda:0 + ``` +* **Using Apple `CoreML`** + + ```Shell + cargo run -r -F mps --example yolo -- --device mps + ``` +* **Using `TensorRT`** + + ```Shell + cargo run -r -F trt --example yolo -- --device trt + ``` +* **Using `CPU`** + + ```Shell + cargo run -r --example yolo + ``` All examples are located in the [examples](./examples/) directory. @@ -126,7 +143,7 @@ All examples are located in the [examples](./examples/) directory. Add `usls` as a dependency to your project's `Cargo.toml` ```Shell -cargo add usls +cargo add usls -F cuda ``` Or use a specific commit: diff --git a/examples/yolo/README.md b/examples/yolo/README.md index 15be924..5151aa6 100644 --- a/examples/yolo/README.md +++ b/examples/yolo/README.md @@ -1,34 +1,32 @@

YOLO-Series

+| Detection | Instance Segmentation | Pose | +| :----------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------: | +| `` | `` | `` | -| Detection | Instance Segmentation | Pose | -| :---------------: | :------------------------: |:---------------: | -| | | | - -| Classification | Obb | -| :------------------------: |:------------------------: | -| | - -| Head Detection | Fall Detection | Trash Detection | -| :------------------------: |:------------------------: |:------------------------: | -| || - -| YOLO-World | Face Parsing | FastSAM | -| :------------------------: |:------------------------: |:------------------------: | -| || - - + +| Classification | Obb | +| :----------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------: | +| `` | `` | + +| Head Detection | Fall Detection | Trash Detection | +| :-----------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------: | +| `` | `` | `` | + +| YOLO-World | Face Parsing | FastSAM | +| :-------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------: | +| `` | `` | `` | ## Quick Start + ```Shell -# customized -cargo run -r --example yolo -- --task detect --ver v8 --nc 6 --model xxx.onnx # YOLOv8 +# Your customized YOLOv8 model +cargo run -r --example yolo -- --task detect --ver 8 --num-classes 6 --model xxx.onnx # YOLOv8 # Classify - cargo run -r --example yolo -- --task classify --ver 5 --scale s --image-width 224 --image-height 224 --num-classes 1000 --use-imagenet-1k-classes # YOLOv5 cargo run -r --example yolo -- --task classify --ver 8 --scale n --image-width 224 --image-height 224 # YOLOv8 cargo run -r --example yolo -- --task classify --ver 11 --scale n --image-width 224 --image-height 224 # YOLOv11 @@ -59,79 +57,12 @@ cargo run -r --example yolo -- --ver 11 --task obb --scale n --image-width 1024 **`cargo run -r --example yolo -- --help` for more options** - ## Other YOLOv8 Solution Models -| Model | Weights | Datasets| -|:---------------------: | :--------------------------: | :-------------------------------: | -| Face-Landmark Detection | [yolov8-face-dyn-f16](https://github.com/jamjamjon/assets/releases/download/yolo/v8-n-face-dyn-f16.onnx) | | -| Head Detection | [yolov8-head-f16](https://github.com/jamjamjon/assets/releases/download/yolo/v8-head-f16.onnx) | | -| Fall Detection | [yolov8-falldown-f16](https://github.com/jamjamjon/assets/releases/download/yolo/v8-falldown-f16.onnx) | | -| Trash Detection | [yolov8-plastic-bag-f16](https://github.com/jamjamjon/assets/releases/download/yolo/v8-plastic-bag-f16.onnx) | | -| FaceParsing | 
[yolov8-face-parsing-dyn](https://github.com/jamjamjon/assets/releases/download/yolo/v8-face-parsing-dyn.onnx) | [CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ/tree/master/face_parsing)
[[Processed YOLO labels]](https://github.com/jamjamjon/assets/releases/download/yolo/CelebAMask-HQ-YOLO-Labels.zip)[[Python Script]](../../scripts/CelebAMask-HQ-To-YOLO-Labels.py) | - - - - -## Export ONNX Models - - -
-YOLOv5 - -[Here](https://docs.ultralytics.com/yolov5/tutorials/model_export/) - -
- - -
-YOLOv6 - -[Here](https://github.com/meituan/YOLOv6/tree/main/deploy/ONNX) - -
- - -
-YOLOv7 - -[Here](https://github.com/WongKinYiu/yolov7?tab=readme-ov-file#export) - -
- -
-YOLOv8, YOLOv11 - -```Shell -pip install -U ultralytics - -# export onnx model with dynamic shapes -yolo export model=yolov8m.pt format=onnx simplify dynamic -yolo export model=yolov8m-cls.pt format=onnx simplify dynamic -yolo export model=yolov8m-pose.pt format=onnx simplify dynamic -yolo export model=yolov8m-seg.pt format=onnx simplify dynamic -yolo export model=yolov8m-obb.pt format=onnx simplify dynamic - -# export onnx model with fixed shapes -yolo export model=yolov8m.pt format=onnx simplify -yolo export model=yolov8m-cls.pt format=onnx simplify -yolo export model=yolov8m-pose.pt format=onnx simplify -yolo export model=yolov8m-seg.pt format=onnx simplify -yolo export model=yolov8m-obb.pt format=onnx simplify -``` -
- - -
-YOLOv9 - -[Here](https://github.com/WongKinYiu/yolov9/blob/main/export.py) - -
- -
-YOLOv10 - -[Here](https://github.com/THU-MIG/yolov10#export) - -
+| Model | Weights | +| :---------------------: | :------------------------------------------------------: | +| Face-Landmark Detection | [yolov8-n-face](https://github.com/jamjamjon/assets/releases/download/yolo/v8-n-face-fp16.onnx) | +| Head Detection | [yolov8-head](https://github.com/jamjamjon/assets/releases/download/yolo/v8-head-fp16.onnx) | +| Fall Detection | [yolov8-falldown](https://github.com/jamjamjon/assets/releases/download/yolo/v8-falldown-fp16.onnx) | +| Trash Detection | [yolov8-plastic-bag](https://github.com/jamjamjon/assets/releases/download/yolo/v8-plastic-bag-fp16.onnx) | +| Face Parsing | [yolov8-face-parsing-seg](https://github.com/jamjamjon/assets/releases/download/yolo/v8-face-parsing.onnx) | diff --git a/examples/yolo/main.rs b/examples/yolo/main.rs index cbee159..512904d 100644 --- a/examples/yolo/main.rs +++ b/examples/yolo/main.rs @@ -122,7 +122,7 @@ struct Args { fn main() -> Result<()> { tracing_subscriber::fmt() - .with_max_level(tracing::Level::ERROR) + .with_max_level(tracing::Level::INFO) .init(); let args: Args = argh::from_env(); diff --git a/src/lib.rs b/src/lib.rs index b3689c9..c3a657e 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -1,10 +1,43 @@ -//! **usls** is a Rust library integrated with **ONNXRuntime** that provides a collection of state-of-the-art models for **Computer Vision** and **Vision-Language** tasks, including: +//! **usls** is a Rust library integrated with **ONNXRuntime**, offering a suite of advanced models for **Computer Vision** and **Vision-Language** tasks, including: //! -//! - **YOLO Models**: [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv6](https://github.com/meituan/YOLOv6), [YOLOv7](https://github.com/WongKinYiu/yolov7), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10) +//! - **YOLO Models**: [YOLOv5](https://github.com/ultralytics/yolov5), [YOLOv6](https://github.com/meituan/YOLOv6), [YOLOv7](https://github.com/WongKinYiu/yolov7), [YOLOv8](https://github.com/ultralytics/ultralytics), [YOLOv9](https://github.com/WongKinYiu/yolov9), [YOLOv10](https://github.com/THU-MIG/yolov10), [YOLO11](https://github.com/ultralytics/ultralytics) //! - **SAM Models**: [SAM](https://github.com/facebookresearch/segment-anything), [SAM2](https://github.com/facebookresearch/segment-anything-2), [MobileSAM](https://github.com/ChaoningZhang/MobileSAM), [EdgeSAM](https://github.com/chongzhou96/EdgeSAM), [SAM-HQ](https://github.com/SysCV/sam-hq), [FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) -//! - **Vision Models**: [RTDETR](https://arxiv.org/abs/2304.08069), [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo), [DB](https://arxiv.org/abs/1911.08947), [SVTR](https://arxiv.org/abs/2205.00159), [Depth-Anything-v1-v2](https://github.com/LiheYoung/Depth-Anything), [DINOv2](https://github.com/facebookresearch/dinov2), [MODNet](https://github.com/ZHKKKe/MODNet), [Sapiens](https://arxiv.org/abs/2408.12569) -//! - **Vision-Language Models**: [CLIP](https://github.com/openai/CLIP), [BLIP](https://arxiv.org/abs/2201.12086), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [Florence2](https://arxiv.org/abs/2311.06242) +//! 
- **Vision Models**: [RT-DETR](https://arxiv.org/abs/2304.08069), [RTMO](https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo), [Depth-Anything](https://github.com/LiheYoung/Depth-Anything), [DINOv2](https://github.com/facebookresearch/dinov2), [MODNet](https://github.com/ZHKKKe/MODNet), [Sapiens](https://arxiv.org/abs/2408.12569), [DepthPro](https://github.com/apple/ml-depth-pro), [FastViT](https://github.com/apple/ml-fastvit), [BEiT](https://github.com/microsoft/unilm/tree/master/beit), [MobileOne](https://github.com/apple/ml-mobileone) +//! - **Vision-Language Models**: [CLIP](https://github.com/openai/CLIP), [jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1), [BLIP](https://arxiv.org/abs/2201.12086), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), [YOLO-World](https://github.com/AILab-CVC/YOLO-World), [Florence2](https://arxiv.org/abs/2311.06242) +//! - **OCR Models**: [DB(PaddleOCR-Det)](https://arxiv.org/abs/1911.08947), [SVTR(PaddleOCR-Rec)](https://arxiv.org/abs/2205.00159), [SLANet](https://paddlepaddle.github.io/PaddleOCR/latest/algorithm/table_recognition/algorithm_table_slanet.html), [TrOCR](https://huggingface.co/microsoft/trocr-base-printed), [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) //! +//! ## ⛳️ Cargo Features +//! +//! By default, **none of the following features are enabled**. You can enable them as needed: +//! +//! - **`auto`**: Automatically downloads prebuilt ONNXRuntime binaries from Pyke’s CDN for supported platforms. +//! +//! - If disabled, you'll need to [compile `ONNXRuntime` from source](https://github.com/microsoft/onnxruntime) or [download a precompiled package](https://github.com/microsoft/onnxruntime/releases), and then [link it manually](https://ort.pyke.io/setup/linking). +//! +//!
+//! 👉 For Linux or macOS Users +//! +//! - Download from the [Releases page](https://github.com/microsoft/onnxruntime/releases). +//! - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable: +//! ```shell +//! export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.20.1 +//! ``` +//! +//!
+//! - **`ffmpeg`**: Adds support for video streams, real-time frame visualization, and video export. +//! +//! - Powered by [video-rs](https://github.com/oddity-ai/video-rs) and [minifb](https://github.com/emoon/rust_minifb). For any issues related to `ffmpeg` features, please refer to the issues of these two crates. +//! - **`cuda`**: Enables the NVIDIA CUDA provider. +//! - **`trt`**: Enables the NVIDIA TensorRT provider. +//! - **`mps`**: Enables the Apple CoreML provider. +//! +//! ## 🎈 Example +//! +//! ```Shell +//! cargo run -r -F cuda --example svtr -- --device cuda +//! ``` +//! +//! All examples are located in the [examples](https://github.com/jamjamjon/usls/tree/main/examples) directory. mod misc; pub mod models;
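
Since this change moves `argh` into `[dev-dependencies]` and renames `--nc` to `--num-classes`, here is a minimal sketch of how the example's CLI flags could be declared with `argh`. The field names mirror the flags shown in the Quick Start commands above; the types, defaults, and the reduced set of fields are illustrative assumptions, not the actual `Args` struct in `examples/yolo/main.rs`:

```Rust
use argh::FromArgs;

/// YOLO example options (illustrative sketch; the real struct defines more flags).
#[derive(FromArgs, Debug)]
struct Args {
    /// execution device, e.g. "cpu", "cuda:0", "mps", "trt" (assumed values)
    #[argh(option, default = "String::from(\"cpu\")")]
    device: String,

    /// task to run: detect | classify | segment | pose | obb
    #[argh(option, default = "String::from(\"detect\")")]
    task: String,

    /// YOLO version, e.g. 5, 8, 11 (type assumed)
    #[argh(option, default = "8.0")]
    ver: f32,

    /// number of classes for a customized model
    #[argh(option)]
    num_classes: Option<usize>,

    /// path to a customized ONNX model
    #[argh(option)]
    model: Option<String>,
}

fn main() {
    // argh derives the kebab-case flag `--num-classes` from the
    // snake_case field name `num_classes`, matching the renamed CLI flag.
    let args: Args = argh::from_env();
    println!("{args:?}");
}
```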