Merge pull request #202 from miniMaddy/madhav_obj_detection
Adding chapter for object detection.
Showing 1 changed file with 47 additions and 0 deletions.
@@ -0,0 +1,47 @@
# Object Detection

In this chapter, we'll explore object detection, a core task in modern computer vision systems. We will explain the essential concepts, look at popular methods and applications, and cover evaluation metrics. By the end, you'll have a solid foundation and be ready to move on to more advanced topics.

![Image displaying the bounding boxes around multiple objects in the frame along with the confidence score of their classification](https://huggingface.co/datasets/hf-vision/course-assets/resolve/main/Object_Detection.png)

## Object Detection Overview

### Introduction

Object detection is the task of identifying and locating specific objects within digital images or video frames. It is used across diverse sectors, including self-driving cars, facial recognition systems, and medical diagnosis tools.

### Classification vs Localization

Classification assigns a class label based on visual content, while localization determines where an object is within an image, usually as a bounding box. Object detection combines both: it locates every object of interest and assigns each one a class label. Imagine recognizing different types of fruit and pinpointing their exact locations in a single image. That's object detection at play!
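
To make the distinction concrete, here is a minimal sketch of what the output of each task might look like. The `Detection` class, the `Box` alias, and all labels, scores, and coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (xmin, ymin, xmax, ymax) in pixels
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    label: str    # what the object is (classification)
    score: float  # confidence in the label
    box: Box      # where the object is (localization)


# Classification alone: one label, no position
classification_output: str = "apple"

# Localization alone: one box, no statement about what is inside it
localization_output: Box = (30.0, 40.0, 120.0, 140.0)

# Detection: a label and a box for every object found
detection_output: List[Detection] = [
    Detection("apple", 0.97, (30.0, 40.0, 120.0, 140.0)),
    Detection("banana", 0.91, (150.0, 60.0, 310.0, 130.0)),
]

for det in detection_output:
    print(f"{det.label} ({det.score:.0%}) at {det.box}")
```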

## Use Cases

Object detection supports analysis and automation in many industries. Representative examples include autonomous vehicles navigating roads, surveillance systems covering large public spaces, healthcare imaging systems helping detect diseases, manufacturing plants checking output consistency, and augmented reality applications enriching user experiences.

Here is an example of object detection using the `transformers` library:

```python
from transformers import pipeline
from PIL import Image

# Load a pretrained DETR model behind the object-detection pipeline
pipe = pipeline("object-detection", model="facebook/detr-resnet-50")

# Replace the placeholder path with an image on disk
image = Image.open("path/to/your/image.jpg").convert("RGB")

# Each result is a dict with a "score", a "label", and a "box"
bounding_boxes = pipe(image)
```
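
The pipeline returns a list of dictionaries, one per detected object, each with a `score`, a `label`, and a `box` holding `xmin`, `ymin`, `xmax`, and `ymax` pixel coordinates. As a minimal sketch of one thing you might do with that output, the snippet below keeps only the confident detections. The `filter_detections` helper, the 0.9 threshold, and the sample values are illustrative assumptions, not real model output.

```python
def filter_detections(detections, min_score=0.9):
    """Keep detections with score >= min_score, highest score first."""
    kept = [d for d in detections if d["score"] >= min_score]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hand-written sample in the same shape as the pipeline output (not real predictions)
sample_detections = [
    {"score": 0.95, "label": "couch", "box": {"xmin": 0, "ymin": 5, "xmax": 640, "ymax": 470}},
    {"score": 0.42, "label": "remote", "box": {"xmin": 330, "ymin": 75, "xmax": 370, "ymax": 190}},
    {"score": 0.98, "label": "cat", "box": {"xmin": 10, "ymin": 50, "xmax": 320, "ymax": 470}},
]

for det in filter_detections(sample_detections):
    box = det["box"]
    print(f'{det["label"]}: {det["score"]:.2f} at '
          f'({box["xmin"]}, {box["ymin"]}, {box["xmax"]}, {box["ymax"]})')
```

In practice you would pass `bounding_boxes` from the example above instead of `sample_detections`, and tune the threshold to your application.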

## How to Evaluate an Object Detection Model?

You have now seen how to use an object detection model, but how can you evaluate it? Object detection is primarily a supervised learning task: the dataset consists of images paired with annotated bounding boxes and class labels, which serve as the ground truth. A model's predictions are compared against these annotations using a few metrics. The most common ones are:

- **The Intersection over Union (IoU) or Jaccard index** measures the overlap between a predicted bounding box and a reference (ground-truth) box: the area of their intersection divided by the area of their union, expressed as a percentage from 0% to 100%. A higher IoU indicates better alignment, i.e., more accurate localization. It is useful when assessing tracker performance under changing conditions, e.g., following wild animals during migration.

- **Mean Average Precision (mAP)** summarizes detection quality using both precision (the fraction of predicted objects that are correct) and recall (the fraction of ground-truth objects that are found). A prediction counts as correct when its IoU with a ground-truth box of the same class meets a chosen threshold. Average precision is computed per class and then averaged over classes, and often over several IoU thresholds as well, which makes mAP a holistic measure for object detection algorithms. It is helpful for judging both localization and detection in challenging conditions, such as finding irregular surface defects that vary in size and shape in a manufactured part.
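
The IoU definition can be sketched in a few lines. This is a minimal example that assumes axis-aligned boxes given as `(xmin, ymin, xmax, ymax)` tuples; the `iou` function name and the sample boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle, if there is one
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (1, 1, 3, 3)
reference = (0, 0, 2, 2)
print(f"IoU = {iou(predicted, reference):.3f}")  # intersection 1, union 7 -> 0.143
```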
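
Average precision builds on IoU. The sketch below is a simplified version for a single class and a single IoU threshold: predictions are ranked by score, greedily matched to still-unmatched ground-truth boxes, and the area under the resulting precision-recall curve is summed using all-point interpolation. The function names and sample data are illustrative assumptions; benchmark implementations accumulate predictions over a whole dataset and differ in matching and interpolation details.

```python
def box_iou(a, b):
    """IoU of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truths, iou_threshold=0.5):
    """AP for one class. predictions: list of (score, box); ground_truths: list of boxes."""
    if not ground_truths:
        return 0.0

    matched = set()  # indices of ground-truth boxes already claimed
    hits = []        # 1 = true positive, 0 = false positive, in score order
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = box_iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each ranked prediction
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truths))

    # Precision envelope: make precision non-increasing as recall grows
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


ground_truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [
    (0.9, (0, 0, 10, 10)),     # exact match: true positive
    (0.8, (50, 50, 60, 60)),   # overlaps nothing: false positive
    (0.7, (20, 20, 30, 31)),   # IoU of about 0.91: true positive
]
print(f"AP@0.5 = {average_precision(predictions, ground_truths):.3f}")  # 0.833
```

mAP is then the mean of these per-class AP values, optionally averaged again over several IoU thresholds.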

## Conclusion and Future Work

Understanding object detection lays the groundwork for more advanced computer vision techniques and for building accurate solutions to demanding real-world problems. Future research directions include lightweight object detection models that are fast and easy to deploy, and object detection in 3D space, e.g., for augmented reality applications.

## References and Additional Resources

- [Hugging Face Object Detection Guide](https://huggingface.co/docs/transformers/tasks/object_detection)
- [Object Detection in 20 Years: A Survey](https://arxiv.org/abs/1905.05055)
- [Papers with Code - Real-Time Object Detection](https://paperswithcode.com/task/real-time-object-detection)
- [Papers with Code - Object Detection](https://paperswithcode.com/task/object-detection)