
Commit

Merge pull request #574 from microsoft/staging
Small edits to readmes (#573)
PatrickBue authored Jun 26, 2020
2 parents 109f42d + 9ada614 commit 16d2caf
Showing 2 changed files with 7 additions and 7 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -1,7 +1,7 @@
<img src="scenarios/media/logo_cvbp.png" align="right" alt="" width="300"/>

```diff
+ Update June 24: Added action recognition as new core scenario.
+ Object tracking coming soon (in 2-4 weeks).
```

@@ -37,7 +37,7 @@ Our target audience for this repository includes data scientists and machine lea
To get started, navigate to the [Setup Guide](SETUP.md), which lists
instructions on how to set up the compute environment and dependencies needed to run the
notebooks in this repo. Once your environment is set up, navigate to the
[Scenarios](scenarios) folder and start exploring the notebooks. We recommend starting with the *image classification* notebooks, since they introduce concepts which are also used by the other scenarios (e.g. pre-training on ImageNet).
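To illustrate what "pre-training on ImageNet" means in practice, here is a minimal sketch using plain torchvision. This is not this repo's API; the notebooks wrap similar steps in their own utilities:

```python
# Minimal sketch of transfer learning from an ImageNet-pretrained backbone.
# Shown only to illustrate the concept; the notebooks use their own helpers.
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)              # weights learned on ImageNet
model.fc = torch.nn.Linear(model.fc.in_features, 10)  # swap in a new 10-class head

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))       # dummy image batch
print(logits.shape)  # torch.Size([1, 10])
```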

Alternatively, we support Binder
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/PatrickBue/computervision-recipes/master?filepath=scenarios%2Fclassification%2F01_training_introduction_BINDER.ipynb)
10 changes: 5 additions & 5 deletions scenarios/tracking/README.md
@@ -1,23 +1,23 @@
# Multi-Object Tracking

```diff
+ June 2020: All notebooks/code in this directory is work-in-progress and might not fully execute.
```

This directory provides examples and best practices for building multi-object tracking systems. Our goal is to enable users to bring their own datasets and easily train a high-accuracy tracking model. While there are many open-source trackers available, we have implemented the [FairMOT tracker](https://github.com/ifzhang/FairMOT) specifically, as its algorithm has shown competitive tracking performance in recent MOT benchmarking challenges, at fast inference speed.

## Technology
Multi-object tracking (MOT) is one of the most active research topics in computer vision, due to its wide range of applications in autonomous driving, traffic surveillance, etc. It builds on object detection technology to detect and track all objects in a dynamic scene over time. Inferring target trajectories correctly across successive image frames remains challenging: occlusion happens when objects overlap, and the number and appearance of objects can change. Compared to object detection algorithms, which aim to output rectangular bounding boxes around the objects, MOT algorithms additionally associate an ID number with each box to identify that specific object across the image frames.
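For concreteness, the difference between a detection and a track can be pictured as follows. This is a toy sketch with hypothetical field names, not this repo's data structures:

```python
# A tracker adds a persistent ID to each detected box so the same object
# can be followed across frames (hypothetical structures for illustration).
from dataclasses import dataclass

@dataclass
class Detection:
    frame: int
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    score: float  # detector confidence

@dataclass
class Track(Detection):
    track_id: int  # stays constant for one object across frames

# The same person detected in two consecutive frames keeps track_id=7:
t1 = Track(frame=0, box=(10, 20, 60, 120), score=0.91, track_id=7)
t2 = Track(frame=1, box=(14, 21, 64, 122), score=0.89, track_id=7)
```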

As seen in the figure below ([Ciaparrone, 2019](https://arxiv.org/pdf/1907.12740.pdf)), a typical multi-object-tracking algorithm performs part or all of the following steps (a minimal code sketch follows the list):
* Detection: Given the input raw image frames (step 1), the detector identifies object(s) on each image frame as bounding box(es) (step 2).
* Feature extraction/motion prediction: For every detected object, visual appearance and motion features are extracted (step 3). Sometimes, a motion predictor (e.g. Kalman Filter) is also added to predict the next position of each tracked target.
* Affinity: The feature and motion predictions are used to calculate similarity/distance scores between pairs of detections and/or tracklets, or the probabilities of detections belonging to a given target or tracklet (step 4).
* Association: Based on these scores/probabilities, a specific numerical ID is assigned to each detected object as it is tracked across successive image frames (step 5).
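To make steps 4 and 5 concrete, below is a heavily simplified, self-contained association step that uses IoU as the affinity score and the Hungarian algorithm for assignment. It is an illustrative toy under those assumptions, not the FairMOT implementation used in this directory:

```python
# Toy association step: match previous-frame tracks to current detections
# using IoU affinity (step 4) and the Hungarian algorithm (step 5).
# Real trackers such as FairMOT also use appearance embeddings and motion
# prediction (e.g. a Kalman filter).
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(tracks, detections, iou_threshold=0.3):
    """Return {track_index: detection_index} for matched pairs."""
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)  # minimize total (1 - IoU)
    return {r: c for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_threshold}

tracks = [(10, 20, 60, 120), (200, 40, 260, 160)]      # boxes from frame t-1
detections = [(205, 42, 262, 163), (12, 22, 63, 121)]  # boxes from frame t
print(associate(tracks, detections))  # {0: 1, 1: 0}
```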

<p align="center">
<img src="./media/figure_motmodules2.jpg" width="700" align="center"/>
</p>


## State-of-the-art (SoTA)
Expand Down
