As part of the Deep Learning for Autonomous Vehicles course, the class as a whole builds the visualization interface of an autonomous vehicle. The project is broken down into sub-sections addressing the different topics. This sub-project focuses on 3D object detection using a monocular camera: the goal is to recognize agent categories such as cars, pedestrians and cyclists from a single RGB camera and to identify their size and location in space with 3D bounding boxes.
The model has been trained and tested on the KITTI dataset. To run the code, the left color images, the camera calibration matrices and the training labels are needed (~12 GB), organized as follows:
```
KITTI/
├── ImageSets/
│   ├── test.txt
│   ├── train.txt
│   └── val.txt
└── data/
    ├── calib/
    ├── image_2/
    └── label_2/
```
The `ImageSets` folder contains the index files defining which samples of the `data` folder belong to each split. These files are provided.
The KITTI dataset is shuffled from the start. Since we have no access to the official test set, we un-shuffled the images to recombine them into their ordered sequences, split the sequences into training, validation and test sets, and then shuffled them again. This way, no images from the same video appear in both the training and the test set.
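A minimal sketch of this sequence-aware split, assuming a hypothetical mapping `frames_by_sequence` from drive sequence to frame indices (the ratios and helper names are illustrative, not the exact code used to produce the provided `ImageSets` files):

```python
import random

def split_by_sequence(frames_by_sequence, ratios=(0.7, 0.15, 0.15), seed=0):
    """Assign whole drive sequences to train/val/test so no video is shared across splits."""
    rng = random.Random(seed)
    sequences = list(frames_by_sequence)
    rng.shuffle(sequences)

    n_train = int(ratios[0] * len(sequences))
    n_val = int(ratios[1] * len(sequences))
    groups = {
        "train": sequences[:n_train],
        "val": sequences[n_train:n_train + n_val],
        "test": sequences[n_train + n_val:],
    }

    splits = {}
    for name, seqs in groups.items():
        frames = [frame for seq in seqs for frame in frames_by_sequence[seq]]
        rng.shuffle(frames)  # re-shuffle the frames inside each split
        splits[name] = frames
    return splits
```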
The required libraries can be installed with:

```bash
yes | pip install -r requirements.txt
```
To be able to run the code, the paths need to be set in the configuration file `SMGUP_net/experiment/config.yaml`: the path to the KITTI folder has to be set (the dataset is not included), and you can choose where to store the logs and the outputs. This configuration file controls all the main parameters of the network.
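For reference, the configuration can be inspected from Python with PyYAML; the key names below are assumptions for illustration and may differ from the actual file:

```python
import yaml  # PyYAML

with open("SMGUP_net/experiment/config.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical key names -- adapt them to the actual config structure.
print(cfg["dataset"]["root_dir"])          # should point to your KITTI/ folder
print(cfg["log_dir"], cfg["output_dir"])   # where logs and outputs are stored
```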
By default, simply execute the code in training mode with:

```bash
python main.py
```
If the configuration file is stored at a different location, execute:

```bash
python main.py --config <your_path_relative_to_main.py>
```
To run the code in evaluation/inference mode, run:

```bash
python main.py -e
```
We provide the weights of the trained model on Google Drive.
We implemented two main contributions. The first one leverages KITTI's classification of objects into three difficulty categories (easy, moderate and hard), depending on the occlusion level, the truncation and the 2D box height in pixels. Based on this, we apply curriculum learning during training: first only the easy objects are used, then only the moderate ones, then only the hard ones, and finally all of them for the remaining epochs. Unfortunately we did not obtain better results for every category: cyclists and pedestrians improve, but cars do not (see the report for more details on the implementation).
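A minimal sketch of the idea, using the official KITTI difficulty criteria; the four equal-length phases are an illustrative assumption, not the exact schedule used in the code:

```python
def kitti_difficulty(box_height_px, occlusion, truncation):
    """Official KITTI difficulty from 2D box height (pixels), occlusion level (0-3) and truncation (0-1)."""
    if box_height_px >= 40 and occlusion <= 0 and truncation <= 0.15:
        return "easy"
    if box_height_px >= 25 and occlusion <= 1 and truncation <= 0.30:
        return "moderate"
    if box_height_px >= 25 and occlusion <= 2 and truncation <= 0.50:
        return "hard"
    return "ignored"  # objects outside the official difficulty bins


def allowed_difficulties(epoch, total_epochs):
    """Curriculum: easy -> moderate -> hard -> all, over four phases."""
    phase = min(3, 4 * epoch // total_epochs)
    return ({"easy"}, {"moderate"}, {"hard"}, {"easy", "moderate", "hard"})[phase]

# During training, objects whose difficulty is not in
# allowed_difficulties(epoch, total_epochs) can simply be masked out of the loss for that epoch.
```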
The second contribution adds the detection of all the other KITTI labels, whereas the initial code detects only cyclists, pedestrians and cars. The results of these contributions, measured with the AP 3D metric, are reported below.
| Model | Cars @ IoU=0.7 | Cyclists @ IoU=0.5 | Pedestrians @ IoU=0.5 |
|---|---|---|---|
| **Easy** | | | |
| GUP-net | 22.22 | 0.77 | 7.16 |
| GUP-net curriculum | 21.39 | 2.08 | 10.73 |
| GUP-net all labels | 21.35 | 4.52 | 8.82 |
| **Moderate** | | | |
| GUP-net | 16.32 | 0.46 | 2.66 |
| GUP-net curriculum | 15.80 | 1.32 | 3.92 |
| GUP-net all labels | 17.71 | 2.84 | 4.55 |
| **Hard** | | | |
| GUP-net | 15.10 | 0.46 | 2.22 |
| GUP-net curriculum | 14.46 | 1.38 | 3.33 |
| GUP-net all labels | 15.60 | 2.84 | 4.55 |
To evaluate the model, run the compiled C file `evaluation_metrics/evaluate_3D_AP40` with the following command line:

```bash
./evaluate_3D_AP40 "<your path to the label_2 folder (included)>" "<your path to the predicted data folder (not included)>"
```
To visualize the images with the 2D and 3D predicted boxes, you can use the MATLAB script `matlab_visualization/run_visualization.m` and change the `root_dir` variable to your path to the KITTI folder. Also set the path to the labels to display with the `label_dir` variable.
```
@article{lu2021geometry,
  title={Geometry Uncertainty Projection Network for Monocular 3D Object Detection},
  author={Lu, Yan and Ma, Xinzhu and Yang, Lei and Zhang, Tianzhu and Liu, Yating and Chu, Qi and Yan, Junjie and Ouyang, Wanli},
  journal={arXiv preprint arXiv:2107.13774},
  year={2021}
}
```