Skip to content

Commit

Permalink
Release of v2.23.0
Browse files Browse the repository at this point in the history
Release of v2.23.0
  • Loading branch information
ZwwWayne authored Mar 30, 2022
2 parents 6b87ac2 + bab144c commit 3e26931
Show file tree
Hide file tree
Showing 113 changed files with 4,169 additions and 406 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -265,7 +265,7 @@ jobs:
- name: Build and install
run: pip install -e .
- name: Run unittests
run: coverage run --branch --source mmdet -m pytest tests -sv
run: coverage run --branch --source mmdet -m pytest tests
- name: Generate coverage report
run: |
coverage xml
Expand Down
18 changes: 10 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,17 @@
<img src="resources/mmdet-logo.png" width="600"/>
<div>&nbsp;</div>
<div align="center">
<b><font size="5">OpenMMLab website</font></b>
<b>OpenMMLab website</b>
<sup>
<a href="https://openmmlab.com">
<i><font size="4">HOT</font></i>
<i>HOT</i>
</a>
</sup>
&nbsp;&nbsp;&nbsp;&nbsp;
<b><font size="5">OpenMMLab platform</font></b>
<b>OpenMMLab platform</b>
<sup>
<a href="https://platform.openmmlab.com">
<i><font size="4">TRY IT OUT</font></i>
<i>TRY IT OUT</i>
</a>
</sup>
</div>
Expand Down Expand Up @@ -74,11 +74,11 @@ This project is released under the [Apache 2.0 license](LICENSE).

## Changelog

**2.22.0** was released in 24/2/2022:
**2.23.0** was released in 28/3/2022:

- Support [MaskFormer](configs/maskformer), [DyHead](configs/dyhead), [OpenImages Dataset](configs/openimages) and [TIMM backbone](configs/timm_example)
- Support visualization for Panoptic Segmentation
- Release a good recipe of using ResNet in object detectors pre-trained by [ResNet Strikes Back](https://arxiv.org/abs/2110.00476), which consistently brings about 3~4 mAP improvements over RetinaNet, Faster/Mask/Cascade Mask R-CNN
- Support [Mask2Former](configs/mask2former) and [EfficientNet](configs/efficientnet)
- Support setting data root through environment variable `MMDET_DATASETS`, users don't have to modify the corresponding path in config files anymore.
- Find a good recipe for fine-tuning high precision ResNet backbone pre-trained by Torchvision.

Please refer to [changelog.md](docs/en/changelog.md) for details and release history.

Expand Down Expand Up @@ -164,6 +164,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
<ul>
<li><a href="configs/panoptic_fpn">Panoptic FPN (CVPR'2019)</a></li>
<li><a href="configs/maskformer">MaskFormer (NeurIPS'2021)</a></li>
<li><a href="configs/mask2former">Mask2Former (ArXiv'2021)</a></li>
</ul>
</td>
<td>
Expand Down Expand Up @@ -228,6 +229,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
<li><a href="configs/swin">Swin (CVPR'2021)</a></li>
<li><a href="configs/pvt">PVTv2 (ArXiv'2021)</a></li>
<li><a href="configs/resnet_strikes_back">ResNet strikes back (ArXiv'2021)</a></li>
<li><a href="configs/efficientnet">EfficientNet (ArXiv'2021)</a></li>
</ul>
</td>
<td>
Expand Down
13 changes: 8 additions & 5 deletions README_zh-CN.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,13 +73,13 @@ MMDetection 是一个基于 PyTorch 的目标检测开源工具箱。它是 [Ope

## 更新日志

最新的 **2.22.0** 版本已经在 2022.02.24 发布:
最新的 **2.23.0** 版本已经在 2022.03.28 发布:

- 支持 [MaskFormer](configs/maskformer)[DyHead](configs/dyhead)[OpenImages Dataset](configs/openimages) [TIMM backbone](configs/timm_example)
- 支持全景分割可视化
- 发布了一个在目标检测任务中使用 ResNet 的好方法,它是由 [ResNet Strikes Back](https://arxiv.org/abs/2110.00476) 预训练的,并且能稳定的在 RetinaNet, Faster/Mask/Cascade Mask R-CNN 上带来约 3-4 mAP 的提升
- 支持 [Mask2Former](configs/mask2former) [Efficientnet](configs/efficientnet)
- 支持通环境变量 `MMDET_DATASETS` 设置数据根目录,因此无需修改配置文件中对应的路径。
- 发现一个很好的方法来微调由 Torchvision 预训练的高精度 ResNet 主干。

如果想了解更多版本更新细节和历史信息,请阅读[更新日志](docs/changelog.md)
如果想了解更多版本更新细节和历史信息,请阅读[更新日志](docs/en/changelog.md)

如果想了解 MMDetection 不同版本之间的兼容性, 请参考[兼容性说明文档](docs/zh_cn/compatibility.md)

Expand Down Expand Up @@ -162,6 +162,8 @@ MMDetection 是一个基于 PyTorch 的目标检测开源工具箱。它是 [Ope
<td>
<ul>
<li><a href="configs/panoptic_fpn">Panoptic FPN (CVPR'2019)</a></li>
<li><a href="configs/maskformer">MaskFormer (NeurIPS'2021)</a></li>
<li><a href="configs/mask2former">Mask2Former (ArXiv'2021)</a></li>
</ul>
</td>
<td>
Expand Down Expand Up @@ -226,6 +228,7 @@ MMDetection 是一个基于 PyTorch 的目标检测开源工具箱。它是 [Ope
<li><a href="configs/swin">Swin (CVPR'2021)</a></li>
<li><a href="configs/pvt">PVTv2 (ArXiv'2021)</a></li>
<li><a href="configs/resnet_strikes_back">ResNet strikes back (ArXiv'2021)</a></li>
<li><a href="configs/efficientnet">EfficientNet (ArXiv'2021)</a></li>
</ul>
</td>
<td>
Expand Down
30 changes: 30 additions & 0 deletions configs/efficientnet/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# EfficientNet

> [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946v5)
<!-- [BACKBONE] -->

## Introduction

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet.

To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.3% top-1 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters.

## Results and Models

### RetinaNet

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: | :--------: |
|Efficientnet-b3 | pytorch | 1x | - | - | 40.5 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/efficientnet/retinanet_effb3_fpn_crop896_8x4_1x_coco.py) | [model]() &#124; [log]() |

## Citation

```latex
@article{tan2019efficientnet,
title={Efficientnet: Rethinking model scaling for convolutional neural networks},
author={Tan, Mingxing and Le, Quoc V},
journal={arXiv preprint arXiv:1905.11946},
year={2019}
}
```
19 changes: 19 additions & 0 deletions configs/efficientnet/metafile.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
Models:
- Name: retinanet_effb3_fpn_crop896_8x4_1x_coco
In Collection: RetinaNet
Config: configs/efficientnet/retinanet_effb3_fpn_crop896_8x4_1x_coco.py
Metadata:
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.5
Weights: url
Paper:
URL: https://arxiv.org/abs/1905.11946v5
Title: 'EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks'
README: configs/efficientnet/README.md
Code:
URL: https://github.com/open-mmlab/mmdetection/blob/v2.23.0/mmdet/models/backbones/efficientnet.py#L159
Version: v2.23.0
93 changes: 93 additions & 0 deletions configs/efficientnet/retinanet_effb3_fpn_crop896_8x4_1x_coco.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
_base_ = [
'../_base_/models/retinanet_r50_fpn.py',
'../_base_/datasets/coco_detection.py', '../_base_/default_runtime.py'
]

cudnn_benchmark = True
norm_cfg = dict(type='BN', requires_grad=True)
checkpoint = 'https://download.openmmlab.com/mmclassification/v0/efficientnet/efficientnet-b3_3rdparty_8xb32-aa_in1k_20220119-5b4887a0.pth' # noqa
model = dict(
backbone=dict(
_delete_=True,
type='EfficientNet',
arch='b3',
drop_path_rate=0.2,
out_indices=(3, 4, 5),
frozen_stages=0,
norm_cfg=dict(
type='SyncBN', requires_grad=True, eps=1e-3, momentum=0.01),
norm_eval=False,
init_cfg=dict(
type='Pretrained', prefix='backbone', checkpoint=checkpoint)),
neck=dict(
in_channels=[48, 136, 384],
start_level=0,
out_channels=256,
relu_before_extra_convs=True,
no_norm_on_lateral=True,
norm_cfg=norm_cfg),
bbox_head=dict(type='RetinaSepBNHead', num_ins=5, norm_cfg=norm_cfg),
# training and testing settings
train_cfg=dict(assigner=dict(neg_iou_thr=0.5)))

# dataset settings
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
img_size = (896, 896)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='Resize',
img_scale=img_size,
ratio_range=(0.8, 1.2),
keep_ratio=True),
dict(type='RandomCrop', crop_size=img_size),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size=img_size),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=img_size,
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size=img_size),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
train=dict(pipeline=train_pipeline),
val=dict(pipeline=test_pipeline),
test=dict(pipeline=test_pipeline))
# optimizer
optimizer_config = dict(grad_clip=None)
optimizer = dict(
type='SGD',
lr=0.04,
momentum=0.9,
weight_decay=0.0001,
paramwise_cfg=dict(norm_decay_mult=0, bypass_duplicate=True))
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=1000,
warmup_ratio=0.1,
step=[8, 11])
# runtime settings
runner = dict(type='EpochBasedRunner', max_epochs=12)

# NOTE: This variable is for automatically scaling LR,
# USER SHOULD NOT CHANGE THIS VALUE.
default_batch_size = 32 # (8 GPUs) x (4 samples per GPU)
8 changes: 8 additions & 0 deletions configs/faster_rcnn/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,14 @@ We further finetune some pre-trained models on the COCO subsets, which only cont
| [R-50-FPN](./faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-person.py) | caffe | person | [R-50-FPN-Caffe-3x](./faster_rcnn_r50_caffe_fpn_mstrain_3x_coco.py) | 3.7 | 55.8 | [config](./faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-person.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person/faster_rcnn_r50_fpn_1x_coco-person_20201216_175929-d022e227.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person/faster_rcnn_r50_fpn_1x_coco-person_20201216_175929.log.json) |
| [R-50-FPN](./faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-person-bicycle-car.py) | caffe | person-bicycle-car | [R-50-FPN-Caffe-3x](./faster_rcnn_r50_caffe_fpn_mstrain_3x_coco.py) | 3.7 | 44.1 | [config](./faster_rcnn_r50_caffe_fpn_mstrain_1x_coco-person-bicycle-car.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car_20201216_173117-6eda6d92.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car_20201216_173117.log.json) |

## Torchvision New Receipe (TNR)

Torchvision released its high-precision ResNet models. The training details can be found on the [Pytorch website](https://pytorch.org/blog/how-to-train-state-of-the-art-models-using-torchvision-latest-primitives/). Here, we have done grid searches on learning rate and weight decay and found the optimal hyper-parameter on the detection task.

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: | :--------: |
| [R-50-TNR](./faster_rcnn_r50_fpn_tnr-pretrain_1x_coco.py) | pytorch | 1x | - | | 40.2 | [config](https://github.com/open-mmlab/mmdetection/blob/master/configs/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco_20220320_085147-efedfda4.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco_20220320_085147-efedfda4.log.json) |

## Citation

```latex
Expand Down
17 changes: 17 additions & 0 deletions configs/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
_base_ = [
'../_base_/models/faster_rcnn_r50_fpn.py',
'../_base_/datasets/coco_detection.py',
'../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

checkpoint = 'https://download.pytorch.org/models/resnet50-11ad3fa6.pth'
model = dict(
backbone=dict(init_cfg=dict(type='Pretrained', checkpoint=checkpoint)))

# `lr` and `weight_decay` have been searched to be optimal.
optimizer = dict(
_delete_=True,
type='AdamW',
lr=0.0001,
weight_decay=0.1,
paramwise_cfg=dict(norm_decay_mult=0., bypass_duplicate=True))
20 changes: 20 additions & 0 deletions configs/faster_rcnn/metafile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -405,3 +405,23 @@ Models:
Metrics:
box AP: 43.1
Weights: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_64x4d_fpn_mstrain_3x_coco/faster_rcnn_x101_64x4d_fpn_mstrain_3x_coco_20210524_124528-26c63de6.pth

- Name: faster_rcnn_r50_fpn_tnr-pretrain_1x_coco
In Collection: Faster R-CNN
Config: configs/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco.py
Metadata:
Training Memory (GB): 4.0
inference time (ms/im):
- value: 46.73
hardware: V100
backend: PyTorch
batch size: 1
mode: FP32
resolution: (800, 1333)
Epochs: 12
Results:
- Task: Object Detection
Dataset: COCO
Metrics:
box AP: 40.2
Weights: https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco/faster_rcnn_r50_fpn_tnr-pretrain_1x_coco_20220320_085147-efedfda4.pth
Loading

0 comments on commit 3e26931

Please sign in to comment.