Duankaiwen/CenterNet
Code for our paper “CenterNet: Keypoint Triplets for Object Detection”.
repo name | Duankaiwen/CenterNet |
repo link | https://github.com/Duankaiwen/CenterNet |
homepage | |
language | Python |
size (curr.) | 5308 kB |
stars (curr.) | 1438 |
created | 2019-04-16 |
license | |
CenterNet: Keypoint Triplets for Object Detection
by Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang and Qi Tian
The code to train and evaluate the proposed CenterNet is available here. For more technical details, please refer to our arXiv paper.
We thank Princeton Vision & Learning Lab for providing the original implementation of CornerNet.
CenterNet is a one-stage detector that is trained from scratch. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which surpasses all known one-stage detectors and comes very close to the top-performing two-stage detectors.
Abstract
In object detection, keypoint-based approaches often suffer a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions. This paper presents an efficient solution which explores the visual patterns within each cropped region with minimal costs. We build our framework upon a representative one-stage keypoint-based detector named CornerNet. Our approach, named CenterNet, detects each object as a triplet, rather than a pair, of keypoints, which improves both precision and recall. Accordingly, we design two customized modules named cascade corner pooling and center pooling, which play the roles of enriching information collected by both top-left and bottom-right corners and providing more recognizable information at the central regions, respectively. On the MS-COCO dataset, CenterNet achieves an AP of 47.0%, which outperforms all existing one-stage detectors by a large margin. Meanwhile, with a faster inference speed, CenterNet demonstrates quite comparable performance to the top-ranked two-stage detectors.
Introduction
CenterNet is a framework for object detection with deep convolutional neural networks. You can use the code to train and evaluate a network for object detection on the MS-COCO dataset.
- It achieves state-of-the-art performance (an AP of 47.0%) on one of the most challenging datasets: MS-COCO.
- Our code is written in Python, based on CornerNet.
More detailed descriptions of our approach and code will be made available soon.
If you encounter any problems in using our code, please contact Kaiwen Duan: kaiwen.duan@vipl.ict.ac.cn.
Architecture
Comparison with other methods
In terms of speed, we test the inference speed of both CornerNet and CenterNet on an NVIDIA Tesla P100 GPU. The average inference time of CornerNet511-104 (the input resolution is 511×511 and the backbone is Hourglass-104) is 300ms per image, and that of CenterNet511-104 is 340ms. Using the Hourglass-52 backbone speeds up inference: our CenterNet511-52 takes an average of 270ms per image, which is both faster and more accurate than CornerNet511-104.
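To put the latencies above side by side, the following small sketch converts them to throughput and to a ratio against CornerNet511-104. Only the three millisecond figures come from the text; the helper names are ours and purely illustrative.

```python
# Average per-image inference times quoted above (milliseconds, Tesla P100).
TIMES_MS = {
    "CornerNet511-104": 300,
    "CenterNet511-104": 340,
    "CenterNet511-52": 270,
}


def images_per_second(ms_per_image):
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / ms_per_image


def relative_time(model, baseline="CornerNet511-104"):
    """Latency of `model` as a fraction of the baseline's latency."""
    return TIMES_MS[model] / TIMES_MS[baseline]


for name, ms in TIMES_MS.items():
    print(f"{name}: {images_per_second(ms):.2f} img/s, "
          f"{relative_time(name):.2f}x the CornerNet511-104 latency")
```

By this arithmetic CenterNet511-52 needs 10% less time per image than CornerNet511-104, while CenterNet511-104 needs roughly 13% more.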
Preparation
Please first install Anaconda and create an Anaconda environment using the provided package list.
conda create --name CenterNet --file conda_packagelist.txt
After you create the environment, activate it.
source activate CenterNet
Compiling Corner Pooling Layers
cd <CenterNet dir>/models/py_utils/_cpools/
python setup.py install --user
Compiling NMS
cd <CenterNet dir>/external
make
Installing MS COCO APIs
cd <CenterNet dir>/data/coco/PythonAPI
make
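If you prefer to run the three build steps above in one go, a minimal sketch along these lines may help. It assumes the conda environment is already activated and that `make` and `python` are on your PATH; the function names `build_steps` and `run_build` are hypothetical and not part of the repo.

```python
import subprocess
from pathlib import Path


def build_steps(root):
    """Return (working directory, command) pairs in the order listed above."""
    root = Path(root)
    return [
        (root / "models" / "py_utils" / "_cpools",
         ["python", "setup.py", "install", "--user"]),  # corner pooling layers
        (root / "external", ["make"]),                   # NMS
        (root / "data" / "coco" / "PythonAPI", ["make"]),  # MS COCO APIs
    ]


def run_build(root):
    """Run each step in turn; check=True stops at the first failing command."""
    for cwd, cmd in build_steps(root):
        print(f"[{cwd}] {' '.join(cmd)}")
        subprocess.run(cmd, cwd=cwd, check=True)
```

Call `run_build("<CenterNet dir>")` with your actual checkout path in place of the placeholder.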
Downloading MS COCO Data
- Download the training/validation split we use in our paper from here (originally from Faster R-CNN)
- Unzip the file and place annotations under <CenterNet dir>/data/coco
- Download the images (2014 Train, 2014 Val, 2017 Test) from here
- Create 3 directories, trainval2014, minival2014 and testdev2017, under <CenterNet dir>/data/coco/images/
- Copy the training/validation/testing images to the corresponding directories according to the annotation files
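The directory layout described in the steps above can be created with a short script. This is only a sketch: the function name `prepare_coco_layout` is hypothetical, and it does not download or copy any images; it only creates the three image directories and reports whether the annotations directory is already in place.

```python
from pathlib import Path

# Directory names taken from the steps above.
IMAGE_DIRS = ("trainval2014", "minival2014", "testdev2017")


def prepare_coco_layout(centernet_dir):
    """Create the image directories and check for the annotations directory.

    Returns the list of image directories and a flag telling whether
    <centernet_dir>/data/coco/annotations already exists.
    """
    coco = Path(centernet_dir) / "data" / "coco"
    created = []
    for name in IMAGE_DIRS:
        target = coco / "images" / name
        target.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(target)
    has_annotations = (coco / "annotations").is_dir()
    return created, has_annotations
```

If the returned flag is false, unzip the downloaded split so that the annotations directory sits under data/coco before training.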
Training and Evaluation
To train CenterNet-104:
python train.py CenterNet-104
We provide the configuration file (CenterNet-104.json) and the model file (CenterNet-104.py) for CenterNet in this repo.
We also provide a trained model for CenterNet-104, which is trained for 480k iterations using 8 Tesla V100 (32GB) GPUs. You can download it from BaiduYun CenterNet-104 (code: bfko) or Google Drive CenterNet-104 and put it under <CenterNet dir>/cache/nnet (you may need to create this directory yourself if it does not exist). If you want to train your own CenterNet, please adjust the batch size in CenterNet-104.json to accommodate the number of GPUs that are available to you.
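As an illustration of adjusting the batch size to the available GPUs, the sketch below splits a total batch size into per-GPU chunks. It assumes the config follows the CornerNet-style layout with a "system" section holding "batch_size" and "chunk_sizes"; please verify those key names against your copy of CenterNet-104.json. The function names and the example values are hypothetical.

```python
import json


def split_batch(batch_size, num_gpus):
    """Split batch_size into num_gpus chunks that differ by at most one."""
    if num_gpus < 1 or batch_size < num_gpus:
        raise ValueError("need at least one sample per GPU")
    base, extra = divmod(batch_size, num_gpus)
    return [base + 1 if i < extra else base for i in range(num_gpus)]


def adjust_config(config, batch_size, num_gpus):
    """Return a copy of `config` with the batch settings replaced.

    ASSUMPTION: the keys "system", "batch_size" and "chunk_sizes" follow the
    CornerNet-style layout; check them against the real JSON file.
    """
    updated = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    updated["system"]["batch_size"] = batch_size
    updated["system"]["chunk_sizes"] = split_batch(batch_size, num_gpus)
    return updated


# Hypothetical starting values, for illustration only.
example = {"system": {"batch_size": 48, "chunk_sizes": [6] * 8}}
print(adjust_config(example, batch_size=16, num_gpus=4)["system"])
```

The per-GPU chunks always sum to the total batch size, so a smaller machine keeps a consistent configuration after the change.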
To use the trained model:
python test.py CenterNet-104 --testiter 480000 --split <split>
To train CenterNet-52:
python train.py CenterNet-52
We provide the configuration file (CenterNet-52.json) and the model file (CenterNet-52.py) for CenterNet in this repo.
We also provide a trained model for CenterNet-52, which is trained for 480k iterations using 8 Tesla V100 (32GB) GPUs. You can download it from BaiduYun CenterNet-52 (code: 680t) or Google Drive CenterNet-52 and put it under <CenterNet dir>/cache/nnet (you may need to create this directory yourself if it does not exist). If you want to train your own CenterNet, please adjust the batch size in CenterNet-52.json to accommodate the number of GPUs that are available to you.
To use the trained model:
python test.py CenterNet-52 --testiter 480000 --split <split>
We also include configuration files for multi-scale evaluation in this repo: CenterNet-104-multi_scale.json and CenterNet-52-multi_scale.json.
To use the multi-scale configuration file:
python test.py CenterNet-52 --testiter <iter> --split <split> --suffix multi_scale
or
python test.py CenterNet-104 --testiter <iter> --split <split> --suffix multi_scale
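The commands above suggest that the suffix is appended to the model name to pick the configuration file. The sketch below makes that naming pattern explicit and assembles the argument list for test.py. The helper names are hypothetical, and the naming rule is inferred from the file names listed above rather than from the test.py source.

```python
def config_filename(model, suffix=None):
    """Config file name implied by the examples above: <model>[-<suffix>].json."""
    return f"{model}.json" if suffix is None else f"{model}-{suffix}.json"


def build_test_command(model, testiter, split, suffix=None):
    """Assemble the test.py invocation as an argument list."""
    cmd = ["python", "test.py", model,
           "--testiter", str(testiter), "--split", split]
    if suffix is not None:
        cmd += ["--suffix", suffix]
    return cmd


# "<split>" is left as a placeholder, exactly as in the commands above.
print(config_filename("CenterNet-52", "multi_scale"))
print(" ".join(build_test_command("CenterNet-52", 480000, "<split>",
                                  suffix="multi_scale")))
```

The resulting list can be passed to subprocess.run from the repo root once the placeholder is replaced with a real split name.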