October 31, 2021

WongKinYiu/yolor

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)

repo name WongKinYiu/yolor
repo link https://github.com/WongKinYiu/yolor
homepage
language Python
size (curr.) 3345 kB
stars (curr.) 1011
created 2021-04-12
license GNU General Public License v3.0

YOLOR

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks

[Figure: Unified Network]

To reproduce the results in the table below, please use this branch.

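| Model     | Test Size | AP (test) | AP50 (test) | AP75 (test) | batch1 throughput | batch32 inference |
|-----------|-----------|-----------|-------------|-------------|-------------------|-------------------|
| YOLOR-P6  | 1280      | 54.1%     | 71.8%       | 59.3%       | 49 fps            | 8.3 ms            |
| YOLOR-W6  | 1280      | 55.5%     | 73.2%       | 60.6%       | 47 fps            | 10.7 ms           |
| YOLOR-E6  | 1280      | 56.4%     | 74.1%       | 61.6%       | 37 fps            | 17.1 ms           |
| YOLOR-D6  | 1280      | 57.3%     | 75.0%       | 62.7%       | 30 fps            | 21.8 ms           |
| YOLOR-D6* | 1280      | 57.8%     | 75.5%       | 63.3%       | 30 fps            | 21.8 ms           |
| YOLOv4-P5 | 896       | 51.8%     | 70.3%       | 56.6%       | 41 fps            | -                 |
| YOLOv4-P6 | 1280      | 54.5%     | 72.6%       | 59.8%       | 30 fps            | -                 |
| YOLOv4-P7 | 1536      | 55.5%     | 73.4%       | 60.8%       | 16 fps            | -                 |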

To reproduce the inference speed, please see darknet.
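
The two speed columns use different units (fps and ms), which makes them awkward to compare by eye. As a rough aid, the sketch below converts the batch-1 throughput figures from the table into milliseconds per image. It assumes throughput is simply the reciprocal of per-image time; the helper name `fps_to_ms` is ours and is not part of the repository.

```python
# Batch-1 throughput (frames per second) copied from the table above.
BATCH1_FPS = {"YOLOR-P6": 49, "YOLOR-W6": 47, "YOLOR-E6": 37, "YOLOR-D6": 30}


def fps_to_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame (hypothetical helper)."""
    return 1000.0 / fps


for model, fps in BATCH1_FPS.items():
    print(f"{model}: {fps} fps is about {fps_to_ms(fps):.1f} ms per image")
```

This only restates the batch-1 column in another unit. It does not reproduce the batch-32 measurements, which depend on the hardware and on the darknet setup mentioned above.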

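| Model        | Test Size | AP (val) | AP50 (val) | AP75 (val) | AP_S (val) | AP_M (val) | AP_L (val) | batch1 throughput |
|--------------|-----------|----------|------------|------------|------------|------------|------------|-------------------|
| YOLOv4-CSP   | 640       | 49.1%    | 67.7%      | 53.8%      | 32.1%      | 54.4%      | 63.2%      | 76 fps            |
| YOLOR-CSP    | 640       | 49.2%    | 67.6%      | 53.7%      | 32.9%      | 54.4%      | 63.0%      | weights           |
| YOLOR-CSP*   | 640       | 50.0%    | 68.7%      | 54.3%      | 34.2%      | 55.1%      | 64.3%      | weights           |
| YOLOv4-CSP-X | 640       | 50.9%    | 69.3%      | 55.4%      | 35.3%      | 55.8%      | 64.8%      | 53 fps            |
| YOLOR-CSP-X  | 640       | 51.1%    | 69.6%      | 55.7%      | 35.7%      | 56.0%      | 65.2%      | weights           |
| YOLOR-CSP-X* | 640       | 51.5%    | 69.9%      | 56.1%      | 35.8%      | 56.8%      | 66.1%      | weights           |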

Developing…

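| Model       | Test Size | AP (test) | AP50 (test) | AP75 (test) | AP_S (test) | AP_M (test) | AP_L (test) |
|-------------|-----------|-----------|-------------|-------------|-------------|-------------|-------------|
| YOLOR-CSP   | 640       | 51.1%     | 69.6%       | 55.7%       | 31.7%       | 55.3%       | 64.7%       |
| YOLOR-CSP-X | 640       | 53.0%     | 71.4%       | 57.9%       | 33.7%       | 57.1%       | 66.8%       |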

Train from scratch for 300 epochs…

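| Model       | Info              | Test Size | AP    |
|-------------|-------------------|-----------|-------|
| YOLOR-CSP   | evolution         | 640       | 48.0% |
| YOLOR-CSP   | strategy          | 640       | 50.0% |
| YOLOR-CSP   | strategy + simOTA | 640       | 51.1% |
| YOLOR-CSP-X | strategy          | 640       | 51.5% |
| YOLOR-CSP-X | strategy + simOTA | 640       | 53.0% |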
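
To make the effect of each training setting easier to read off, the following sketch computes the AP difference between the "strategy" and "strategy + simOTA" rows. The numbers are copied from the table above; the helper `simota_gain` is a hypothetical name used only for illustration.

```python
# AP values (percent) copied from the "Train from scratch for 300 epochs" table.
RESULTS = {
    ("YOLOR-CSP", "evolution"): 48.0,
    ("YOLOR-CSP", "strategy"): 50.0,
    ("YOLOR-CSP", "strategy + simOTA"): 51.1,
    ("YOLOR-CSP-X", "strategy"): 51.5,
    ("YOLOR-CSP-X", "strategy + simOTA"): 53.0,
}


def simota_gain(model: str) -> float:
    """AP points gained by adding simOTA on top of 'strategy' (hypothetical helper)."""
    with_simota = RESULTS[(model, "strategy + simOTA")]
    without = RESULTS[(model, "strategy")]
    # Round to one decimal to avoid floating-point noise in the difference.
    return round(with_simota - without, 1)


for model in ("YOLOR-CSP", "YOLOR-CSP-X"):
    print(f"{model}: +{simota_gain(model)} AP from simOTA")
```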

Installation

Docker environment (recommended)

# create the docker container; increase --shm-size if you have more shared memory available
nvidia-docker run --name yolor -it -v your_coco_path/:/coco/ -v your_code_path/:/yolor --shm-size=64g nvcr.io/nvidia/pytorch:20.11-py3

# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx

# pip install required packages
pip install seaborn thop

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
cd /
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
cd /
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .

# go to code folder
cd /yolor

Colab environment

git clone https://github.com/WongKinYiu/yolor
cd yolor

# pip install required packages
pip install -qr requirements.txt

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install
cd ..

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .
cd ..
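
After either installation route, it can help to confirm which packages are actually importable before training. The sketch below is one way to do that with the standard library. The module names `mish_cuda` and `pytorch_wavelets` are assumptions inferred from the repository names above, and `missing_modules` is a hypothetical helper, not part of YOLOR.

```python
import importlib.util

# Packages installed by the steps above. The import names of the two optional
# packages are assumed from their repository names and may differ.
REQUIRED = ["torch", "seaborn", "thop"]
OPTIONAL = ["mish_cuda", "pytorch_wavelets"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required modules:", missing_modules(REQUIRED) or "none")
    print("missing optional modules:", missing_modules(OPTIONAL) or "none")
```

A missing optional module only matters if the chosen configuration uses Mish activation or the DWT down-sampling module.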

Prepare COCO dataset

cd /yolor
bash scripts/get_coco.sh

Prepare pretrained weight

cd /yolor
bash scripts/get_pretrain.sh

Testing

yolor_p6.pt

python test.py --data data/coco.yaml --img 1280 --batch 32 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --name yolor_p6_val

You will get the results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70718
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.57520
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.37058
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56878
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66102
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39181
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.65229
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.71441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.57755
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.75337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84013
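
If you want to compare runs programmatically, the summary above can be parsed into a dictionary. This is a minimal sketch that assumes the output keeps exactly the line layout shown here; `parse_coco_summary` and the sample text are illustrative and not part of `test.py`.

```python
import re

# Matches one line of the COCO-style summary shown above.
LINE_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ IoU=(\S+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = ([\d.]+)"
)


def parse_coco_summary(text: str) -> dict:
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    metrics = {}
    for match in LINE_RE.finditer(text):
        kind, iou, area, max_dets, value = match.groups()
        metrics[(kind, iou, area, int(max_dets))] = float(value)
    return metrics


# Three lines copied from the results above.
SAMPLE = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70718
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.57755
"""

if __name__ == "__main__":
    for key, value in parse_coco_summary(SAMPLE).items():
        print(key, value)
```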

Training

Single GPU training:

python train.py --batch-size 8 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0 --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300

Multiple GPU training:

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --batch-size 16 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300
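
The single- and multiple-GPU commands above differ only in the launcher, the device list, the batch size, and `--sync-bn`. The sketch below composes either command from a GPU count. It assumes the pattern seen in these examples, a total batch size of 8 per GPU, carries over to other GPU counts; `build_train_command` is a hypothetical helper and every flag is taken from the commands above.

```python
import shlex


def build_train_command(num_gpus: int, per_gpu_batch: int = 8,
                        master_port: int = 9527) -> list:
    """Compose a train.py command line (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    devices = ",".join(str(i) for i in range(num_gpus))
    cmd = ["python"]
    if num_gpus > 1:
        # Multi-GPU runs go through the distributed launcher.
        cmd += ["-m", "torch.distributed.launch",
                "--nproc_per_node", str(num_gpus),
                "--master_port", str(master_port)]
    cmd += ["train.py",
            "--batch-size", str(per_gpu_batch * num_gpus),
            "--img", "1280", "1280",
            "--data", "coco.yaml",
            "--cfg", "cfg/yolor_p6.cfg",
            "--weights", "",  # empty string, as in the commands above
            "--device", devices]
    if num_gpus > 1:
        cmd.append("--sync-bn")
    cmd += ["--name", "yolor_p6",
            "--hyp", "hyp.scratch.1280.yaml",
            "--epochs", "300"]
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_train_command(2)))
```

The returned list can be passed directly to `subprocess.run`; `shlex.join` (Python 3.8+) is used here only to print a copy-pasteable command.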

Training schedule in the paper:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 tune.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6/weights/last_298.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-tune --hyp hyp.finetune.1280.yaml --epochs 450
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6-tune/weights/epoch_424.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-fine --hyp hyp.finetune.1280.yaml --epochs 450

Inference

yolor_p6.pt

python detect.py --source inference/images/horses.jpg --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --conf 0.25 --img-size 1280 --device 0

You will get the results:

[Image: horses]

Citation

@article{wang2021you,
  title={You Only Learn One Representation: Unified Network for Multiple Tasks},
  author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2105.04206},
  year={2021}
}

Acknowledgements
