English | 简体中文
🔥🔥[WACV 2025 Oral] The official implementation of the paper "RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision".
[arXiv](https://arxiv.org/abs/2409.08475)
| Model | Epoch | Backbone | Input shape | AP<sup>val</sup> | AP<sup>val</sup><sub>50</sub> | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | Weight | Config | Log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RT-DETRv3-R18 | 6x | ResNet-18 | 640 | 48.1 | 66.2 | 20 | 60 | 217 | baidu netdisk / google drive | config | |
| RT-DETRv3-R34 | 6x | ResNet-34 | 640 | 49.9 | 67.7 | 31 | 92 | 161 | baidu netdisk / google drive | config | |
| RT-DETRv3-R50 | 6x | ResNet-50 | 640 | 53.4 | 71.7 | 42 | 136 | 108 | baidu netdisk / google drive | config | |
| RT-DETRv3-R101 | 6x | ResNet-101 | 640 | 54.6 | 73.1 | 76 | 259 | 74 | | config | |
Notes:
- RT-DETRv3 uses 4 GPUs for training.
- RT-DETRv3 was trained on COCO train2017 and evaluated on val2017.
| Model | Epoch | Backbone | Input shape | AP | AP<sub>S</sub> | AP<sub>M</sub> | AP<sub>L</sub> | Weight | Config | Log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RT-DETRv3-R18 | 6x | ResNet-18 | 640 | 26.5 | 12.5 | 24.3 | 35.2 | | config | |
| RT-DETRv3-R50 | 6x | ResNet-50 | 640 | 33.9 | 20.2 | 32.5 | 41.5 | | config | |
Install requirements
```bash
pip install -r requirements.txt
```
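After installing the requirements, a quick sanity check of the PaddlePaddle installation can save debugging time later. The sketch below only uses PaddlePaddle's public API (`paddle.utils.run_check()`); it is not part of this repository's tooling.

```python
# Minimal environment check for PaddlePaddle (independent of this repo).
import paddle

print(paddle.__version__)          # installed Paddle version
print(paddle.device.get_device())  # e.g. "gpu:0" or "cpu"
paddle.utils.run_check()           # runs a small program to validate the install
```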
Compile (optional)
```bash
cd ./ppdet/modeling/transformers/ext_op/
python setup_ms_deformable_attn_op.py install
```
See `./ppdet/modeling/transformers/ext_op/` for details.
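To confirm the optional CUDA extension built correctly, you can try importing it. The module name `deformable_detr_ops` below is an assumption based on PaddleDetection's ext_op; if the op is missing, training still works with the pure-Paddle deformable attention fallback.

```python
# Check whether the optional MS-deformable-attention extension is importable.
# NOTE: the module name "deformable_detr_ops" is an assumption taken from
# PaddleDetection's ext_op; adjust it if your build registers a different name.
try:
    from deformable_detr_ops import ms_deformable_attn  # compiled CUDA op
    print("custom deformable attention op is available:", ms_deformable_attn)
except ImportError:
    print("custom op not found; the pure-Paddle fallback will be used")
```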
Data preparation
- Download and extract COCO 2017 train and val images.
```
path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
```
- Modify `dataset_dir` in the dataset config (e.g. `configs/datasets/coco_detection.yml`) so that it points to your COCO directory.
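As a quick sanity check of the dataset layout, the annotations can be loaded with `pycocotools` (installed with the requirements). The sketch below assumes the `path/to/coco` layout shown above; adjust the path to your own `dataset_dir`.

```python
# Sanity-check the COCO 2017 layout before training.
import os
from pycocotools.coco import COCO

dataset_dir = "path/to/coco"  # same value as dataset_dir in the config
ann_file = os.path.join(dataset_dir, "annotations", "instances_train2017.json")
img_dir = os.path.join(dataset_dir, "train2017")

coco = COCO(ann_file)                        # parses the annotation json
print("images:", len(coco.getImgIds()))      # 118287 for COCO train2017
print("categories:", len(coco.getCatIds()))  # 80 for COCO

# verify that a few image files referenced by the annotations actually exist
for img_id in coco.getImgIds()[:5]:
    file_name = coco.loadImgs(img_id)[0]["file_name"]
    assert os.path.exists(os.path.join(img_dir, file_name)), file_name
print("layout looks OK")
```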
Training & Evaluation & Testing
- Training on a Single GPU:
```bash
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml --eval
```
- Training on Multiple GPUs:
```bash
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml --fleet --eval
```
- Evaluation:
```bash
python tools/eval.py -c configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml \
    -o weights=https://bj.bcebos.com/v1/paddledet/models/rtdetrv3_r18vd_6x_coco.pdparams
```
- Inference:
```bash
python tools/infer.py -c configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml \
    -o weights=https://bj.bcebos.com/v1/paddledet/models/rtdetrv3_r18vd_6x_coco.pdparams \
    --infer_img=./demo/000000570688.jpg
```
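Inference can also be driven from Python through PaddleDetection's `Trainer`. The sketch below is a minimal example; the config is from this repo, while the weights path is an assumption (point it at your downloaded or trained `.pdparams` file).

```python
# Minimal programmatic inference sketch using PaddleDetection's engine.
from ppdet.core.workspace import load_config
from ppdet.engine import Trainer

cfg = load_config("configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml")
trainer = Trainer(cfg, mode="test")
trainer.load_weights("rtdetrv3_r18vd_6x_coco.pdparams")  # local weights file

# runs preprocessing + model + postprocessing and saves a visualized image
trainer.predict(
    ["./demo/000000570688.jpg"],
    draw_threshold=0.5,
    output_dir="output_infer",
)
```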
1. Export model
```bash
python tools/export_model.py -c configs/rtdetrv3/rtdetrv3_r18vd_6x_coco.yml \
    -o weights=https://bj.bcebos.com/v1/paddledet/models/rtdetrv3_r18vd_6x_coco.pdparams trt=True \
    --output_dir=output_inference
```
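The exported `model.pdmodel` / `model.pdiparams` pair can be smoke-tested with the Paddle Inference Python API. The input names and shapes below ("image" as 1x3x640x640, other inputs as 1x2) follow PaddleDetection's usual deploy models and are assumptions; print `predictor.get_input_names()` to see the actual ones.

```python
# Smoke-test the exported inference model with the Paddle Inference API.
import numpy as np
from paddle.inference import Config, create_predictor

model_dir = "output_inference/rtdetrv3_r18vd_6x_coco"
config = Config(f"{model_dir}/model.pdmodel", f"{model_dir}/model.pdiparams")
config.enable_use_gpu(200, 0)  # memory pool (MB), GPU id; remove for CPU
predictor = create_predictor(config)

print("inputs:", predictor.get_input_names())
for name in predictor.get_input_names():
    handle = predictor.get_input_handle(name)
    # assumption: "image" is 1x3x640x640, auxiliary inputs (e.g. scale_factor) are 1x2
    shape = [1, 3, 640, 640] if name == "image" else [1, 2]
    handle.reshape(shape)
    handle.copy_from_cpu(np.ones(shape, dtype="float32"))

predictor.run()
for name in predictor.get_output_names():
    print(name, predictor.get_output_handle(name).copy_to_cpu().shape)
```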
2. Convert to ONNX
- Install Paddle2ONNX and ONNX
```bash
pip install onnx==1.13.0
pip install paddle2onnx==1.0.5
```
- Convert:
```bash
paddle2onnx --model_dir=./output_inference/rtdetrv3_r18vd_6x_coco/ \
    --model_filename model.pdmodel \
    --params_filename model.pdiparams \
    --opset_version 16 \
    --save_file rtdetrv3_r18vd_6x_coco.onnx
```
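To confirm the converted graph is valid, it can be checked with `onnx` and run once with ONNX Runtime on dummy data. The sketch reads input names and shapes from the model itself, so nothing model-specific is assumed; install `onnxruntime` (or `onnxruntime-gpu`) first.

```python
# Validate and smoke-test the exported ONNX model.
import numpy as np
import onnx
import onnxruntime as ort

onnx_path = "rtdetrv3_r18vd_6x_coco.onnx"
onnx.checker.check_model(onnx.load(onnx_path))  # structural validity check

sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])

# Build a dummy feed from the graph's own input signatures.
feed = {}
for inp in sess.get_inputs():
    if len(inp.shape) == 4:
        # image-like input: substitute 1x3x640x640 for any dynamic dims
        shape = [d if isinstance(d, int) else s
                 for d, s in zip(inp.shape, (1, 3, 640, 640))]
    else:
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    dtype = np.float32 if "float" in inp.type else np.int64
    feed[inp.name] = np.ones(shape, dtype=dtype)

outputs = sess.run(None, feed)
for out, meta in zip(outputs, sess.get_outputs()):
    print(meta.name, out.shape)
```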
3. Convert to TensorRT
- TensorRT version >= 8.5.1
- For inference and speed benchmarking, the engine can be built and timed with `trtexec`:
```bash
trtexec --onnx=./rtdetrv3_r18vd_6x_coco.onnx \
    --workspace=4096 \
    --shapes=image:1x3x640x640 \
    --saveEngine=rtdetrv3_r18vd_6x_coco.trt \
    --avgRuns=100 \
    --fp16
```
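Once `trtexec` has serialized the engine, it can be loaded and inspected from Python with the TensorRT runtime. The tensor-based API used below requires TensorRT >= 8.5, matching the requirement above; this sketch only lists the engine's I/O bindings, it does not run inference.

```python
# Load the serialized engine and list its input/output tensors (TensorRT >= 8.5 API).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("rtdetrv3_r18vd_6x_coco.trt", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(
        name,
        engine.get_tensor_mode(name),   # INPUT or OUTPUT
        engine.get_tensor_shape(name),  # e.g. (1, 3, 640, 640) for the image
        engine.get_tensor_dtype(name),  # binding dtype
    )
```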
If you find RT-DETRv3 useful in your research, please consider giving a star ⭐ and citing:
```bibtex
@article{wang2024rt,
  title={RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision},
  author={Wang, Shuo and Xia, Chunlong and Lv, Feng and Shi, Yifeng},
  journal={arXiv preprint arXiv:2409.08475},
  year={2024}
}
```