Abstract

Compared with model architectures, the training process, which is also crucial to the success of detectors, has received relatively little attention in object detection. In this work, we carefully revisit the standard training practice of detectors and find that detection performance is often limited by imbalance during training, which generally arises at three levels: the sample level, the feature level, and the objective level. To mitigate the resulting adverse effects, we propose Libra R-CNN, a simple but effective framework for balanced learning in object detection. It integrates three novel components: IoU-balanced sampling, balanced feature pyramid, and balanced L1 loss, which reduce the imbalance at the sample, feature, and objective levels respectively. Benefiting from the overall balanced design, Libra R-CNN significantly improves detection performance. Without bells and whistles, it achieves 2.5 points and 2.0 points higher Average Precision (AP) than FPN Faster R-CNN and RetinaNet respectively on MSCOCO.
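The abstract names balanced L1 loss without stating its form. As a rough illustration only, the sketch below follows the piecewise definition as it is commonly given for this paper: a log-shaped branch for small residuals (|x| < 1) and a linear branch for large ones, tied together by the constraint alpha * ln(b + 1) = gamma. The function name `balanced_l1`, the scalar interface, and the defaults alpha = 0.5 and gamma = 1.5 are assumptions made for this example and are not taken from the abstract.

```python
import math


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Sketch of a balanced L1 loss for one regression residual x.

    Assumed form (not stated in the abstract):
      |x| < 1 : (alpha / b) * (b|x| + 1) * ln(b|x| + 1) - alpha * |x|
      |x| >= 1: gamma * |x| + c
    """
    # b follows from the constraint alpha * ln(b + 1) = gamma.
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    if ax < 1.0:
        # Small residuals: gradient alpha * ln(b|x| + 1) grows faster
        # near zero than the plain smooth-L1 gradient.
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Large residuals: gradient is capped at gamma; c makes the two
    # branches meet at |x| = 1.
    c = gamma / b - alpha
    return gamma * ax + c


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(r, balanced_l1(r))
```

Under these assumptions the gradient magnitude never exceeds gamma, so large residuals cannot dominate the update, while small residuals contribute more gradient than they would under smooth L1.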

Keywords

Object detection, Computer science, Pyramid (geometry), Feature (linguistics), Artificial intelligence, Process (computing), Object (grammar), Detector, Sample (material), Deep learning, Pattern recognition (psychology), Feature extraction, Feature learning, Machine learning, Computer vision, Mathematics, Telecommunications

Publication Info

Year: 2019
Type: article
Pages: 821-830
Citations: 1634
Access: Closed

Citation Metrics

OpenAlex: 1634
Influential: 195
CrossRef: 1363

Cite This

Jiangmiao Pang, Kai Chen, Jianping Shi et al. (2019). Libra R-CNN: Towards Balanced Learning for Object Detection. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 821-830. https://doi.org/10.1109/cvpr.2019.00091

Identifiers

DOI: 10.1109/cvpr.2019.00091
arXiv: 1904.02701

Data Quality

Data completeness: 84%