Abstract

Real-time object detection is one of the most important research topics in computer vision. As new approaches to architecture optimization and training optimization are continually being developed, we have identified two research topics that have emerged from these latest state-of-the-art methods. To address them, we propose a trainable bag-of-freebies oriented solution. We combine flexible and efficient training tools with the proposed architecture and the compound scaling method. YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 120 FPS and has the highest accuracy, 56.8% AP, among all known real-time object detectors running at 30 FPS or higher on a V100 GPU. Source code is released at https://github.com/WongKinYiu/yolov7.

Keywords

Computer science, Detector, Object detection, Architecture, Object (grammar), Code (set theory), Artificial intelligence, Source code, State (computer science), Range (aeronautics), Computer vision, Pattern recognition (psychology), Set (abstract data type), Programming language, Engineering

Publication Info

Year: 2023
Type: article
Pages: 7464-7475
Citations: 9475
Access: Closed

Citation Metrics

OpenAlex: 9475
Influential: 729
CrossRef: 7983

Cite This

Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao (2023). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 7464-7475. https://doi.org/10.1109/cvpr52729.2023.00721

Identifiers

DOI: 10.1109/cvpr52729.2023.00721
arXiv: 2207.02696

Data Quality

Data completeness: 88%