Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
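The phrase "more than 30% relative" describes a ratio of mAP values, not a gain of 30 percentage points. As a minimal sketch of that arithmetic, the snippet below derives the upper bound on the previous best VOC 2012 mAP that the claim implies. The abstract does not state the previous best figure, so the result is only a bound, and the function names are hypothetical helpers introduced for illustration.

```python
def implied_previous_best(new_map: float, relative_gain: float) -> float:
    """Largest previous mAP consistent with `new_map` being at least
    `relative_gain` (e.g. 0.30 for 30%) better in relative terms."""
    return new_map / (1.0 + relative_gain)


def relative_improvement(new_map: float, old_map: float) -> float:
    """Relative improvement of new_map over old_map, as a fraction."""
    return (new_map - old_map) / old_map


# Figures taken from the abstract: 53.3% mAP, more than 30% relative gain.
new_map = 53.3
bound = implied_previous_best(new_map, 0.30)
print(f"previous best mAP is below {bound:.1f}%")  # prints 41.0%
```

Under this reading, the earlier best result on VOC 2012 must have been below roughly 41% mAP, which corresponds to an absolute gain of more than about 12 points.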

Keywords

Pascal (unit), Computer science, Convolutional neural network, Object detection, Artificial intelligence, Pattern recognition (psychology), Segmentation, Scalability, Context (archaeology), Image (mathematics), Machine learning

Affiliated Institutions

Berkeley College US

Related Publications

Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks

Maxime Oquab, Léon Bottou, Ivan Laptev +1 more

Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success o...

2014 3151 citations

Learning Deconvolution Network for Semantic Segmentation

Hyeonwoo Noh, Seunghoon Hong, Bohyung Han

We propose a novel semantic segmentation algorithm by learning a deep deconvolution network. We learn the network on top of the convolutional layers adopted from VGG 16-layer ne...

2015 3978 citations

Is object localization for free? - Weakly-supervised learning with convolutional neural networks

Maxime Oquab, Léon Bottou, Ivan Laptev +1 more

Successful methods for visual object recognition typically rely on training datasets containing lots of richly annotated images. Detailed image annotation, e.g. by object boundi...

2015 915 citations

CBAM: Convolutional Block Attention Module

Sanghyun Woo, Jongchan Park, Joon‐Young Lee +1 more

We propose Convolutional Block Attention Module (CBAM), a simple yet effective attention module for feed-forward convolutional neural networks. Given an intermediate feature map...

2018 Lecture notes in computer science 20102 citations

Object Detection With Deep Learning: A Review

Zhong‐Qiu Zhao, Peng Zheng, Shou-Tao Xu +1 more

Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection ...

2019 IEEE Transactions on Neural Networks ... 5019 citations

Publication Info

Year: 2014
Type: article
Pages: 580-587
Citations: 30615
Access: Closed

Cite This

APA Style

Ross Girshick, Jeff Donahue, Trevor Darrell, et al. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, pp. 580-587. https://doi.org/10.1109/cvpr.2014.81

Identifiers

DOI: 10.1109/cvpr.2014.81