Very Deep Convolutional Networks for Large-Scale Image Recognition

Abstract

In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
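The abstract's emphasis on very small (3x3) filters can be illustrated with some simple arithmetic. The sketch below is not taken from the abstract itself. It assumes stride-1 convolutions, equal input and output channel counts, and no biases, and the channel count of 256 is a hypothetical value chosen only for illustration. Under those assumptions, a stack of three 3x3 layers covers the same 7x7 receptive field as a single 7x7 layer while using fewer weights.

```python
def stacked_receptive_field(num_layers: int, kernel: int = 3) -> int:
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    # Each additional stride-1 layer widens the field by (kernel - 1) pixels.
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel: int, channels: int, num_layers: int = 1) -> int:
    """Weight count (biases ignored), assuming channels in == channels out."""
    return num_layers * kernel * kernel * channels * channels


CHANNELS = 256  # hypothetical channel width, for illustration only

rf_stacked = stacked_receptive_field(3)          # three 3x3 layers
weights_stacked = conv_weights(3, CHANNELS, 3)   # 27 * C^2
weights_single = conv_weights(7, CHANNELS)       # 49 * C^2

print(rf_stacked)        # 7
print(weights_stacked)   # 1769472
print(weights_single)    # 3211264
```

Stacking small filters also places a non-linearity between each pair of layers, which a single large filter does not.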

Keywords

Computer science, Convolution (computer science), Convolutional neural network, Artificial intelligence, Deep learning, Image (mathematics), Scale (ratio), Architecture, Contextual image classification, Pattern recognition (psychology), Machine learning, Computer vision, Artificial neural network, Cartography

Affiliated Institutions

University of Oxford GB

Related Publications

Going deeper with convolutions

Christian Szegedy, Wei Liu, Yangqing Jia +6 more

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Sca...

2015 45596 citations

Efficient object localization using Convolutional Networks

Jonathan Tompson, Ross Goroshin, Arjun Jain +2 more

Recent state-of-the-art performance on human-body pose estimation has been achieved with Deep Convolutional Networks (ConvNets). Traditional ConvNet architectures include poolin...

2015 1324 citations

Adding Conditional Control to Text-to-Image Diffusion Models

Lvmin Zhang, Anyi Rao, Maneesh Agrawala

We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-re...

2023 2023 IEEE/CVF International Conferenc... 2649 citations

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mingxing Tan, Quoc V. Le

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper...

2019 arXiv (Cornell University) 5008 citations

Feature Pyramid Networks for Object Detection

Tsung-Yi Lin, Piotr Dollár, Ross Girshick +3 more

Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors...

2017 2017 IEEE Conference on Computer Visi... 26836 citations

Publication Info

Year: 2014
Type: article
Citations: 75390
Access: Closed

Cite This

APA Style

Karen Simonyan, Andrew Zisserman (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1409.1556

Identifiers

DOI: 10.48550/arxiv.1409.1556