DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Abstract

In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks (DCNNs). It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state of the art on the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7 percent mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
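To make the abstract's claim about atrous convolution concrete, the following is a minimal 1-D sketch in plain Python. It is an illustration only, not the authors' released code: the function names `atrous_conv1d` and `effective_kernel_size`, the toy signal, and the toy kernel are all assumptions introduced here. It shows that raising the sampling rate widens the filter's field of view while the number of filter taps, and hence the number of multiplications per output, stays the same.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous (dilated) convolution, 'valid' mode, no kernel flipping.

    Hypothetical helper for illustration. Conceptually the kernel has
    (rate - 1) zeros inserted between its taps, but the zeros are never
    multiplied: each output still costs len(kernel) products.
    """
    k = len(kernel)
    span = effective_kernel_size(k, rate)   # input samples covered per output
    n_out = len(signal) - span + 1          # 'valid' output length
    return [
        sum(kernel[j] * signal[i + j * rate] for j in range(k))
        for i in range(n_out)
    ]


def effective_kernel_size(k, rate):
    """Field of view of a k-tap filter applied with the given rate."""
    return k + (k - 1) * (rate - 1)


# Toy data (assumed for the example): a ramp and a 3-tap difference filter.
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # field of view: 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # field of view: 5 samples

# Probing the same input at several rates in parallel is the core idea
# behind ASPP; a real ASPP module would also pad and fuse the branches.
multi_rate = {r: atrous_conv1d(signal, kernel, rate=r) for r in (1, 2, 3)}
```

With rate 2 the same three weights compare samples four positions apart instead of two, so the filter sees more context at no extra parameter cost. In a 2-D network the same idea is typically exposed as a dilation argument on the convolution layer.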

Keywords

Conditional random field, Artificial intelligence, Computer science, Pattern recognition (psychology), Upsampling, Convolutional neural network, Pascal (unit), Markov random field, Segmentation, Convolution (computer science), CRFS, Image segmentation, Contextual image classification, Pooling, Object detection, Feature (linguistics), Feature extraction, Image (mathematics), Artificial neural network

Related Publications

Rethinking Atrous Convolution for Semantic Image Segmentation

Liang-Chieh Chen, George Papandreou, Florian Schroff +1 more

In this work, we revisit atrous convolution, a powerful tool to explicitly adjust filter's field-of-view as well as control the resolution of feature responses computed by Deep ...

2017 arXiv (Cornell University) 7401 citations

DenseASPP for Semantic Segmentation in Street Scenes

Maoke Yang, Kun Yu, Chi Zhang +2 more

Semantic image segmentation is a basic street scene understanding task in autonomous driving, where each pixel in a high resolution image is categorized into a set of semantic l...

2018 2018 IEEE/CVF Conference on Computer ... 1566 citations

Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation

Guosheng Lin, Chunhua Shen, Anton van den Hengel +1 more

Recent advances in semantic image segmentation have mostly been achieved by training deep convolutional neural networks (CNNs). We show how to improve semantic segmentation thro...

2016 844 citations

Context Encoding for Semantic Segmentation

Hang Zhang, Kristin Dana, Jianping Shi +4 more

Recent work has made significant progress in improving spatial resolution for pixelwise labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous con...

2018 2018 IEEE/CVF Conference on Computer ... 1436 citations

Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network

Chao Peng, Xiangyu Zhang, Gang Yu +2 more

One of recent trends [31, 32, 14] in network architecture design is stacking small filters (e.g., 1×1 or 3×3) in the entire network because the stacked small filters is more eff...

2017 1691 citations

Publication Info

Year: 2017
Type: article
Volume: 40
Issue: 4
Pages: 834-848
Citations: 20855
Access: Closed

Cite This

APA Style

Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos et al. (2017). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834-848. https://doi.org/10.1109/tpami.2017.2699184

Identifiers

DOI: 10.1109/tpami.2017.2699184