Abstract

Convolutional networks for image classification progressively reduce resolution until the image is represented by tiny feature maps in which the spatial structure of the scene is no longer discernible. Such loss of spatial acuity can limit image classification accuracy and complicate the transfer of the model to downstream applications that require detailed scene understanding. These problems can be alleviated by dilation, which increases the resolution of output feature maps without reducing the receptive field of individual neurons. We show that dilated residual networks (DRNs) outperform their non-dilated counterparts in image classification without increasing the model's depth or complexity. We then study gridding artifacts introduced by dilation, develop an approach to removing these artifacts (degridding), and show that this further increases the performance of DRNs. In addition, we show that the accuracy advantage of DRNs is further magnified in downstream applications such as object localization and semantic segmentation.
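To illustrate the dilation idea the abstract describes, here is a minimal 1-D sketch (not the paper's implementation; the function name and NumPy formulation are illustrative). Spacing the kernel taps `dilation` samples apart enlarges the receptive field without adding parameters or reducing output resolution by pooling:

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """1-D 'valid' correlation with a dilated kernel (illustrative sketch).

    A k-tap kernel with the given dilation covers
    (k - 1) * dilation + 1 input samples per output value,
    so the receptive field grows while the number of weights stays k.
    """
    k = len(w)
    span = (k - 1) * dilation + 1  # effective receptive field
    return np.array([
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

# A 3-tap averaging kernel: dilation 1 sees 3 samples, dilation 2 sees 5.
x = np.arange(8, dtype=float)
w = np.array([1.0, 1.0, 1.0])
print(dilated_conv1d(x, w, dilation=1))  # [ 3.  6.  9. 12. 15. 18.]
print(dilated_conv1d(x, w, dilation=2))  # [ 6.  9. 12. 15.]
```

The same principle applies per spatial axis in the 2-D convolutions of a DRN: replacing stride with dilation in the final residual blocks keeps feature maps at higher resolution while preserving each neuron's receptive field.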

Keywords

Dilation, Residual networks, Artificial intelligence, Computer science, Segmentation, Feature extraction, Pattern recognition, Computer vision, Image segmentation, Image resolution, Receptive field, Algorithm, Mathematics

Publication Info

Year: 2017
Type: Article
Citations: 1692
Access: Closed

Citation Metrics

Citations: 1692 (OpenAlex)

Cite This

Fisher Yu, Vladlen Koltun, Thomas Funkhouser (2017). Dilated Residual Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2017.75

Identifiers

DOI
10.1109/cvpr.2017.75