Abstract

We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions. We show that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset (which Inception V3 was designed for), and significantly outperforms Inception V3 on a larger image classification dataset comprising 350 million images and 17,000 classes. Since the Xception architecture has the same number of parameters as Inception V3, the performance gains are not due to increased capacity but rather to a more efficient use of model parameters.
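As a concrete illustration of the operation the abstract describes, below is a minimal sketch (not code from the paper) of a depthwise separable convolution built as a depthwise convolution followed by a pointwise (1x1) convolution, written with standard Keras layers; the input shape and filter counts are arbitrary example values.

```python
# Minimal sketch (assumed example, not from the paper) contrasting a regular
# convolution with a depthwise separable convolution in Keras.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.random.normal((1, 32, 32, 64))  # batch, height, width, channels

# Regular convolution: learns spatial and cross-channel correlations jointly.
regular = layers.Conv2D(filters=128, kernel_size=3, padding="same")

# Depthwise separable convolution, written as two steps:
# 1) depthwise convolution: one 3x3 filter per input channel, no channel mixing
# 2) pointwise (1x1) convolution: mixes channels
depthwise = layers.DepthwiseConv2D(kernel_size=3, padding="same")
pointwise = layers.Conv2D(filters=128, kernel_size=1, padding="same")

# Keras also provides the fused operation directly.
separable = layers.SeparableConv2D(filters=128, kernel_size=3, padding="same")

print(regular(inputs).shape)               # (1, 32, 32, 128)
print(pointwise(depthwise(inputs)).shape)  # (1, 32, 32, 128)
print(separable(inputs).shape)             # (1, 32, 32, 128)
```

For this example, ignoring biases, the regular convolution has 3x3x64x128 = 73,728 weights, while the depthwise (3x3x64 = 576) plus pointwise (64x128 = 8,192) pair has about 8,768, which illustrates the abstract's point about more efficient use of model parameters rather than added capacity.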

Keywords

Convolution (computer science), Pointwise, Separable space, Convolutional neural network, Computer science, Artificial intelligence, Deep learning, Interpretation (philosophy), Pattern recognition (psychology), Image (mathematics), Algorithm, Artificial neural network, Mathematics, Mathematical analysis

Publication Info

Year: 2017
Type: Article
Pages: 1800-1807
Citations: 17,644
Access: Closed

Citation Metrics

17,644 (OpenAlex)

Cite This

François Chollet (2017). Xception: Deep Learning with Depthwise Separable Convolutions. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1800-1807. https://doi.org/10.1109/cvpr.2017.195

Identifiers

DOI: 10.1109/cvpr.2017.195