Abstract

Deep Convolutional Networks (ConvNets) are fundamental not only to large-scale visual recognition but also to many other vision tasks. Because the primary goal of ConvNets is to characterize the complex boundaries of thousands of classes in a high-dimensional space, learning higher-order representations is critical for enhancing their non-linear modeling capability. Recently, Global Second-order Pooling (GSoP), plugged in at the end of networks, has attracted increasing attention, achieving much better performance than classical first-order networks in a variety of vision tasks. However, how to effectively introduce higher-order representations in earlier layers to improve the non-linear capability of ConvNets remains an open problem. In this paper, we propose a novel network model that introduces GSoP from lower to higher layers to exploit holistic image information throughout a network. Given an input 3D tensor output by a previous convolutional layer, we perform GSoP to obtain a covariance matrix which, after a nonlinear transformation, is used to scale the tensor along the channel dimension. Similarly, we can perform GSoP along the spatial dimension for tensor scaling. In this way, we make full use of the second-order statistics of the holistic image throughout a network. The proposed networks are thoroughly evaluated on large-scale ImageNet-1K, and experiments show that they outperform their counterparts by a non-trivial margin while achieving state-of-the-art results.
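
The channel-wise scaling step described in the abstract can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the nonlinear transformation, so the code assumes a hypothetical one (a learned weighting of each covariance row followed by a sigmoid). The names `channel_covariance`, `gsop_channel_scale`, and `weights` are introduced here for illustration only.

```python
import numpy as np


def channel_covariance(x):
    """Covariance between channels, computed over all spatial positions.

    x has shape (C, H, W); the result has shape (C, C).
    """
    c, h, w = x.shape
    feats = x.reshape(c, h * w)                       # one row per channel
    centered = feats - feats.mean(axis=1, keepdims=True)
    return centered @ centered.T / (h * w)


def gsop_channel_scale(x, weights):
    """Scale each channel of x by a gate derived from its covariance row.

    ASSUMPTION: the nonlinear transformation is a per-row weighted sum
    followed by a sigmoid. `weights` (C, C) stands in for learned parameters.
    """
    cov = channel_covariance(x)
    logits = np.sum(weights * cov, axis=1)            # one scalar per channel
    gates = 1.0 / (1.0 + np.exp(-logits))             # sigmoid, values in (0, 1)
    return x * gates[:, None, None], gates            # broadcast over H and W


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))                    # toy tensor: C=8, H=W=5
weights = rng.standard_normal((8, 8)) * 0.1           # hypothetical parameters
y, gates = gsop_channel_scale(x, weights)
print(y.shape, gates.shape)
```

The output tensor keeps the input shape, so a block of this kind could in principle sit between convolutional layers. A spatial variant would presumably treat spatial positions, rather than channels, as the variables of the covariance matrix; the abstract gives no further detail on that step.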

Keywords

Pooling, Computer science, Tensor (intrinsic definition), Convolutional neural network, Artificial intelligence, Representation (politics), Transformation (genetics), Scaling, Dimension (graph theory), Covariance, Pattern recognition (psychology), Machine learning, Theoretical computer science, Mathematics

Publication Info

Year: 2019
Type: article
Pages: 3019-3028
Citations: 461
Access: Closed

Cite This

Zilin Gao, Jiangtao Xie, Qilong Wang et al. (2019). Global Second-Order Pooling Convolutional Networks. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3019-3028. https://doi.org/10.1109/cvpr.2019.00314

Identifiers

DOI: 10.1109/cvpr.2019.00314