Understanding deep image representations by inverting them

Abstract

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this paper we conduct a direct analysis of the visual information contained in representations by asking the following question: given an encoding of an image, to what extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can invert representations such as HOG more accurately than recent alternatives while being applicable to CNNs too. We then use this technique to study the inverse of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
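The question posed in the abstract (given an encoding, how much of the image can be recovered?) can be illustrated with a small optimization sketch. The code below is not the paper's implementation: it assumes inversion is posed as minimizing the squared distance between the code of a candidate image and the target code, plus a simple L2 regularizer, and it uses a hypothetical toy representation `tanh(W x)` in place of HOG or a CNN layer. The names `represent`, `invert`, `W`, `lam`, and `step` are all illustrative assumptions.

```python
import math
import random

def represent(W, x):
    """Toy representation phi(x) = tanh(W x); a hypothetical stand-in for HOG or a CNN layer."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def loss_and_grad(W, x, target, lam):
    """Return ||phi(x) - target||^2 + lam * ||x||^2 and its gradient with respect to x."""
    phi = represent(W, x)
    residual = [p - t for p, t in zip(phi, target)]
    loss = sum(r * r for r in residual) + lam * sum(v * v for v in x)
    # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
    back = [2.0 * r * (1.0 - p * p) for r, p in zip(residual, phi)]
    grad = [
        sum(W[i][j] * back[i] for i in range(len(W))) + 2.0 * lam * x[j]
        for j in range(len(x))
    ]
    return loss, grad

def invert(W, target, n, lam=1e-3, step=0.05, iters=500):
    """Plain gradient descent from a zero image toward one whose code matches the target."""
    x = [0.0] * n
    history = []
    for _ in range(iters):
        loss, grad = loss_and_grad(W, x, target, lam)
        history.append(loss)
        x = [v - step * g for v, g in zip(x, grad)]
    history.append(loss_and_grad(W, x, target, lam)[0])
    return x, history

random.seed(0)
n, m = 16, 8  # 16 "pixels" mapped to an 8-dimensional code, so the encoding is lossy
W = [[random.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)] for _ in range(m)]
x_true = [random.uniform(-1.0, 1.0) for _ in range(n)]
target = represent(W, x_true)  # the only information the inversion is given

x_hat, history = invert(W, target, n)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

Because the toy code has fewer dimensions than the input, many different inputs share the same encoding, so `x_hat` generally differs from `x_true` even when the loss is small. That gap is a crude analogue of the invariances the abstract refers to.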

Keywords

Convolutional neural network, Artificial intelligence, Image (mathematics), Computer science, Scale-invariant feature transform, Bag-of-words model in computer vision, Encoding (memory), Pattern recognition (psychology), Computer vision, Visualization, Visual Word, Image retrieval

Affiliated Institutions

University of Oxford (GB)

Related Publications

Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks

Maxime Oquab, Léon Bottou, Ivan Laptev +1 more

Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success o...

2014 3151 citations

Designing deep networks for surface normal estimation

Xiaolong Wang, David F. Fouhey, Abhinav Gupta

In the past few years, convolutional neural nets (CNN) have shown incredible promise for learning visual representations. In this paper, we use CNNs for the task of predicting s...

2015 345 citations

HOGgles: Visualizing Object Detection Features

Carl Vondrick, Aditya Khosla, Tomasz Malisiewicz +1 more

We introduce algorithms to visualize feature spaces used by object detectors. The tools in this paper allow a human to put on 'HOG goggles' and perceive the visual world as a HO...

2013 284 citations

Conditional Random Fields as Recurrent Neural Networks

Shuai Zheng, Sadeep Jayasumana, Bernardino Romera‐Paredes +5 more

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep lear...

2015 2381 citations

Image Style Transfer Using Convolutional Neural Networks

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

Rendering the semantic content of an image in different styles is a difficult image processing task. Arguably, a major limiting factor for previous approaches has been the lack ...

2016 5772 citations

Publication Info

Year: 2015
Type: article
Pages: 5188-5196
Citations: 1831
Access: Closed

Cite This

APA Style

Aravindh Mahendran, Andrea Vedaldi (2015). Understanding deep image representations by inverting them. 5188-5196. https://doi.org/10.1109/cvpr.2015.7299155

Identifiers

DOI: 10.1109/cvpr.2015.7299155