Abstract
Siamese networks have become a common structure in various recent models for unsupervised visual representation learning. These models maximize the similarity between two augmentations of one image, subject to certain conditions for avoiding collapsing solutions. In this paper, we report surprising empirical results that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders. Our experiments show that collapsing solutions do exist for the loss and structure, but a stop-gradient operation plays an essential role in preventing collapsing. We provide a hypothesis on the implication of stop-gradient, and further show proof-of-concept experiments verifying it. Our "SimSiam" method achieves competitive results on ImageNet and downstream tasks. We hope this simple baseline will motivate people to rethink the roles of Siamese architectures for unsupervised representation learning. Code is made available.
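The abstract describes the core mechanism: two augmentations of the same image pass through a shared encoder, a predictor head maps one branch onto the other, and a stop-gradient on the target branch prevents collapse. Below is a minimal sketch of such a training step, assuming PyTorch; `encoder` and `predictor` are placeholder modules standing in for the backbone-plus-projection-MLP and the prediction MLP, not the authors' exact architecture.

```python
import torch
import torch.nn.functional as F

def negative_cosine(p, z):
    # z.detach() is the stop-gradient: no gradients flow into the target branch.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

def simsiam_loss(encoder, predictor, x1, x2):
    # x1, x2: two random augmentations of the same image batch.
    z1, z2 = encoder(x1), encoder(x2)      # projections
    p1, p2 = predictor(z1), predictor(z2)  # predictions
    # Symmetrized loss; note there are no negative pairs, no large-batch
    # requirement, and no momentum encoder.
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```

In this sketch, removing the `.detach()` call corresponds to the collapsing setting the paper analyzes.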
Publication Info
- Year: 2021
- Type: article
- Citations: 3039
Identifiers
- DOI: 10.1109/cvpr46437.2021.01549