Abstract

The volume of convolutional neural network (CNN) models proposed for face recognition has grown continuously to better fit the large amount of training data. When training data are obtained from the Internet, the labels are likely to be ambiguous and inaccurate. This paper presents a Light CNN framework to learn a compact embedding on large-scale face data with massive noisy labels. First, we introduce a variation of maxout activation, called max-feature-map (MFM), into each convolutional layer of the CNN. Different from maxout activation that uses many feature maps to linearly approximate an arbitrary convex activation function, MFM does so via a competitive relationship. MFM can not only separate noisy and informative signals but also play the role of feature selection between two feature maps. Second, three networks are carefully designed to obtain better performance while reducing the number of parameters and computational costs. Finally, a semantic bootstrapping method is proposed to make the prediction of the networks more consistent with noisy labels. Experimental results show that the proposed framework can utilize large-scale noisy data to learn a Light model that is efficient in computational cost and storage space. The learned single network with a 256-D representation achieves state-of-the-art results on various face benchmarks without fine-tuning.
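
The competitive relationship behind MFM can be illustrated with a small sketch. It assumes the common two-way form of the operation, in which the input channels are split into two halves and each output value is the element-wise maximum of a paired pair of channels. The function name `max_feature_map`, the `FeatureMap` alias, and the pairing of channel k with channel k + N/2 are illustrative choices made here, not code from the paper.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, stored as height x width


def max_feature_map(channels: List[FeatureMap]) -> List[FeatureMap]:
    """Sketch of a two-way MFM: halve the channel count by element-wise max.

    Assumption: channel k competes with channel k + N/2, where N is the
    number of input channels.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of input channels")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [
        [
            [max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(map_a, map_b)
        ]
        for map_a, map_b in zip(first, second)
    ]


# Four 2x2 input channels; channel 0 competes with 2, channel 1 with 3.
x = [
    [[1.0, -2.0], [0.5, 3.0]],
    [[0.0, 0.0], [4.0, -1.0]],
    [[-1.0, 5.0], [0.5, 2.0]],
    [[2.0, -3.0], [1.0, -1.0]],
]
y = max_feature_map(x)
print(y)  # two output channels, each the winner of one pair
```

Unlike ReLU, which thresholds each value against zero, this operation keeps whichever of the two competing responses is larger, so the output has half as many channels as the input. That halving is one way such a layer can contribute to a compact model.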

Keywords

Computer science; Convolutional neural network; Artificial intelligence; Embedding; Pattern recognition (psychology); Feature (linguistics); Facial recognition system; Representation (politics); Face (sociological concept); Deep learning; Activation function; Machine learning; Artificial neural network

Related Publications

Network In Network

Abstract: We propose a novel deep network structure called Network In Network (NIN) to enhance model discriminability for local patches within the receptive field. The conventional con...

2014 arXiv (Cornell University) 1037 citations

Publication Info

Year: 2018
Type: article
Volume: 13
Issue: 11
Pages: 2884-2896
Citations: 1121
Access: Closed

Citation Metrics

1121 (OpenAlex)

Cite This

Xiang Wu, Ran He, Zhenan Sun et al. (2018). A Light CNN for Deep Face Representation With Noisy Labels. IEEE Transactions on Information Forensics and Security, 13(11), 2884-2896. https://doi.org/10.1109/tifs.2018.2833032

Identifiers

DOI: 10.1109/tifs.2018.2833032