Abstract
We propose a new approach for estimation of the positions of facial key points with three-level carefully designed convolutional networks. At each level, the outputs of multiple networks are fused for robust and accurate estimation. Thanks to the deep structures of convolutional networks, global high-level features are extracted over the whole face region at the initialization stage, which help to locate high accuracy key points. There are two folds of advantage for this. First, the texture context information over the entire face is utilized to locate each key point. Second, since the networks are trained to predict all the key points simultaneously, the geometric constraints among key points are implicitly encoded. The method therefore can avoid local minimum caused by ambiguity and data corruption in difficult image samples due to occlusions, large pose variations, and extreme lightings. The networks at the following two levels are trained to locally refine initial predictions and their inputs are limited to small regions around the initial predictions. Several network structures critical for accurate and robust facial point detection are investigated. Extensive experiments show that our approach outperforms state-of-the-art methods in both detection accuracy and reliability.
Keywords
Affiliated Institutions
Related Publications
RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild
Though tremendous strides have been made in uncontrolled face detection, accurate and efficient 2D face alignment and 3D face reconstruction in-the-wild remain an open challenge...
Object Detection With Deep Learning: A Review
Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection ...
CosFace: Large Margin Cosine Loss for Deep Face Recognition
Face recognition has made extraordinary progress owing to the advancement of deep convolutional neural networks (CNNs). The central task of face recognition, including face veri...
A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
Recent works have shown that facial attributes are useful in a number of applications such as face recognition and retrieval. However, estimating attributes in images with large...
Cascaded Partial Decoder for Fast and Accurate Salient Object Detection
Existing state-of-the-art salient object detection networks rely on aggregating multi-level features of pre-trained convolutional neural networks (CNNs). However, compared to hi...
Publication Info
- Year
- 2013
- Type
- article
- Citations
- 1427
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1109/cvpr.2013.446