Abstract

We propose a new approach for estimating the positions of facial key points with a three-level cascade of carefully designed convolutional networks. At each level, the outputs of multiple networks are fused for robust and accurate estimation. Thanks to the deep structure of convolutional networks, global high-level features are extracted over the whole face region at the initialization stage, which helps to locate key points with high accuracy. This has two advantages. First, texture context information over the entire face is used to locate each key point. Second, since the networks are trained to predict all the key points simultaneously, the geometric constraints among key points are implicitly encoded. The method can therefore avoid local minima caused by ambiguity and data corruption in difficult image samples due to occlusions, large pose variations, and extreme lighting. The networks at the following two levels are trained to locally refine the initial predictions, and their inputs are limited to small regions around those predictions. Several network structures critical for accurate and robust facial point detection are investigated. Extensive experiments show that our approach outperforms state-of-the-art methods in both detection accuracy and reliability.
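The coarse-to-fine flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`fuse`, `crop_patch`, `cascade_predict`) are hypothetical, fusion by plain averaging is an assumption (the abstract only says that outputs are fused), and the networks are stood in for by arbitrary callables. Global callables are assumed to map a face image to a `(K, 2)` array of `(x, y)` points, and local callables to map a patch to one `(x, y)` position in patch coordinates.

```python
import numpy as np


def fuse(predictions):
    """Fuse several predictions by averaging them (assumed fusion rule)."""
    return np.mean(np.stack(predictions, axis=0), axis=0)


def crop_patch(image, center, half_size):
    """Crop a square patch around center (x, y), clipped to the image.

    Returns the patch and the (x, y) offset of its top-left corner.
    """
    h, w = image.shape[:2]
    cx, cy = int(round(center[0])), int(round(center[1]))
    x0, x1 = max(cx - half_size, 0), min(cx + half_size, w)
    y0, y1 = max(cy - half_size, 0), min(cy + half_size, h)
    return image[y0:y1, x0:x1], np.array([x0, y0], dtype=float)


def cascade_predict(image, global_nets, refine_levels):
    """Run a coarse-to-fine cascade.

    global_nets: callables that see the whole face and return (K, 2) points.
    refine_levels: list of (local_nets, half_size) pairs, one per later level.
    """
    # Level 1: every global predictor estimates all key points at once.
    points = fuse([net(image) for net in global_nets])

    # Later levels: refine each point from a small patch around it.
    for local_nets, half_size in refine_levels:
        refined = np.empty_like(points)
        for k, point in enumerate(points):
            patch, offset = crop_patch(image, point, half_size)
            # Local predictors answer in patch coordinates; shift back.
            refined[k] = offset + fuse([net(patch) for net in local_nets])
        points = refined
    return points
```

Passing two `(local_nets, half_size)` pairs with a shrinking `half_size` mirrors the two refinement levels mentioned above; the patch sizes themselves are illustrative and not taken from the paper.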

Keywords

Computer science, Artificial intelligence, Initialization, Key (lock), Convolutional neural network, Face (sociological concept), Pattern recognition (psychology), Context (archaeology), Point cloud, Reliability (semiconductor), Deep learning, Computer vision

Publication Info

Year
2013
Type
article
Citations
1427
Access
Closed

Citation Metrics

1427
OpenAlex

Cite This

Yi Sun, Xiaogang Wang, Xiaoou Tang (2013). Deep Convolutional Network Cascade for Facial Point Detection. https://doi.org/10.1109/cvpr.2013.446

Identifiers

DOI
10.1109/cvpr.2013.446