Abstract

Visual appearance score, appearance mixture type, and deformation are three important information sources for human pose estimation. This paper proposes to build a multi-source deep model that extracts a non-linear representation from these different information sources. With the deep model, global, high-order human body articulation patterns in these information sources are extracted for pose estimation. The task of estimating body-part locations and the task of human detection are jointly learned with a unified deep model. The proposed approach can be viewed as a post-processing step for pose estimation results and can be flexibly integrated with existing methods by taking their information sources as input. By extracting a non-linear representation from multiple information sources, the deep model outperforms the state of the art by up to 8.6 percent on three public benchmark datasets.
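The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the part count, mixture count, layer sizes, and weights below are all hypothetical. It shows three per-part information sources (appearance score, one-hot mixture type, deformation offsets) concatenated and passed through a shared non-linear hidden layer, with two output heads mirroring the jointly learned tasks of body-location estimation and human detection.

```python
import numpy as np

rng = np.random.default_rng(0)

n_parts = 14  # assumed number of body parts (hypothetical)
n_mix = 8     # assumed mixture types per part (hypothetical)

# Hypothetical inputs for one candidate pose.
appearance = rng.standard_normal(n_parts)                          # per-part appearance scores
mixture = np.eye(n_mix)[rng.integers(0, n_mix, n_parts)].ravel()   # one-hot mixture types
deformation = rng.standard_normal(n_parts * 2)                     # per-part (dx, dy) offsets

# Concatenate the three information sources into one input vector.
x = np.concatenate([appearance, mixture, deformation])

# A shared hidden layer extracts a non-linear joint representation.
W1 = rng.standard_normal((64, x.size)) * 0.1
h = np.tanh(W1 @ x)

# Two heads on the shared representation: part-location regression
# and a binary human-vs-background detection score.
W_loc = rng.standard_normal((n_parts * 2, 64)) * 0.1
W_det = rng.standard_normal(64) * 0.1
locations = W_loc @ h                              # refined (x, y) for each part
detection = 1.0 / (1.0 + np.exp(-(W_det @ h)))     # sigmoid detection score
```

Because both heads share `h`, gradients from the detection task would shape the same representation used for localization, which is the sense in which the two tasks are "jointly learned" in the abstract.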

Keywords

Benchmark (surveying), Computer science, Artificial intelligence, Pose, Representation (politics), Deep learning, Task (project management), Machine learning, Pattern recognition (psychology), Task analysis, Estimation

Publication Info

Year
2014
Type
article
Pages
2337-2344
Citations
273
Access
Closed

Citation Metrics

273 (OpenAlex)

Cite This

Wanli Ouyang, Xiao Chu, Xiaogang Wang (2014). Multi-source Deep Learning for Human Pose Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2337-2344. https://doi.org/10.1109/cvpr.2014.299

Identifiers

DOI
10.1109/cvpr.2014.299