Abstract

Feature extraction, deformation handling, occlusion handling, and classification are four important components in pedestrian detection. Existing methods learn or design these components either individually or sequentially. The interaction among these components is not yet well explored. This paper proposes that they should be jointly learned in order to maximize their strengths through cooperation. We formulate these four components into a joint deep learning framework and propose a new deep network architecture. By establishing automatic, mutual interaction among components, the deep model achieves a 9% reduction in the average miss rate compared with the current best-performing pedestrian detection approaches on the largest Caltech benchmark dataset.
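The abstract's four components can be caricatured as one tiny forward pass. The sketch below is a hedged, minimal numpy illustration under invented names (`extract_features`, `handle_deformation`, `handle_occlusion`, `classify`); it is not the paper's actual network, only a toy showing how the four stages compose into a single differentiable pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(locations, W):
    # Feature extraction: linear filters + ReLU (a stand-in for conv layers).
    return np.maximum(locations @ W, 0.0)          # (n_locations, n_parts)

def handle_deformation(part_maps, penalties):
    # Deformation handling: each part takes its best-scoring location,
    # paying a placement penalty (a max-pooling view of a deformable part).
    return (part_maps - penalties).max(axis=0)     # (n_parts,)

def handle_occlusion(part_scores, visibility_logits):
    # Occlusion handling: sigmoid visibility weights down-weight parts
    # that are likely occluded.
    return part_scores * (1.0 / (1.0 + np.exp(-visibility_logits)))

def classify(gated_scores, w, b):
    # Classification: logistic regression on the visibility-gated scores.
    z = gated_scores @ w + b
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 5 candidate locations, 16-dim appearance, 4 body parts.
locations = rng.normal(size=(5, 16))
W = rng.normal(size=(16, 4))
penalties = rng.uniform(0.0, 0.5, size=(5, 4))
vis = rng.normal(size=4)
w = rng.normal(size=4)
b = 0.0

feats = extract_features(locations, W)
part_scores = handle_deformation(feats, penalties)
gated = handle_occlusion(part_scores, vis)
prob = classify(gated, w, b)
print(float(prob))
```

Because all four stages are expressed as one composed function, their parameters (`W`, `penalties`, `vis`, `w`, `b`) could be trained jointly by backpropagating a single loss, which is the point the abstract makes against learning the components individually or sequentially.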

Keywords

Pedestrian detection, Computer science, Deep learning, Benchmark (surveying), Artificial intelligence, Joint (building), Pedestrian, Feature extraction, Machine learning, Pattern recognition (psychology), Data mining, Engineering


Publication Info

Year: 2013
Type: article
Citations: 681
Access: Closed


Citation Metrics

681 (source: OpenAlex)

Cite This

Wanli Ouyang, Xiaogang Wang (2013). Joint Deep Learning for Pedestrian Detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV). https://doi.org/10.1109/iccv.2013.257

Identifiers

DOI
10.1109/iccv.2013.257