Abstract

Existing person re-identification (re-id) methods either assume that well-aligned person bounding box images are available as model input, or rely on constrained attention selection mechanisms to calibrate misaligned images. They are therefore sub-optimal for re-id matching on arbitrarily aligned person images, which may contain large human pose variations and unconstrained auto-detection errors. In this work, we show the advantages of jointly learning attention selection and feature representation in a Convolutional Neural Network (CNN) by maximising the complementary information of different levels of visual attention subject to re-id discriminative learning constraints. Specifically, we formulate a novel Harmonious Attention CNN (HA-CNN) model for the joint learning of soft pixel attention and hard regional attention, along with simultaneous optimisation of feature representations, dedicated to optimising person re-id in uncontrolled (misaligned) images. Extensive comparative evaluations validate the superiority of this new HA-CNN model for person re-id over a wide variety of state-of-the-art methods on three large-scale benchmarks: CUHK03, Market-1501, and DukeMTMC-ReID.
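
The two attention types named in the abstract can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the HA-CNN architecture. The function names are hypothetical, and the gate scores and region coordinates are passed in by hand, whereas the abstract says they are learned jointly with the features. Concatenating the two branch outputs is also an assumption made for illustration. Soft pixel attention reweights every location by a value in (0, 1), while hard regional attention keeps one rectangular region and discards the rest.

```python
import math


def soft_pixel_attention(feature_map, scores):
    """Soft attention: scale each pixel by sigmoid(score), a gate in (0, 1)."""
    return [
        [value / (1.0 + math.exp(-score)) for value, score in zip(f_row, s_row)]
        for f_row, s_row in zip(feature_map, scores)
    ]


def hard_regional_attention(feature_map, top, left, height, width):
    """Hard attention: keep one rectangular region, drop everything else."""
    return [row[left:left + width] for row in feature_map[top:top + height]]


def combine_attention_branches(feature_map, scores, region):
    """Flatten both branch outputs and concatenate them (illustrative assumption)."""
    soft = soft_pixel_attention(feature_map, scores)
    hard = hard_regional_attention(feature_map, *region)
    flat_soft = [v for row in soft for v in row]
    flat_hard = [v for row in hard for v in row]
    return flat_soft + flat_hard


# Toy 2x2 feature map; zero scores give a gate of exactly 0.5 everywhere.
toy_features = [[1.0, 2.0], [3.0, 4.0]]
toy_scores = [[0.0, 0.0], [0.0, 0.0]]
toy_region = (0, 1, 2, 1)  # top, left, height, width: the right-hand column

combined = combine_attention_branches(toy_features, toy_scores, toy_region)
print(combined)
```

The soft branch keeps the full spatial layout and only rescales it. The hard branch changes the shape of its output by cropping. That difference is why the two can carry complementary information.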

Keywords

Computer science, Discriminative model, Artificial intelligence, Convolutional neural network, Feature learning, Minimum bounding box, Feature (linguistics), Bounding overwatch, Matching (statistics), Identification (biology), Machine learning, Pattern recognition (psychology), Feature extraction, Representation (politics), Attention network, Computer vision, Image (mathematics)

Publication Info

Year: 2018
Type: preprint
Citations: 1377
Access: Closed

Citation Metrics

1377 (OpenAlex)

Cite This

Wei Li, Xiatian Zhu, Shaogang Gong (2018). Harmonious Attention Network for Person Re-identification. https://doi.org/10.1109/cvpr.2018.00243

Identifiers

DOI: 10.1109/cvpr.2018.00243