Discriminative Local Representation Learning for Cross-Modality Visible-Thermal Person Re-Identification

Yong Wu; Guo-Dui He; Lihua Wen; Xiao Qin; Changan Yuan; Valeriya Gribova; Vladimir Filaretov; De-Shuang Huang

doi:10.1109/tbiom.2022.3184525

JOURNAL ARTICLE

Year: 2022
Journal: IEEE Transactions on Biometrics, Behavior, and Identity Science
Vol: 5 (1)
Pages: 1-14
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Visible-thermal person re-identification (VTReID) is a rising and challenging cross-modality retrieval task in intelligent video surveillance systems. Most attention architectures cannot explore the discriminative person representations for VTReID, especially in the thermal modality. In addition, the fine-grained middle-level semantic information has received much less attention in the part-based approaches for the cross-modality pedestrian retrieval task, resulting in limited generalization capability and poor representation robustness. This paper proposes a simple yet powerful discriminative local representation learning (DLRL) model to capture the robust local fine-grained feature representations and explore the rich semantic relationship between the learned part features. Specifically, an efficient contextual attention aggregation module (CAAM) is designed to strengthen the discriminative capability of the feature representations and explore the contextual cues for visible and thermal modalities. Then, an integrated middle-high feature learning (IMHF) method is introduced to capture the part-level salient representations, which handles the ambiguous modality discrepancy in both discriminative middle-level and robust high-level information. Moreover, a part-guided graph convolution module (PGCM) is constructed to mine the structural relationship among the part representations within each modality. The quantitative and qualitative experiments on the two benchmark datasets demonstrate that the proposed DLRL model significantly outperforms state-of-the-art methods and achieves rank-1/mAP accuracy of 92.77%/82.05% on the RegDB dataset and 63.04%/60.58% on the SYSU-MM01 dataset.
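The abstract reports results as rank-1 accuracy and mAP. As a rough illustration of what those two numbers measure, the sketch below computes them for a generic query-versus-gallery retrieval run, for instance thermal queries against a visible gallery. It is a simplified sketch and not the authors' evaluation code: the function name `rank1_and_map`, the use of cosine similarity, and the toy features are all assumptions made for illustration, and dataset-specific protocol details (such as camera-based filtering) are left out.

```python
import numpy as np


def rank1_and_map(query_feats, query_ids, gallery_feats, gallery_ids):
    """Return (rank-1 accuracy, mean average precision) for one retrieval run.

    Hypothetical helper for illustration; similarity is cosine similarity.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sim = q @ g.T

    # For every query, gallery indices sorted from most to least similar.
    order = np.argsort(-sim, axis=1)
    # matches[i, k] is True when the k-th ranked gallery item shares query i's identity.
    matches = gallery_ids[order] == query_ids[:, None]

    rank1_hits, average_precisions = [], []
    for row in matches:
        if not row.any():
            continue  # a query with no true match in the gallery is skipped
        rank1_hits.append(float(row[0]))
        hit_positions = np.flatnonzero(row)  # 0-based ranks of the true matches
        # Precision measured at each rank where a true match appears.
        precisions = np.arange(1, len(hit_positions) + 1) / (hit_positions + 1)
        average_precisions.append(precisions.mean())

    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))


# Toy example: 2 queries, 3 gallery items, 2-D features (made-up values).
query_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
query_ids = np.array([0, 1])
gallery_feats = np.array([[1.0, 0.1], [0.1, 1.0], [0.9, 0.5]])
gallery_ids = np.array([0, 0, 1])

r1, mean_ap = rank1_and_map(query_feats, query_ids, gallery_feats, gallery_ids)
print(f"rank-1 = {r1:.4f}, mAP = {mean_ap:.4f}")
```

Rank-1 only asks whether the top-ranked gallery item has the correct identity, whereas mAP rewards placing every correct gallery item near the top of the ranking, which is why the two figures can differ noticeably on the same dataset.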

Keywords:

Discriminative model; Computer science; Artificial intelligence; Feature learning; Pattern recognition (psychology); Modality (human–computer interaction); Feature (linguistics); Robustness (evolution); Salient; Machine learning; Representation (politics); Graph; Natural language processing

Metrics

FWCI (Field Weighted Citation Impact): 2.72

Citation Normalized Percentile: 0.89

Topics

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Enhancing the discriminative feature learning for visible-thermal cross-modality person re-identification

Hierarchical Discriminative Learning for Visible Thermal Person Re-Identification

Keypoint-Guided Modality-Invariant Discriminative Learning for Visible-Infrared Person Re-identification

Modality-aware Collaborative Learning for Visible Thermal Person Re-Identification

Cross-modality consistency learning for visible-infrared person re-identification