Due to the modality gap between visible and infrared images and their high visual ambiguity, learning diverse modality-shared semantic concepts for visible-infrared person re-identification (VI-ReID) remains a challenging problem. Body shape is a significant modality-shared cue for VI-ReID. To mine more diverse modality-shared cues, we expect that erasing body-shape-related semantic concepts from the learned features will force the ReID model to extract other, complementary modality-shared features for identification. To this end, we propose a shape-erased feature learning paradigm that decorrelates modality-shared features into two orthogonal subspaces. Jointly learning shape-related features in one subspace and shape-erased features in its orthogonal complement maximizes the conditional mutual information between the shape-erased features and identity while discarding body-shape information, thus explicitly enhancing the diversity of the learned representation. Extensive experiments on the SYSU-MM01, RegDB, and HITSZ-VCM datasets demonstrate the effectiveness of our method.
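As a rough illustration of the orthogonal-subspace idea only (not the paper's actual training procedure), the following NumPy sketch splits a feature vector into a shape-related component and a shape-erased component lying in the orthogonal complement; the feature and subspace dimensions and the random basis are hypothetical.

```python
import numpy as np

# Hypothetical dimensions: a 16-D feature split into a 4-D
# shape-related subspace and its 12-D orthogonal complement.
rng = np.random.default_rng(0)
feat_dim, shape_dim = 16, 4

# Random orthonormal basis for the whole feature space (via QR).
basis, _ = np.linalg.qr(rng.standard_normal((feat_dim, feat_dim)))
shape_basis = basis[:, :shape_dim]    # spans the shape-related subspace
erased_basis = basis[:, shape_dim:]   # spans the orthogonal complement

f = rng.standard_normal(feat_dim)     # a modality-shared feature vector

# Project the feature onto each subspace.
shape_feat = shape_basis @ (shape_basis.T @ f)     # shape-related part
erased_feat = erased_basis @ (erased_basis.T @ f)  # shape-erased part

# The two components are orthogonal and reconstruct the original feature,
# so information captured in one subspace is decorrelated from the other.
assert np.isclose(shape_feat @ erased_feat, 0.0, atol=1e-8)
assert np.allclose(shape_feat + erased_feat, f)
```

In the paper's setting, the shape-related subspace would be supervised with body-shape cues while identity discrimination is enforced on the shape-erased complement, encouraging the model to learn modality-shared features beyond body shape.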