Yujian Feng, Feng Chen, Jian Yu, Yimu Ji, Fei Wu, Tianliang Liu, Shangdong Liu, Xiao‐Yuan Jing, Jiebo Luo
Video-based visible-infrared person re-identification (VVI-ReID) aims to match the identity of a person captured in video sequences from both visible and infrared cameras. The task requires modeling both the spatial relationships between body parts within each frame and the temporal changes in appearance across successive frames. Existing VVI-ReID methods employ convolutional neural networks (CNNs) to extract local spatial features and long short-term memory (LSTM) networks to form temporal associations. However, these methods cannot effectively capture global spatial features or long-range temporal dependencies in ultra-long sequences. In this paper, we propose a Cross-modality Spatial-temporal Transformer (CST), comprising a Cross-frame Tube Transformer Module (CTTM) and a Multi-frame Transformer Fusion Module (MTFM), to address these challenges. First, CTTM tokenizes a video clip into multiple 3D tubes, each encapsulating local spatial-temporal information about a pedestrian, and then obtains global spatial-temporal representations by modeling the relationships between tubes. Second, MTFM exchanges information across multiple frames using message tokens, thereby modeling the long-range temporal dependencies of pedestrian features. In addition, to prevent the representation collapse that triplet-based loss functions can cause, we propose a diversity-consistency (DC) loss that preserves the diversity and consistency of cross-modality feature representations by imposing variance, invariance, and covariance constraints on them. Extensive benchmark experiments demonstrate that our approach outperforms state-of-the-art methods by large margins.
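The tube tokenization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' CTTM implementation: the function name `tubify`, the non-overlapping tube layout, the single-channel toy clip, and the tube size of 2x2x2 are all assumptions made for illustration, and the abstract does not say whether the actual tubes overlap or how they are projected into token embeddings.

```python
def tubify(clip, tube_t, tube_h, tube_w):
    """Split a clip indexed as clip[t][y][x] into non-overlapping 3D tubes.

    Each tube spans tube_t frames and a tube_h x tube_w spatial patch, and is
    returned flattened into a single token (a list of values).
    Hypothetical helper: the layout is an assumption, not the paper's method.
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    if T % tube_t or H % tube_h or W % tube_w:
        raise ValueError("clip dimensions must be divisible by the tube size")
    tokens = []
    for t0 in range(0, T, tube_t):
        for y0 in range(0, H, tube_h):
            for x0 in range(0, W, tube_w):
                # Gather every value inside this tube, frame by frame.
                tokens.append([
                    clip[t][y][x]
                    for t in range(t0, t0 + tube_t)
                    for y in range(y0, y0 + tube_h)
                    for x in range(x0, x0 + tube_w)
                ])
    return tokens


# Toy clip: 4 frames of 4x4 single-channel "pixels"; each value encodes
# its own position as t*100 + y*10 + x so the tubes are easy to inspect.
clip = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)]
        for t in range(4)]
tokens = tubify(clip, tube_t=2, tube_h=2, tube_w=2)
```

Each resulting token mixes pixels from neighbouring frames as well as neighbouring positions, which is what lets a tube carry local spatial-temporal information; a transformer would then apply self-attention across these tokens to relate the tubes to one another.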
Kongzhu Jiang, Tianzhu Zhang, Xiang Liu, Bingqiao Qian, Yongdong Zhang, Feng Wu
Ranjit Kumar Mishra, Arijit Mondal, Jimson Mathew
Yujian Feng, Jian Yu, Feng Chen, Yimu Ji, Fei Wu, Shangdong Liu, Xiao‐Yuan Jing
Tengfei Liang, Yi Jin, Wu Liu, Yidong Li
Mingfu Xiong, Jian-Jia Liang, Yifei Guo, Ik Hyun Lee, Sambit Bakshi, Khan Muhammad