Xiao Jiang, Xinyu Wang, Chen Sun, Zengliang Zhu, Yanfei Zhong
Hyperspectral object tracking (HOT) aims to track targets using the rich spectral information in hyperspectral video. Recently, the dual Siamese network (DSN) has been proposed for HOT and achieves advanced performance by integrating an RGB Siamese branch with a hyperspectral Siamese branch to address the small-sample challenge of the hyperspectral modality. However, two challenges still limit the practicality of DSN: (1) a single DSN model has difficulty processing hyperspectral videos with varying numbers of channels; and (2) the spatial features extracted by the pre-trained RGB branch play a dominant role, while the hyperspectral features are not fully explored. To address these challenges, we propose a Channel AdapTive dual Siamese network, termed SiamCAT, for HOT with varied channels. Specifically, treating each frame of a hyperspectral video as a sequence of grayscale images that varies with wavelength, a channel adaptive module is introduced to encode grayscale image sequences of different lengths into a uniform length, so that SiamCAT can process hyperspectral videos with varied channels. Meanwhile, a guided learning attention module is proposed to progressively learn the spectral features of the tracked target, as highlighted by the spatial attention of the pre-trained RGB branch. Note that, to force the spectral features to play a leading role, the spectral features extracted by the hyperspectral branch are used to confirm the target position, instead of traditional feature fusion. In the experiments, SiamCAT was verified on the HOT competition dataset (i.e., 16-channel, 25-channel, and 15-channel hyperspectral videos with different wavelength ranges) and the WHU-Hi-H3 dataset (25-channel hyperspectral videos), and achieved advanced performance.
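To make the idea of mapping frames with different channel counts to a uniform length concrete, the sketch below resamples the spectral axis of a frame to a fixed number of bands. It is only an illustration under stated assumptions: the abstract does not describe the internals of the channel adaptive module, so plain linear interpolation along the band axis is used here as a stand-in for the learned encoding, and the name `resample_channels`, the nested-list frame layout, and the target length of 16 are all hypothetical.

```python
from typing import List

# Hypothetical frame layout: frame[channel][row][col], one grayscale image per band.
Frame = List[List[List[float]]]


def resample_channels(frame: Frame, target_len: int) -> Frame:
    """Linearly interpolate along the spectral axis to get target_len bands.

    Stand-in for a learned channel adaptive module (an assumption, not the
    method from the abstract): it only shows how band sequences of different
    lengths can be brought to one uniform length.
    """
    n = len(frame)
    if n == 0:
        raise ValueError("frame must contain at least one band")
    if target_len < 1:
        raise ValueError("target_len must be positive")
    if n == 1:
        # A single band can only be repeated.
        return [[row[:] for row in frame[0]] for _ in range(target_len)]

    out: Frame = []
    for k in range(target_len):
        # Position of output band k on the input band axis [0, n - 1].
        pos = k * (n - 1) / (target_len - 1) if target_len > 1 else (n - 1) / 2
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        w = pos - lo  # interpolation weight of the upper neighbour band
        band = [
            [(1 - w) * a + w * b for a, b in zip(row_lo, row_hi)]
            for row_lo, row_hi in zip(frame[lo], frame[hi])
        ]
        out.append(band)
    return out


if __name__ == "__main__":
    # 15-, 16-, and 25-band frames (1x1 pixels) all end up with 16 bands.
    for n_bands in (15, 16, 25):
        frame = [[[float(c)]] for c in range(n_bands)]
        print(n_bands, "->", len(resample_channels(frame, 16)))
```

A learned module would replace the fixed interpolation weights with trained parameters, but the interface is the same: any number of input bands in, one fixed number of bands out, so a single downstream network can serve videos from different sensors.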
Yongfeng Fang, Wu Yun, Bingyu Sun, Chaoyuan Cui
Zhuanfeng Li, Fengchao Xiong, Jianfeng Lu, Jun Zhou, Yuntao Qian
Yijun Tian, Huiqian Du, Zhifeng Ma
Wenxing Gao, Xiaolin Tian, Yifan Zhang, Nan Jia, Ting Yang, Licheng Jiao