RGB-T tracking aims to estimate the position and scale of a specific target under the guidance of both visible and thermal images. With the growing availability of multi-modal sensors, this topic has received increasing attention and has great potential in autonomous driving, smart monitoring, etc. Recent methods mainly aggregate multi-modal information at the feature level, through feature-selection or feature-fusion modules. In this paper, we design an adaptive multi-modal decision fusion strategy for Visible and Thermal (RGB-T) tracking. First, we adopt a correlation filter tracker as our baseline. Then, we estimate the tracking confidence of each modality via the Peak-to-Sidelobe Ratio (PSR) and use these confidences to generate fusion weights. Finally, the response maps of the two modalities are linearly combined with the fusion weights. Experiments on GTOT validate that the proposed fusion strategy outperforms the competitors, achieving 25.3% MSR and 43.9% MPR.
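The decision-level fusion described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes each tracker exposes a 2D correlation response map, computes a PSR per modality (peak minus sidelobe mean, divided by sidelobe standard deviation, with a small window around the peak excluded), normalizes the two PSRs into weights, and linearly combines the maps. The function names and the sidelobe-window size are illustrative choices.

```python
import numpy as np

def psr(response, exclude=5):
    """Peak-to-Sidelobe Ratio of a correlation response map.

    The sidelobe is the response with a (2*exclude+1)^2 window
    around the peak masked out.
    """
    peak_idx = np.unravel_index(np.argmax(response), response.shape)
    peak = response[peak_idx]
    mask = np.ones_like(response, dtype=bool)
    y, x = peak_idx
    mask[max(0, y - exclude):y + exclude + 1,
         max(0, x - exclude):x + exclude + 1] = False
    sidelobe = response[mask]
    return (peak - sidelobe.mean()) / (sidelobe.std() + 1e-8)

def fuse_responses(resp_rgb, resp_t):
    """Fuse RGB and thermal response maps with PSR-derived weights."""
    s_rgb, s_t = psr(resp_rgb), psr(resp_t)
    w_rgb = s_rgb / (s_rgb + s_t)   # weights sum to 1
    return w_rgb * resp_rgb + (1.0 - w_rgb) * resp_t
```

Because the weights track per-frame confidence, the fused map leans on whichever modality currently yields the sharper, less ambiguous peak, e.g. thermal at night, visible in thermal-crossover scenes.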
Jing Jin, Jian-Qin Liu, Fengwen Zhai
Shenglan Li, Rui Yao, Yong Zhou, Hancheng Zhu, Bing Liu, Jiaqi Zhao, Zhiwen Shao
He Wang, Tianyang Xu, Zhangyong Tang, Xiao-Jun Wu, Josef Kittler
Xiao Guo, Hangfei Li, Yufei Zha, Peng Zhang