In this paper, we propose a novel deep correlation filters (DCF) model, an end-to-end deep convolutional network that simultaneously learns convolutional features, multiple diverse correlation filters, and scale estimation for robust visual tracking. Compared with existing tracking methods based on correlation filters and deep learning, the proposed DCF enjoys several merits. First, it incorporates a spatial transformer into the network to discover the image patches that are most relevant to the target object, and can therefore effectively handle scale variation. Second, the spatial transformer and the correlation filters complement and enhance each other for object translation estimation, yielding robust tracking performance. Third, to the best of our knowledge, the DCF is the first correlation filter based deep tracker to perform object translation and scale estimation jointly. Extensive results on four challenging benchmark datasets demonstrate that the proposed tracking algorithm performs favorably against state-of-the-art trackers.
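To make the correlation-filter component concrete, the following is a minimal single-channel sketch of the standard Fourier-domain formulation on which such trackers build: a filter is learned by ridge regression against a Gaussian label, and the peak of the correlation response on a new patch gives the translation. This is a generic illustration (function names, the regularization constant `lam`, and the Gaussian width `sigma` are our own choices), not the paper's learned multi-filter network.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    # Desired response: a Gaussian peak, circularly shifted so the
    # maximum sits at index (0, 0), i.e. zero translation.
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2 * sigma ** 2))
    return np.roll(np.roll(g, -(h // 2), axis=0), -(w // 2), axis=1)

def train_filter(feat, label, lam=1e-2):
    # Closed-form ridge regression in the Fourier domain:
    # H = (G * conj(F)) / (F * conj(F) + lam)
    F = np.fft.fft2(feat)
    G = np.fft.fft2(label)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(filt, feat):
    # Apply the filter to new features; the response peak location
    # is the estimated translation of the target.
    resp = np.real(np.fft.ifft2(filt * np.fft.fft2(feat)))
    return np.unravel_index(np.argmax(resp), resp.shape)
```

Running `detect` on the same patch the filter was trained on returns a peak at `(0, 0)` (no translation); a circularly shifted patch moves the peak by the same shift. The paper's contribution replaces the fixed single-channel features here with features learned end-to-end, couples multiple diverse filters with a spatial transformer, and adds joint scale estimation.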