Changyong Guo, Zhaoxin Zhang, Jinjiang Li, Xuesong Jiang, Jun Zhang, Lei Zhang
In this article, we aim to improve the performance of visual tracking by combining features from multiple modalities. The core idea is to use covariance matrices as feature descriptors and then use sparse coding to encode the different features. The notion of sparsity has been successfully used in visual tracking. In this context, sparsity is typically used along with appearance models obtained from intensity/color information. In this work, we step outside this trend and propose to model the target appearance by local covariance descriptors (CovDs) in a pyramid structure. The proposed pyramid structure not only enables us to encode local and spatial information of the target appearance but also inherits useful properties of CovDs, such as invariance to affine transforms. Since CovDs lie on a Riemannian manifold, we further propose to perform tracking through sparse coding by embedding the Riemannian manifold into an infinite-dimensional Hilbert space. Embedding the manifold into a Hilbert space allows us to perform sparse coding efficiently using the kernel trick. Our empirical study shows that the proposed tracking framework outperforms existing state-of-the-art methods in challenging scenarios.
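The two building blocks named in the abstract, a covariance descriptor for an image region and a kernel that embeds such descriptors into a Hilbert space, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `eps` regularizer, the `sigma` bandwidth, and the choice of a Gaussian kernel on the log-Euclidean distance are all assumptions made for illustration, and the abstract does not say which kernel the paper uses.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """Covariance descriptor of a region (hypothetical helper).

    features: (n_pixels, d) array with d >= 2, one feature vector per
    pixel (e.g. intensity, gradients, coordinates).
    Returns a (d, d) symmetric positive definite matrix.
    """
    features = np.asarray(features, dtype=float)
    cov = np.cov(features, rowvar=False)  # columns are the variables
    # A small ridge keeps the matrix strictly positive definite (assumption).
    return cov + eps * np.eye(cov.shape[0])

def spd_log(mat):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T  # V diag(log w) V^T

def log_euclidean_kernel(a, b, sigma=1.0):
    """Gaussian kernel on the log-Euclidean distance between SPD matrices.

    This is one known positive definite kernel on the SPD manifold; it is
    assumed here, not taken from the paper.
    """
    diff = spd_log(a) - spd_log(b)
    return np.exp(-np.linalg.norm(diff, "fro") ** 2 / (2.0 * sigma ** 2))
```

With a kernel of this kind, kernel sparse coding needs only the kernel values between the candidate descriptor and the dictionary atoms, which is why the infinite-dimensional embedding never has to be computed explicitly.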
Bo Ma, Hongwei Hu, Shiqi Liu, Jianglong Chen
Risheng Liu, Jing Wang, Xiaoke Shang, Yiyang Wang, Zhixun Su, Yu Cai
Hongtu Huang, BI Du-yan, Yufei Zha, Shiping Ma, Shan Gao, Chang Liu
Yun Liang, Dong Wang, Yijin Chen, Lei Xiao, Caixing Liu