JOURNAL ARTICLE

Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention

Peng Gao, Xin-Yue ZHANG, Xiaoli Yang, Jiancheng Ni, Fei Wang

Year: 2023
Journal: IEICE Transactions on Information and Systems
Vol: E107.D (1)
Pages: 161-164
Publisher: Institute of Electronics, Information and Communication Engineers

Abstract

Although Siamese trackers have attracted much attention in recent years due to their scalability and efficiency, researchers have ignored background appearance, which makes these trackers inapplicable to recognizing arbitrary target objects with diverse variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker in which shifted windows multi-head self-attention is used to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of our proposed tracker, we use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.
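As a rough illustration of the shifted-window idea that the Swin Transformer backbone relies on, the sketch below partitions a small grid of token indices into windows, first without and then with a cyclic shift. It is a minimal pure-Python sketch and not the authors' implementation: the helper names `cyclic_shift` and `window_partition`, the 4x4 grid, and the window size of 2 are all assumptions chosen for readability, and the attention computation and the masking of wrapped-around tokens are omitted.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2-D grid into non-overlapping window x window blocks.

    Each block is returned flattened in row-major order.
    """
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    windows = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + window)
                            for c in range(c0, c0 + window)])
    return windows


# Hypothetical 4x4 feature map whose entries are just token indices 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular windows: self-attention would be computed inside each block only.
regular = window_partition(grid, 2)

# Shifted windows: shift by half the window size, then partition again, so
# each new block groups tokens that sat in different regular windows.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

Here the first shifted window contains one token from each of the four regular windows, which is how alternating regular and shifted layers let information flow across window boundaries while keeping attention local.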

Keywords:
Computer science; BitTorrent tracker; Clutter; Artificial intelligence; Computer vision; Eye tracking; Transformer; Scalability; Feature (linguistics); Tracking (education); Pattern recognition (psychology); Radar

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.18
Refs: 17
Citation Normalized Percentile: 0.47

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Infrared Target Detection Methodologies
Physical Sciences →  Engineering →  Aerospace Engineering
Gaze Tracking and Assistive Technology
Physical Sciences →  Computer Science →  Human-Computer Interaction