JOURNAL ARTICLE

An efficient object tracking based on multi‐head cross‐attention transformer

Abstract

Object tracking is an essential component of computer vision and plays a significant role in various practical applications. Recently, transformer‐based trackers have become the predominant tracking method owing to their robustness and efficiency. However, existing transformer‐based trackers typically focus solely on the template features, neglecting the interaction between the search features and the template features during tracking. To address this issue, this article introduces a multi‐head cross‐attention transformer for visual tracking (MCTT), which effectively enhances the interaction between the template branch and the search branch, enabling the tracker to prioritize discriminative features. Additionally, an auxiliary segmentation mask head is designed to produce a pixel‐level feature representation, improving tracking accuracy by predicting a set of binary masks. Comprehensive experiments on benchmark datasets such as LaSOT, GOT‐10k, UAV123 and TrackingNet, against various advanced methods, demonstrate that our approach achieves promising tracking performance; MCTT achieves an AO score of 72.8 on GOT‐10k.
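The core mechanism the abstract describes, search-region features attending to template features via multi-head cross-attention, can be sketched as below. This is an illustrative NumPy sketch, not the paper's implementation: the random projection weights stand in for learned parameters, and all shapes and names (`search`, `template`, `num_heads`) are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(search, template, num_heads=4, seed=0):
    """Search tokens (queries) attend to template tokens (keys/values).

    search:   (n_s, d) search-region features
    template: (n_t, d) template features
    Returns:  (n_s, d) search features enriched with template context.
    """
    n_s, d = search.shape
    n_t, _ = template.shape
    assert d % num_heads == 0
    d_h = d // num_heads
    rng = np.random.default_rng(seed)
    # Illustrative random projections; a real tracker learns these weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    # Split the embedding into heads: (tokens, d) -> (heads, tokens, d_h).
    q = (search @ Wq).reshape(n_s, num_heads, d_h).transpose(1, 0, 2)
    k = (template @ Wk).reshape(n_t, num_heads, d_h).transpose(1, 0, 2)
    v = (template @ Wv).reshape(n_t, num_heads, d_h).transpose(1, 0, 2)

    # Scaled dot-product attention per head: (heads, n_s, n_t).
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_h))
    # Weighted sum of template values, then concatenate heads and project out.
    out = (attn @ v).transpose(1, 0, 2).reshape(n_s, d)
    return out @ Wo

search = np.random.default_rng(1).standard_normal((16, 32))   # 16 search tokens
template = np.random.default_rng(2).standard_normal((4, 32))  # 4 template tokens
fused = multi_head_cross_attention(search, template)
print(fused.shape)  # (16, 32)
```

Because the queries come from the search branch while keys and values come from the template branch, each search location is re-weighted by its similarity to the template, which is how the tracker can emphasize discriminative features of the target.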

Keywords:
Computer science, Transformer, Artificial intelligence, Computer vision, Video tracking, Object tracking

Metrics

Cited by: 1
FWCI (Field Weighted Citation Impact): 0.53
References: 36
Citation Normalized Percentile: 0.53

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Fire Detection and Safety Systems
Physical Sciences →  Engineering →  Safety, Risk, Reliability and Quality

Related Documents

JOURNAL ARTICLE

Improving the Execution Speed of Transformer-based Object Tracking Models through Multi-head Attention Parallelization

Inmo Kim, Myung‐Sun Kim

Journal: Journal of the Institute of Electronics and Information Engineers, Year: 2023, Vol: 60 (4), Pages: 39-47
JOURNAL ARTICLE

Multi-Head-Self-Attention based YOLOv5X-transformer for multi-scale object detection

Ponduri Vasanthi, Laavanya Mohan

Journal: Multimedia Tools and Applications, Year: 2023, Vol: 83 (12), Pages: 36491-36517
JOURNAL ARTICLE

SMSTracker: A Self-Calibration Multi-Head Self-Attention Transformer for Visual Object Tracking

Zhongyang Wang, Hu Zhu, Feng Liu

Journal: Computers, Materials & Continua, Year: 2024, Vol: 80 (1), Pages: 605-623