JOURNAL ARTICLE

Lightweight Spatial-Temporal Contextual Aggregation Siamese Network for Unmanned Aerial Vehicle Tracking

Qiqi Chen, Jinghong Liu, Faxue Liu, Fang Xu, Chenglong Liu

Year: 2024   Journal: Drones   Vol: 8 (1)   Pages: 24-24   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Benefiting from the powerful feature extraction capability of deep learning, Siamese trackers stand out for their advanced tracking performance. However, aerial tracking is constrained by challenges such as low resolution, occlusion, similar objects, small objects, scale variation, aspect ratio change, and deformation, as well as by limited computational resources, so efficient and accurate aerial tracking remains difficult to realize. In this work, we design a lightweight and efficient adaptive temporal contextual aggregation Siamese network for aerial tracking, built around a parallel atrous module (PAM) and an adaptive temporal context aggregation model (ATCAM) to mitigate these problems. First, by applying a series of atrous convolutions with different dilation rates in parallel, the PAM simultaneously extracts and aggregates multi-scale features with spatial contextual information on the same feature map, which improves the ability to cope with changes in target appearance caused by aspect ratio change, occlusion, scale variation, and similar challenges. Second, the ATCAM adaptively introduces temporal contextual information into the target frame through an encoder-decoder structure, which helps the tracker resist interference and recognize the target when high-resolution features are difficult to extract, for example under low resolution or in the presence of similar objects. Finally, experiments on the UAV20L, UAV123@10fps and DTB70 benchmarks demonstrate the impressive performance of the proposed network, which runs at over 75.5 fps on an NVIDIA 3060Ti.
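The idea behind running atrous convolutions with different dilation rates in parallel can be illustrated with a minimal one-dimensional sketch. This is not the authors' PAM: the real module operates on 2D feature maps with learned weights, and the abstract does not say how the branches are fused. The function names, the dilation rates (1, 2, 3), the fixed box kernel, and fusion by averaging are all assumptions made for illustration.

```python
def atrous_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D convolution whose kernel taps
    are spaced `dilation` samples apart (an atrous convolution)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def parallel_atrous(signal, kernel, dilations=(1, 2, 3)):
    """Apply the same kernel at several dilation rates in parallel and
    average the branches (averaging is an assumed fusion rule)."""
    branches = [atrous_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


# A unit impulse shows how far each branch "sees" around one position.
impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
box = [1.0, 1.0, 1.0]  # hypothetical fixed kernel; a real module would learn it
response = parallel_atrous(impulse, box)
```

With a 3-tap kernel, the branch with dilation 3 reaches three samples to each side of the impulse, while the branch with dilation 1 reaches only one, so the fused response mixes narrow and wide spatial context at the same output resolution without adding kernel weights.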

Keywords:
Computer science; Artificial intelligence; Computer vision; Frame rate; Context (archaeology); Tracking (education); Encoder; Feature extraction; Pattern recognition (psychology); Feature (linguistics); Deep learning; Scale (ratio)

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.59
Refs: 53
Citation Normalized Percentile: 0.72

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Infrared Target Detection Methodologies
Physical Sciences →  Engineering →  Aerospace Engineering
Fire Detection and Safety Systems
Physical Sciences →  Engineering →  Safety, Risk, Reliability and Quality

Related Documents

JOURNAL ARTICLE

RGB-T Tracking Algorithm for Unmanned Aerial Vehicles Based on Lightweight Siamese Network

哲宇 刘

Journal: Modeling and Simulation   Year: 2025   Vol: 14 (06)   Pages: 99-109

JOURNAL ARTICLE

An Unmanned Aerial Vehicle Video Object Tracking Algorithm Based on Siamese Attention Network

Yuhuan Zheng, Dianwei Wang, Pengfei Han, Xincheng Ren, Zhijie Xu

Journal: 2021 4th International Conference on Artificial Intelligence and Pattern Recognition   Year: 2021   Pages: 1-8