JOURNAL ARTICLE

YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection

Xiaofeng ZhaoYuting XiaWenwen ZhangChao ZhengZhili Zhang

Year: 2023 Journal:   Remote Sensing Vol: 15 (15)Pages: 3778-3778   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

The detection of infrared vehicle targets by UAVs poses significant challenges in the presence of complex ground backgrounds, high target density, and a large proportion of small targets, which result in high false alarm rates. To alleviate these deficiencies, a novel YOLOv7-based, multi-scale target detection method for infrared vehicle targets is proposed, which is termed YOLO-ViT. Firstly, within the YOLOV7-based framework, the lightweight MobileViT network is incorporated as the feature extraction backbone network to fully extract the local and global features of the object and reduce the complexity of the model. Secondly, an innovative C3-PANet neural network structure is delicately designed, which adopts the CARAFE upsampling method to utilize the semantic information in the feature map and improve the model’s recognition accuracy of the target region. In conjunction with the C3 structure, the receptive field will be increased to enhance the network’s accuracy in recognizing small targets and model generalization ability. Finally, the K-means++ clustering method is utilized to optimize the anchor box size, leading to the design of anchor boxes better suited for detecting small infrared targets from UAVs, thereby improving detection efficiency. The present article showcases experimental findings attained through the use of the HIT-UAV public dataset. The results demonstrate that the enhanced YOLO-ViT approach, in comparison to the original method, achieves a reduction in the number of parameters by 49.9% and floating-point operations by 67.9%. Furthermore, the mean average precision (mAP) exhibits an improvement of 0.9% over the existing algorithm, reaching a value of 94.5%, which validates the effectiveness of the method for UAV infrared vehicle target detection.

Keywords:
Computer science Upsampling Artificial intelligence Point cloud Backbone network Feature (linguistics) Object detection Computer vision Cluster analysis Pattern recognition (psychology) Image (mathematics)

Metrics

60
Cited By
31.20
FWCI (Field Weighted Citation Impact)
42
Refs
1.00
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Infrared Target Detection Methodologies
Physical Sciences →  Engineering →  Aerospace Engineering
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.