Yuechao Bian, Hao Teng, Qiurong Lv, Jie Zhou, Wei Zhou, Tao Lin, Rui Zhou
Abstract To tackle the difficulties of detecting small targets, uneven target scale distribution, and occlusion against complex backgrounds in drone-captured images, we propose an adaptive multi-scale, occlusion-aware object detection algorithm for aerial scenes. First, we design a feature fusion module, ATPC-C2f, which combines partial convolution (PConv) with the triplet attention (TA) mechanism; by fusing information across the channel and spatial dimensions, it improves the spatial perception of deep networks. Next, we introduce the Adaptively Scaled Dual Feature Network (ASDFN), which reduces the loss of small-target information as features propagate through the network and thereby balances the importance of features at different scales. Additionally, we incorporate a Haar wavelet downsampling (HWD) module, which uses lossless feature encoding and feature-learning blocks to preserve small-target features while filtering out redundant data. Finally, we propose the UBHead detection head, which exploits contextual information through multi-scale spatial propagation, strengthening the model's global perception of feature maps and improving detection in partially occluded scenes. Experimental results on the VisDrone2019-DET drone image detection benchmark show that the proposed YOLO-AAHU algorithm outperforms YOLOv8n, improving mAP50 by 1.2% and mAP50:95 by 2.1%, at the cost of only 7 × 10⁵ additional parameters and 1.1 GFLOPs of additional computation. This favorable balance between detection performance and resource consumption makes YOLO-AAHU a reliable solution for drone-based visual detection in complex scenes.
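The "lossless feature encoding" behind the HWD module refers to the single-level 2D Haar wavelet transform: a feature map is split into four half-resolution subbands (one low-frequency average and three detail bands) that together carry exactly the same information as the input, unlike strided convolution or pooling, which discards detail. The following is a minimal pure-Python sketch of this idea for one 2D channel; it is an illustrative reconstruction, not the paper's implementation, and the function name `haar_downsample` is our own.

```python
def haar_downsample(img):
    # Lossless 2x downsampling via a single-level 2D Haar transform.
    # img: H x W grid (list of lists) with even H and W.
    # Returns four H/2 x W/2 subbands (LL, LH, HL, HH). Together they
    # encode the input exactly, so fine detail from small targets is
    # preserved rather than discarded during downsampling.
    H, W = len(img), len(img[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, H, 2):
        rll, rlh, rhl, rhh = [], [], [], []
        for j in range(0, W, 2):
            a, b = img[i][j], img[i][j + 1]          # top of 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom of 2x2 block
            rll.append((a + b + c + d) / 2)  # low-frequency average
            rlh.append((a - b + c - d) / 2)  # horizontal detail
            rhl.append((a + b - c - d) / 2)  # vertical detail
            rhh.append((a - b - c + d) / 2)  # diagonal detail
        ll.append(rll); lh.append(rlh); hl.append(rhl); hh.append(rhh)
    return ll, lh, hl, hh
```

In the HWD design, the four subbands are stacked along the channel axis (4C channels at half resolution) and a subsequent convolutional feature-learning block filters the redundant components; because the transform is invertible, no spatial information is lost at the downsampling step itself.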