Existing algorithms for unmanned aerial vehicle (UAV) image object detection often suffer from low detection accuracy on small objects and missed detections of multi-scale objects. To address these issues, this paper proposes a UAV image object detection algorithm that integrates a channel attention mechanism with parallel-structured dilated convolution feature fusion. To strengthen the algorithm's feature representation in terms of both channel attention and receptive field, the ResNet50 backbone is redesigned by incorporating the Squeeze-and-Excitation Network (SENet) and a Parallel-Structured Dilated Convolution Feature Fusion Network (PSDCFFN). In addition, Region of Interest (ROI) Align is employed, and the Region Proposal Network (RPN) anchor sizes are optimized using K-Means clustering to reduce coordinate deviations during object regression. Experimental results demonstrate that the proposed algorithm significantly improves object detection accuracy in UAV images: on the RSOD-Dataset and a custom UAV image dataset, the mean Average Precision (mAP) reaches 92.52% and 98.07%, respectively.
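The K-Means anchor optimization mentioned above can be illustrated with a minimal sketch. The abstract does not state which distance metric or how many clusters are used, so the code below assumes the commonly used 1 - IoU distance between (width, height) pairs; the function names `iou_wh` and `kmeans_anchors`, the sample boxes, and the choice of `k` are all hypothetical and only serve to illustrate the idea.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchor shapes.

    Assumption: distance is 1 - IoU, so each box joins the centroid
    with which it has the highest IoU.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # initialize from k distinct boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
                continue
            mean_w = sum(m[0] for m in members) / len(members)
            mean_h = sum(m[1] for m in members) / len(members)
            new_centroids.append((mean_w, mean_h))
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    # Return anchors sorted from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Hypothetical ground-truth box sizes in pixels (width, height).
    sample_boxes = [(10, 10), (12, 12), (11, 11),
                    (100, 100), (110, 110), (105, 105)]
    print(kmeans_anchors(sample_boxes, k=2))
```

In a detector, the resulting (width, height) pairs would replace the default RPN anchor sizes, so that the initial anchors already resemble the object shapes found in the training annotations.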
Yuanzhu Liu, Zhiming Ding, Yang Cao, Mengmeng Chang
Juan Yan, Zhijun Fang, Yongbin Gao
Tao Wang, Zenghui Ding, Xianjun Yang, Yanyan Chen, Yu Liu, Xiaoming Kong, Yining Sun
Hai-Sheng Li, Rongrong Yuan, Qian Li, Cong Hu