Fan Wang, Jie Jin, Xiao Chen, Chunyuan Wang, N Neha
Abstract: Small object detection in aerial remote sensing images remains a challenging task due to low resolution, dense object distribution, and complex backgrounds. In this paper, we enhance the YOLOv10 architecture by introducing a lightweight framework that combines multi-scale feature extraction in the spatial domain with high-frequency enhancement in the frequency domain to improve the extraction of fine details. The approach further incorporates an entropy-guided mechanism to strengthen foreground discrimination, a statistically constrained loss function to suppress background interference, and a shared detection head to reduce parameter redundancy and maintain scale consistency. Experiments on the VisDrone dataset show that the proposed method achieves improvements of 3.1% in mAP@0.5 and 3.5% in mAP@0.5:0.95 over strong baselines, while keeping computational overhead low. Evaluation on the NWPU VHR-10 dataset confirms the model’s robustness and generalization across varied remote sensing scenarios. These results demonstrate the effectiveness of the proposed method for accurate and real-time small object detection in complex aerial imagery.
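The abstract reports results under two IoU-based metrics without defining them. The sketch below is not taken from the paper. It assumes the common COCO-style convention, in which the first metric counts a detection as correct at an IoU threshold of 0.5 and the second averages over thresholds from 0.5 to 0.95 in steps of 0.05. The boxes and the helper names (`iou`, `thresholds_passed`) are hypothetical, chosen only to show why a small localization error costs more on small objects.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which pred counts as a match for truth."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical boxes: the same 2-pixel horizontal error on two object sizes.
small_truth, small_pred = (0, 0, 10, 10), (2, 0, 12, 10)
large_truth, large_pred = (0, 0, 100, 100), (2, 0, 102, 100)

print(iou(small_pred, small_truth), len(thresholds_passed(small_pred, small_truth)))
print(iou(large_pred, large_truth), len(thresholds_passed(large_pred, large_truth)))
```

Under these assumptions the 10 x 10 pixel object reaches an IoU of about 0.67 and matches at only 4 of the 10 thresholds, while the 100 x 100 pixel object with the same 2-pixel shift reaches about 0.96 and matches at all 10. This is one reason the stricter averaged metric is sensitive to fine-detail localization on small objects.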
Jingxin Bi, Xiangyue Zheng, Keda Li, Haiyang Zhang, Gang Zhang, Tao Lei
Bin Yao, Chengkun Zhang, Qingxiang Meng, Xiaoliang Sun, Xuyang Hu, Lu Wang, Xilai Li
Guangxia Liu, Jianglei Di, Zhenbo Ren
Yin Zhang, Mu Ye, Guiyi Zhu, Yong Liu, Pengyu Guo, Junhua Yan