Syed Fahim Shah, Xiaoliang Meng, Muhammad Farooq Hussain, Xinkan Hu
Aerial salient object detection (SOD) faces distinctive challenges: extreme scale variation, cluttered backgrounds, and arbitrary object orientations, which give rise to the small-target dilemma and background ambiguity. To address these issues, we propose HAWKSight, a hierarchical attention-driven weighted kernel network built on three components. First, a multiscale representation scheme combines a ConvNeXt-Tiny backbone with hierarchical feature compression and an enhanced pyramid pooling module; this module extends the standard PSPNet design with full-resolution residual connections, preserving detail from tiny objects to large structures while maintaining boundary integrity. Second, a dual-attention Swin transformer fusion stage introduces a structured guidance-refinement paradigm, combining shifted-window and regular-window attention in a novel chained configuration to capture global context efficiently while enabling cross-scale feature enrichment. Third, a boundary-sensitive U-Net decoder uses squeeze-and-excitation blocks and a Gaussian-weighted contour loss for precise edge delineation. Evaluated on the ORS-4199, EORSSD, and ORSSD benchmarks, HAWKSight achieves state-of-the-art performance, with S-measure scores of 0.8885, 0.9203, and 0.9345, respectively, at real-time inference speeds of 105.87, 92.70, and 98.05 FPS. The model also demonstrates strong boundary precision (E-measure: 0.9451, 0.9611, and 0.9751, respectively) and significantly reduces mean absolute error compared with existing methods, effectively resolving the speed-accuracy tradeoff in aerial SOD applications.
Chenxing Xia, Yanguang Sun, Kuan‐Ching Li, Bin Ge, Hanling Zhang, Bo Jiang, Ji Zhang
Sanping Zhou, Jinjun Wang, Jimuyang Zhang, Le Wang, Dong Huang, Shaoyi Du, Nanning Zheng
Feng Zhou, Hui Shuai, Qingshan Liu, Guodong Guo
Yu Wang, Wenjie Li, Haowei Wen, Zhiyang Yu, Yi Xue, Hua Li