Xiang Pan, Zhihao Shi, Herong Zheng, Li Qiuyu
Abstract

Medical image segmentation faces significant challenges in cross-domain scenarios due to variations in imaging protocols and device-specific artifacts. Existing methods leverage either spatial-domain features or global frequency transforms (e.g., DCT, FFT), but they often fail to integrate multi-scale structural cues with localized frequency signatures, leading to degraded performance under domain shift. To address this limitation, we propose WGSF-Net, a novel framework that unifies spatial and wavelet-frequency representations through wavelet-guided fusion. Our approach introduces two key innovations: (1) a wavelet-guided multi-scale attention mechanism that decomposes features into directional subbands to capture domain-invariant structural patterns, and (2) an adaptive lateral fusion strategy that dynamically aligns frequency-refined decoder features with spatially enhanced skip connections. By leveraging the inherent localization and directional sensitivity of wavelet transforms, our method better preserves anatomical boundaries across domains. Comprehensive evaluations on dermoscopy, ultrasound, and microscopy datasets demonstrate state-of-the-art performance on both seen and unseen domains: in unseen settings, WGSF-Net improves the Dice score over previous methods by up to 1.5% on dermoscopy, 2.0% on ultrasound, and 13.9% on microscopy. These results validate that wavelet-guided spatial-frequency fusion effectively enhances generalization in 2D medical image segmentation.
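To make the idea of directional wavelet subbands concrete, the sketch below shows a one-level 2D Haar decomposition of a feature map into LL/LH/HL/HH subbands, followed by a toy energy-based gating that fuses them. This is a minimal NumPy illustration of the general technique; the function names (`haar_dwt2`, `subband_attention`) and the energy-softmax gate are assumptions for exposition, not the paper's actual mechanism.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar wavelet transform of a (H, W) map.

    Returns four half-resolution subbands: LL (approximation) and the
    directional detail subbands LH/HL/HH that carry horizontal,
    vertical, and diagonal structure.
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0   # horizontal detail
    hl = (a - b + c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def subband_attention(x):
    """Toy wavelet-guided attention (hypothetical): weight each
    subband by its normalized energy and fuse into one map."""
    subbands = haar_dwt2(x)
    energies = np.array([np.mean(s ** 2) for s in subbands])
    weights = energies / (energies.sum() + 1e-8)  # gate sums to 1
    fused = sum(w * s for w, s in zip(weights, subbands))
    return fused, weights

x = np.arange(16, dtype=float).reshape(4, 4)
fused, weights = subband_attention(x)
print(fused.shape)              # (2, 2): half resolution
print(round(weights.sum(), 6))  # gate weights sum to 1.0
```

In a real network the same decomposition would be applied per channel inside an attention block, with learned (rather than energy-based) gates on the directional subbands.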