Pansharpening fuses low-resolution multispectral (LRMS) images with high-resolution panchromatic (PAN) images to generate high-resolution multispectral (HRMS) images that preserve both spatial and spectral information. Most deep pansharpening methods struggle with cross-modal feature extraction and fusion, and with exploiting the similarities between the fused image and the PAN and LRMS inputs. In this paper, we propose a spatial-spectral similarity-guided fusion network (S3FNet) for pansharpening. The architecture is composed of three parts. First, a shallow feature extraction layer learns initial spatial, spectral, and fused features from the PAN and LRMS images. Then, a multi-branch asymmetric encoder, consisting of spatial, spectral, and fusion branches, generates the corresponding high-level features at different scales. Finally, a multi-scale reconstruction decoder, equipped with a cross-feature multi-head attention fusion block, processes the intermediate feature maps to generate HRMS images. To ensure that the HRMS images retain as much spatial and spectral information as possible, a similarity-constrained loss is defined for network training. Extensive experiments demonstrate the effectiveness of S3FNet compared with state-of-the-art methods. The code is released at https://github.com/ZhangYongshan/S3FNet.
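The abstract does not state the form of the similarity-constrained loss, so the following is only an illustrative sketch of the general idea, not the S3FNet loss itself. It assumes a reconstruction term plus two penalties: a spectral term (cosine similarity between the downsampled fused image and the LRMS input) and a spatial term (correlation between the band-averaged fused image and the PAN input). The function names, the block-averaging degradation model, and the weights `lam_spec` and `lam_spat` are all hypothetical choices made for this example.

```python
import numpy as np

def average_pool(img, factor):
    """Downsample an (H, W, C) image by block averaging (assumed degradation model)."""
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def spectral_similarity(fused, lrms, factor, eps=1e-8):
    """Mean per-pixel cosine similarity between the downsampled fused image and LRMS."""
    low = average_pool(fused, factor)
    dot = (low * lrms).sum(axis=-1)
    norm = np.linalg.norm(low, axis=-1) * np.linalg.norm(lrms, axis=-1)
    return float((dot / (norm + eps)).mean())

def spatial_similarity(fused, pan, eps=1e-8):
    """Pearson correlation between the band-averaged fused image and PAN."""
    intensity = fused.mean(axis=-1)
    a = intensity - intensity.mean()
    b = pan - pan.mean()
    return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))

def similarity_constrained_loss(fused, target, pan, lrms,
                                factor=4, lam_spec=0.1, lam_spat=0.1):
    """L1 reconstruction error plus hypothetical spectral and spatial similarity penalties."""
    recon = np.abs(fused - target).mean()
    spec = 1.0 - spectral_similarity(fused, lrms, factor)
    spat = 1.0 - spatial_similarity(fused, pan)
    return float(recon + lam_spec * spec + lam_spat * spat)

# Synthetic data: a random 4-band "ground truth" and inputs simulated from it.
rng = np.random.default_rng(0)
hrms = rng.uniform(0.1, 1.0, size=(16, 16, 4))
pan = hrms.mean(axis=-1)          # simulated PAN: band average of the ground truth
lrms = average_pool(hrms, 4)      # simulated LRMS: 4x block-averaged ground truth

noisy = hrms + rng.normal(0.0, 0.05, size=hrms.shape)
loss_perfect = similarity_constrained_loss(hrms, hrms, pan, lrms)
loss_noisy = similarity_constrained_loss(noisy, hrms, pan, lrms)
print(loss_perfect, loss_noisy)
```

Under these assumptions, a fused image identical to the ground truth yields a loss close to zero, while a perturbed image is penalized by all three terms. In a training setting the same terms would be written in an automatic-differentiation framework rather than NumPy.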
Jiamei Xiong, Yongshan Zhang, Xinxin Wang, Lefei Zhang
Yong Yang, M. J. Li, Shuying Huang, Hangyuan Lu, Wei Tu, Weiguo Wan
Xu Shen, Shengwei Zhong, Hui Li, Chen Gong
Shuyin Zhang, Laituan Qiao, Fan Zhang, Chao Xu, Shuqi Zhao, Quanwei Gao
Kai Zhang, Anfei Wang, Feng Zhang, Wenxiu Diao, Jiande Sun, Lorenzo Bruzzone