A Spatial–Frequency Combined Transformer for Cloud Removal of Optical Remote Sensing Images

Feng Zhao; Chao Ding; Xin Li; Runliang Xia; Caifeng Wu; Xin Lyu

doi:10.3390/rs17091499

JOURNAL ARTICLE

Year: 2025
Journal: Remote Sensing
Vol: 17 (9)
Pages: 1499
Publisher: Multidisciplinary Digital Publishing Institute


Abstract

Cloud removal is a vital preprocessing step for optical remote sensing images (RSIs), directly enhancing image quality and providing a high-quality data foundation for downstream tasks such as water body extraction and land cover classification. Existing methods attempt to combine spatial and frequency features for cloud removal, but they rely on shallow feature concatenation or simplistic addition operations, which fail to establish effective cross-domain synergistic mechanisms. These approaches lead to edge blurring and noticeable color distortions. To address this issue, we propose a spatial–frequency collaborative enhancement Transformer network named SFCRFormer, which significantly improves cloud removal performance. The core of SFCRFormer is the spatial–frequency combined Transformer (SFCT) block, which implements cross-domain feature reinforcement through a dual-branch spatial attention (DBSA) module and a frequency self-attention (FreSA) module to effectively capture global context information. The DBSA module enhances the representation of spatial features by decoupling spatial-channel dependencies via parallelized feature refinement paths, surpassing traditional single-branch attention mechanisms in maintaining the overall structure of the image. FreSA leverages the fast Fourier transform to convert features into the frequency domain, using frequency differences between object and cloud regions to achieve precise cloud detection and fine-grained removal. To further enhance the features extracted by DBSA and FreSA, we design the dual-domain feed-forward network (DDFFN), which improves the detail fidelity of the restored image by using multi-scale convolution for local refinement and frequency transformation for global structural optimization.
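The frequency-domain idea mentioned above can be illustrated with a small NumPy sketch. This is not the FreSA module, whose internals the abstract does not specify. It only shows, on synthetic data, that a smooth cloud-like layer and a finely textured surface separate clearly after a 2-D FFT. The function name `low_freq_energy_ratio`, the `radius` parameter, and both test images are assumptions made for illustration.

```python
import numpy as np

def low_freq_energy_ratio(img, radius=4):
    """Fraction of spectral energy (mean removed) within `radius` bins of zero frequency."""
    # Remove the mean so the DC term does not dominate, then centre the spectrum.
    spec = np.fft.fftshift(np.fft.fft2(img - img.mean()))
    power = np.abs(spec) ** 2
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    # After fftshift the zero-frequency bin sits at (h // 2, w // 2).
    dist = np.hypot(yy - h // 2, xx - w // 2)
    total = power.sum()
    return float(power[dist <= radius].sum() / total) if total > 0 else 0.0

rng = np.random.default_rng(0)
h = w = 64
yy, xx = np.mgrid[:h, :w]

# Hypothetical stand-ins: a smooth Gaussian blob for a cloud layer,
# white noise for fine ground texture.
cloud = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 12.0 ** 2))
ground = rng.standard_normal((h, w))

cloud_ratio = low_freq_energy_ratio(cloud)
ground_ratio = low_freq_energy_ratio(ground)
print(f"cloud-like: {cloud_ratio:.3f}, textured: {ground_ratio:.3f}")
```

In this toy setting nearly all of the smooth layer's energy falls in the lowest frequencies, while the textured image spreads its energy across the spectrum. A learned frequency attention could exploit this kind of separation.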
A composite loss function, incorporating Charbonnier loss and Structural Similarity Index (SSIM) loss, is employed to optimize model training and balance pixel-level accuracy with structural fidelity. Experimental evaluations on public datasets demonstrate that SFCRFormer outperforms state-of-the-art methods across various quantitative metrics, including PSNR and SSIM, while delivering superior visual results.
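A composite loss of the kind described can be sketched as follows. The abstract does not give the weighting between the two terms, the Charbonnier epsilon, or the SSIM window, so `ssim_weight=0.5` and `eps=1e-3` are assumed values. The SSIM here is a simplified single-window (global statistics) variant rather than the usual sliding-window form.

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Smooth L1-like penalty: mean of sqrt(diff^2 + eps^2)."""
    return float(np.mean(np.sqrt((pred - target) ** 2 + eps ** 2)))

def global_ssim(pred, target, data_range=1.0):
    """SSIM computed from whole-image statistics (a simplification, no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = pred.mean(), target.mean()
    var_x, var_y = pred.var(), target.var()
    cov = ((pred - mu_x) * (target - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

def composite_loss(pred, target, ssim_weight=0.5):
    """Charbonnier term plus weighted (1 - SSIM); the weight is an assumed value."""
    return charbonnier_loss(pred, target) + ssim_weight * (1.0 - global_ssim(pred, target))

rng = np.random.default_rng(0)
clean = rng.random((32, 32))  # hypothetical cloud-free reference in [0, 1]
noisy = np.clip(clean + 0.2 * rng.standard_normal((32, 32)), 0.0, 1.0)

loss_same = composite_loss(clean, clean)
loss_noisy = composite_loss(noisy, clean)
print(f"identical: {loss_same:.4f}, degraded: {loss_noisy:.4f}")
```

For a perfect reconstruction the SSIM term vanishes and the Charbonnier term reduces to `eps`, so the loss bottoms out near 0.001. Any degradation raises both terms.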

Keywords: Remote sensing; Cloud computing; Transformer; Computer science; Geology; Electrical engineering; Engineering

Metrics

FWCI (Field Weighted Citation Impact): 3.52

Citation Normalized Percentile: 0.83

Topics

Advanced Image Fusion Techniques

Physical Sciences → Engineering → Media Technology

Remote Sensing in Agriculture

Physical Sciences → Environmental Science → Ecology

Remote-Sensing Image Classification

Physical Sciences → Engineering → Media Technology

Related Documents

Cloud removal from optical remote sensing images

Cloud Meets Diffusion: Progressive Cloud Removal for Optical Remote Sensing Images

Automated Cloud Removal and Filling in Optical Remote Sensing Images

Thin Cloud Removal Generative Adversarial Network Based on Sparse Transformer in Remote Sensing Images

Multi-Stage Frequency Attention Network for Progressive Optical Remote Sensing Cloud Removal