JOURNAL ARTICLE

HyFormer: Hybrid Grouping-Aggregation Transformer and Wide-Spanning CNN for Hyperspectral Image Super-Resolution

Yantao JiJingang ShiYaping ZhangHaokun YangYuan ZongXu Ling

Year: 2023 Journal:   Remote Sensing Vol: 15 (17)Pages: 4131-4131   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Hyperspectral image (HSI) super-resolution is a practical and challenging task as it requires the reconstruction of a large number of spectral bands. Achieving excellent reconstruction results can greatly benefit subsequent downstream tasks. The current mainstream hyperspectral super-resolution methods mainly utilize 3D convolutional neural networks (3D CNN) for design. However, the commonly used small kernel size in 3D CNN limits the model’s receptive field, preventing it from considering a wider range of contextual information. Though the receptive field could be expanded by enlarging the kernel size, it results in a dramatic increase in model parameters. Furthermore, the popular vision transformers designed for natural images are not suitable for processing HSI. This is because HSI exhibits sparsity in the spatial domain, which can lead to significant computational resource waste when using self-attention. In this paper, we design a hybrid architecture called HyFormer, which combines the strengths of CNN and transformer for hyperspectral super-resolution. The transformer branch enables intra-spectra interaction to capture fine-grained contextual details at each specific wavelength. Meanwhile, the CNN branch facilitates efficient inter-spectra feature extraction among different wavelengths while maintaining a large receptive field. Specifically, in the transformer branch, we propose a novel Grouping-Aggregation transformer (GAT), comprising grouping self-attention (GSA) and aggregation self-attention (ASA). The GSA is employed to extract diverse fine-grained features of targets, while the ASA facilitates interaction among heterogeneous textures allocated to different channels. In the CNN branch, we propose a Wide-Spanning Separable 3D Attention (WSSA) to enlarge the receptive field while keeping a low parameter number. Building upon WSSA, we construct a wide-spanning CNN module to efficiently extract inter-spectra features. Extensive experiments demonstrate the superior performance of our HyFormer.

Keywords:
Computer science Hyperspectral imaging Artificial intelligence Pattern recognition (psychology) Transformer Convolutional neural network Receptive field Kernel (algebra) Computer vision Voltage Mathematics

Metrics

2
Cited By
0.43
FWCI (Field Weighted Citation Impact)
50
Refs
0.62
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Advanced Image Fusion Techniques
Physical Sciences →  Engineering →  Media Technology
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image and Signal Denoising Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.