HyFormer: Hybrid Grouping-Aggregation Transformer and Wide-Spanning CNN for Hyperspectral Image Super-Resolution

Yantao Ji; Jingang Shi; Yaping Zhang; Haokun Yang; Yuan Zong; Xu Ling

doi:10.3390/rs15174131

JOURNAL ARTICLE

Year: 2023
Journal: Remote Sensing
Vol: 15 (17)
Pages: 4131
Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Hyperspectral image (HSI) super-resolution is a practical and challenging task because it requires reconstructing a large number of spectral bands, and high-quality reconstruction greatly benefits downstream tasks. Current mainstream hyperspectral super-resolution methods are built mainly on 3D convolutional neural networks (3D CNNs). However, the small kernel sizes commonly used in 3D CNNs limit the model’s receptive field and prevent it from considering a wider range of contextual information. Although the receptive field can be expanded by enlarging the kernel, doing so dramatically increases the number of model parameters. Furthermore, popular vision transformers designed for natural images are not well suited to HSI: HSI is sparse in the spatial domain, so self-attention can waste significant computational resources. In this paper, we design a hybrid architecture called HyFormer, which combines the strengths of CNNs and transformers for hyperspectral super-resolution. The transformer branch enables intra-spectra interaction to capture fine-grained contextual details at each specific wavelength. Meanwhile, the CNN branch performs efficient inter-spectra feature extraction across different wavelengths while maintaining a large receptive field. Specifically, in the transformer branch, we propose a novel Grouping-Aggregation transformer (GAT), comprising grouping self-attention (GSA) and aggregation self-attention (ASA). GSA extracts diverse fine-grained features of targets, while ASA facilitates interaction among heterogeneous textures allocated to different channels. In the CNN branch, we propose Wide-Spanning Separable 3D Attention (WSSA) to enlarge the receptive field while keeping the parameter count low. Building upon WSSA, we construct a wide-spanning CNN module to efficiently extract inter-spectra features. Extensive experiments demonstrate the superior performance of HyFormer.
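
The abstract's point about kernel size and parameter count can be illustrated with a small calculation. The sketch below is not taken from the paper: the channel width of 64, the kernel sizes, and the axis-wise factorisation are all assumptions chosen for illustration, and the factorisation is a generic separable design rather than the WSSA module itself. It compares a dense 3D convolution against three 1D convolutions applied along each axis in turn.

```python
def conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a dense 3D convolution with a cubic k x k x k kernel (no bias)."""
    return c_in * c_out * k ** 3


def separable3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a generic axis-wise factorisation (no bias):
    a k x 1 x 1 conv, then a 1 x k x 1 conv, then a 1 x 1 x k conv.
    This is an illustrative stand-in, not the paper's WSSA design."""
    first = c_in * c_out * k        # c_in -> c_out along the first axis
    rest = 2 * c_out * c_out * k    # c_out -> c_out along the other two axes
    return first + rest


if __name__ == "__main__":
    channels = 64  # assumed channel width, not a value from the paper
    for k in (3, 5, 7):
        dense = conv3d_params(channels, channels, k)
        separable = separable3d_params(channels, channels, k)
        print(f"k={k}: dense={dense:,} separable={separable:,}")
```

Under these assumptions the dense layer grows with the cube of the kernel size while the factorised one grows linearly, which is the kind of trade-off the abstract describes when it motivates a separable design with a wide receptive field.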

Keywords: Computer science; Hyperspectral imaging; Artificial intelligence; Pattern recognition (psychology); Transformer; Convolutional neural network; Receptive field; Kernel (algebra); Computer vision; Voltage; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.43

Citation Normalized Percentile: 0.62

Topics

Advanced Image Fusion Techniques

Physical Sciences → Engineering → Media Technology

Advanced Image Processing Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image and Signal Denoising Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

SSAformer: Spatial–Spectral Aggregation Transformer for Hyperspectral Image Super-Resolution

HAAT: Hybrid Attention Aggregation Transformer for Image Super-Resolution

A band grouping-based hybrid convolution for hyperspectral image super-resolution

Spatial-Spectral Aggregation Transformer With Diffusion Prior for Hyperspectral Image Super-Resolution