Shuo Wang, Breanna Shi, Ninglian Wang, Yuzhu Zhang, Yan Zhu
Fully exploiting spectral correlation to improve spatial resolution is a central focus of current hyperspectral image super-resolution (HSI-SR) research. Existing approaches either combine deep models with advanced attention mechanisms in an end-to-end framework or concentrate on modeling prior estimates of the spectral bands and the spatial domain. Most of these methods are designed for supervised learning with paired labels, yet they may also benefit from self-supervised techniques such as masked autoencoders (MAE). This work addresses the single hyperspectral image super-resolution problem and develops a multiscale masked hybrid convolution-transformer framework. Its starting point is to apply a random mask to the input signal to reduce the redundancy of the original features, which the model combines with multiscale representation inference to improve its learning and generalization capabilities. However, we found that naively deploying MAE for HSI-SR leads to subpar performance. To address this and coordinate with the multiscale network, we propose a multiscale interchannel masking fusion strategy that saves computational overhead and bridges the gap between spectral and spatial resolution. Extensive evaluations on three benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods.
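The abstract does not specify the exact masking scheme, but the interchannel masking at multiple scales it describes can be illustrated with a minimal sketch. The function names (`random_channel_mask`, `multiscale_channel_masks`), the mask ratios, and the zero-fill convention below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def random_channel_mask(hsi, mask_ratio=0.5, rng=None):
    """Zero out a random subset of spectral channels of an HSI cube.

    hsi: array of shape (C, H, W); mask_ratio: fraction of channels hidden.
    Returns the masked cube and a boolean array marking visible channels.
    """
    rng = np.random.default_rng(rng)
    c = hsi.shape[0]
    n_masked = int(round(c * mask_ratio))
    # Choose which spectral channels to hide, without replacement.
    masked_idx = rng.choice(c, size=n_masked, replace=False)
    visible = np.ones(c, dtype=bool)
    visible[masked_idx] = False
    out = hsi.copy()
    out[~visible] = 0.0  # masked channels are filled with zeros
    return out, visible

def multiscale_channel_masks(hsi, ratios=(0.75, 0.5, 0.25), rng=None):
    """One masked view per scale: heavier masking for coarser branches.

    A multiscale network could feed each view to its matching branch and
    fuse the resulting features, in the spirit of the proposed strategy.
    """
    rng = np.random.default_rng(rng)
    return [random_channel_mask(hsi, r, rng) for r in ratios]
```

Masking whole spectral channels (rather than spatial patches, as in image MAE) exploits the high inter-band redundancy of hyperspectral data: the network must reconstruct hidden bands from their spectral neighbors, which directly encourages learning the spectral correlations that HSI-SR relies on.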