JOURNAL ARTICLE

Masked Vision Transformers for Hyperspectral Image Classification

Abstract

Transformer architectures have become state-of-the-art models in computer vision and natural language processing. To a significant degree, their success can be attributed to self-supervised pre-training on large-scale unlabeled datasets. This work investigates the use of self-supervised masked image reconstruction to advance transformer models for hyperspectral remote sensing imagery. To facilitate self-supervised pre-training, we build a large dataset of unlabeled hyperspectral observations from the EnMAP satellite and systematically investigate modifications of the vision transformer architecture that leverage the characteristics of hyperspectral data. We find significant improvements in accuracy on different land cover classification tasks over both standard vision and sequence transformers using (i) blockwise patch embeddings, (ii) spatial-spectral self-attention, (iii) spectral positional embeddings, and (iv) masked self-supervised pre-training. The resulting model outperforms standard transformer architectures by +5% accuracy on a labeled subset of our EnMAP data and by +15% on the Houston2018 hyperspectral dataset, making it competitive with a strong 3D convolutional neural network baseline. In an ablation study on label efficiency based on the Houston2018 dataset, self-supervised pre-training significantly improves transformer accuracy when little labeled training data is available. The self-supervised model outperforms randomly initialized transformers and the 3D convolutional neural network by +7-8% when only 0.1-10% of the training labels are available.
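
The abstract names blockwise patch embeddings and masked reconstruction without showing what they operate on. The following is a minimal NumPy sketch of the general idea, not the authors' implementation: a hyperspectral cube is cut into spatial-spectral blocks, each block is flattened into a token, a random subset of tokens is hidden, and the reconstruction error is scored on the hidden tokens only. The function names, the block sizes (4 x 4 pixels by 8 bands), the 75% mask ratio, and the mean-token stand-in for a model are all illustrative assumptions.

```python
import numpy as np

def blockwise_tokens(cube, patch, band_block):
    """Split an (H, W, C) cube into 3D blocks and flatten each block into one token."""
    h, w, c = cube.shape
    if h % patch or w % patch or c % band_block:
        raise ValueError("cube dimensions must be divisible by the block size")
    nh, nw, nc = h // patch, w // patch, c // band_block
    blocks = cube.reshape(nh, patch, nw, patch, nc, band_block)
    # Move the block indices to the front: (nh, nw, nc, patch, patch, band_block).
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(nh * nw * nc, patch * patch * band_block)

def random_mask(n_tokens, mask_ratio, rng):
    """Return a boolean vector that is True for tokens hidden from the model."""
    n_masked = int(round(n_tokens * mask_ratio))
    mask = np.zeros(n_tokens, dtype=bool)
    mask[rng.choice(n_tokens, size=n_masked, replace=False)] = True
    return mask

def masked_mse(pred, target, mask):
    """Mean squared error computed on the masked tokens only."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 8, 32)).astype(np.float32)   # hypothetical 8x8 scene, 32 bands
tokens = blockwise_tokens(cube, patch=4, band_block=8)   # 2*2*4 tokens of length 4*4*8
mask = random_mask(len(tokens), mask_ratio=0.75, rng=rng)
visible = tokens[~mask]                                  # what an encoder would see

# Stand-in for a model: predict every token as the mean of the visible ones.
prediction = np.broadcast_to(visible.mean(axis=0), tokens.shape)
loss = masked_mse(prediction, tokens, mask)
```

In an actual pre-training setup, a transformer encoder and decoder would replace the mean-token stand-in, and the same masked loss would drive the weight updates.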

Keywords:
Hyperspectral imaging; Computer science; Artificial intelligence; Convolutional neural network; Transformer; Pattern recognition (psychology); Leverage (statistics); Machine learning; Engineering

Metrics

Cited By: 72
FWCI (Field Weighted Citation Impact): 15.63
Refs: 71
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Remote Sensing and Land Use
Physical Sciences →  Earth and Planetary Sciences →  Atmospheric Science

Related Documents

JOURNAL ARTICLE

Masked Vision Transformer for Fast Hyperspectral Image Classification

Liguo Wang, Heng Wang, Shoulin Yin, Lifeng Wang

Journal: IEEE Transactions on Geoscience and Remote Sensing, Year: 2025, Vol: 63, Pages: 1-16
JOURNAL ARTICLE

Nested Transformers for Hyperspectral Image Classification

Zitong Zhang, Qiaoyu Ma, Heng Zhou, Na Gong

Journal: Journal of Sensors, Year: 2022, Vol: 2022, Pages: 1-16
JOURNAL ARTICLE

Dynamics of Masked Image Modeling in Hyperspectral Image Classification

Chen Ma, Huayi Li, Junjun Jiang, César Aybar, Jiaqi Yao, Gustau Camps‐Valls

Journal: IEEE Transactions on Geoscience and Remote Sensing, Year: 2025, Vol: 63, Pages: 1-15