JOURNAL ARTICLE

CSiT: A Multiscale Vision Transformer for Hyperspectral Image Classification

Wenxuan He, Weiliang Huang, Shuhong Liao, Zhen Xu, Jingwen Yan

Year: 2022; Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Vol: 15; Pages: 9266-9277; Publisher: Institute of Electrical and Electronics Engineers

Abstract

A hyperspectral image (HSI) contains nearly continuous spectral information; thus, a target of interest can be accurately identified from subtle details of its spectral properties. Spectral resolution at different scales can capture different levels of spectral features: small-scale spectral bands are beneficial for extracting global details in vision transformers, while large-scale spectral bands are more effective for local features. Transformers show advantages in global information extraction through the self-attention module and even surpass convolutional neural networks (CNNs) in various tasks. Some works based on the vision transformer have performed surprisingly well in HSI classification. However, single-scale vision transformers are insufficient to balance the extraction of local details and redundancy on different scales. A recent work, the multiscale vision transformer, has provided a solution using spatial patch-wise features in image classification. Inspired by this, we propose the cross-spectral vision transformer (CSiT), which uses two branches to extract pixel-wise multiscale features, and further design a multiscale spectral embedding module to enhance local details between neighboring spectral bands. Moreover, based on the cross-attention operation, a single token from each branch is treated as a query and used to exchange information with the other branch. We evaluate the classification performance of the proposed CSiT on three classic HSI datasets with extensive experiments, showing that the multiscale vision transformer architecture yields promising results for HSI classification with 1-D spectral bands.
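
The two-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The 200-band pixel, the band-group sizes (4 and 10), the embedding width, the random weights, and the helper names `split_bands` and `cross_attention` are all assumptions made for illustration. The sketch cuts one pixel's 1-D spectrum into tokens at two scales, then lets a single class token from each branch act as the query over the other branch's tokens.

```python
import numpy as np


def split_bands(spectrum, band_size):
    """Group a 1-D spectrum into non-overlapping tokens of `band_size` bands.

    Trailing bands that do not fill a whole token are dropped (a simplification).
    """
    n_tokens = spectrum.shape[0] // band_size
    return spectrum[: n_tokens * band_size].reshape(n_tokens, band_size)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(cls_token, other_tokens, w_q, w_k, w_v):
    """Single-head cross-attention with one query token.

    cls_token: shape (d,), the class token of one branch (the query).
    other_tokens: shape (n, d), the tokens of the other branch (keys/values).
    Returns the updated class token, shape (d,), with a residual connection.
    """
    q = cls_token @ w_q                    # (d,)
    k = other_tokens @ w_k                 # (n, d)
    v = other_tokens @ w_v                 # (n, d)
    scores = k @ q / np.sqrt(q.shape[0])   # (n,) scaled dot products
    weights = softmax(scores)              # attention over the other branch
    return cls_token + weights @ v


rng = np.random.default_rng(0)
d = 16                                     # assumed embedding width
spectrum = rng.normal(size=200)            # one pixel with 200 bands (assumed)

# Two scales of spectral tokens (group sizes are assumptions).
small = split_bands(spectrum, 4)           # shape (50, 4)
large = split_bands(spectrum, 10)          # shape (20, 10)

# Linear projections of both branches into a shared width d.
emb_small = small @ rng.normal(size=(4, d)) * 0.1
emb_large = large @ rng.normal(size=(10, d)) * 0.1
cls_small = rng.normal(size=d)
cls_large = rng.normal(size=d)

# For brevity both directions share one set of weights; a real model
# would likely learn separate weights per branch.
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# Each branch's class token queries the other branch's tokens.
new_cls_small = cross_attention(cls_small, emb_large, w_q, w_k, w_v)
new_cls_large = cross_attention(cls_large, emb_small, w_q, w_k, w_v)
```

Because only one token per branch serves as the query, the cost of the exchange grows linearly with the number of tokens in the other branch rather than quadratically, which is the usual motivation for this style of cross-attention.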

Keywords:
Hyperspectral imaging, Artificial intelligence, Computer science, Computer vision, Transformer, Pattern recognition (psychology), Pixel, Redundancy (engineering), Spectral bands, Embedding, Feature extraction, Remote sensing, Engineering, Geography

Metrics

Cited By: 38
FWCI (Field Weighted Citation Impact): 5.17
Refs: 50
Citation Normalized Percentile: 0.95
Is in top 1%
Is in top 10%

Topics

Remote-Sensing Image Classification
Physical Sciences → Engineering → Media Technology
Remote Sensing and Land Use
Physical Sciences → Earth and Planetary Sciences → Atmospheric Science
Advanced Image Fusion Techniques
Physical Sciences → Engineering → Media Technology

Related Documents

JOURNAL ARTICLE

Multiscale Sample Transformer for Hyperspectral Image Classification

Weitao Zhang, Nuo Xu, Yv Bai, Yaru Zhang

Journal: IEEE Transactions on Geoscience and Remote Sensing; Year: 2025; Vol: 63; Pages: 1-18

JOURNAL ARTICLE

Multiscale Super Token Transformer for Hyperspectral Image Classification

Zhe Meng, Taizheng Zhang, Feng Zhao, Gaige Chen, Miaomiao Liang

Journal: IEEE Geoscience and Remote Sensing Letters; Year: 2024; Vol: 21; Pages: 1-5

JOURNAL ARTICLE

Masked Vision Transformer for Fast Hyperspectral Image Classification

Liguo Wang, Heng Wang, Shoulin Yin, Lifeng Wang

Journal: IEEE Transactions on Geoscience and Remote Sensing; Year: 2025; Vol: 63; Pages: 1-16

JOURNAL ARTICLE

Hybrid Vision Transformer Model for Hyperspectral Image Classification

Jiaqi Yang, Bo Du, Chen Wu

Journal: IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium; Year: 2022; Pages: 1388-1391