JOURNAL ARTICLE

Remote sensing scene classification with masked image modeling

Abstract

Remote sensing scene classification has been extensively studied for its critical role in geological survey, oil exploration, traffic management, earthquake prediction, wildfire monitoring, and intelligence monitoring. Machine learning (ML) methods for this task have mainly relied on backbones pretrained with supervised learning (SL). Masked Image Modeling (MIM), a self-supervised learning (SSL) technique, has been shown to be a better way to learn visual feature representations, and it therefore presents a new opportunity to improve ML performance on scene classification. This research explores the potential of MIM-pretrained backbones on four well-known classification datasets: Merced, AID, NWPU-RESISC45, and Optimal-31. Compared with published benchmarks, we show that MIM-pretrained Vision Transformer (ViT) backbones outperform other alternatives (by up to 18% in top-1 accuracy) and that MIM learns better feature representations than its supervised learning counterparts (by up to 5% in top-1 accuracy). Moreover, we show that general-purpose MIM-pretrained ViTs achieve performance competitive with the specially designed yet complicated Transformer for Remote Sensing (TRS) framework. Our experimental results also provide a performance baseline for future studies.

Keywords:
Computer science; Transformer; Artificial intelligence; Feature learning; Task (project management); Machine learning; Feature (linguistics); Representation (politics); Supervised learning; Feature extraction; Pattern recognition (psychology); Artificial neural network; Engineering

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.43
Refs: 105
Citation Normalized Percentile: 0.63

Topics

Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

FishermaskFormer: Lightweight Remote Sensing Scene Classification With Masked Transformer

Wei Wu, Xianbin Hu, Zhu Li, Xueliang Luo

Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Year: 2025 Vol: 18 Pages: 15829-15844
JOURNAL ARTICLE

Remote Sensing Image Scene Classification

Md. Arafat Hussain, Emon Kumar Dey

Journal: International Journal of Engineering and Manufacturing Year: 2018 Vol: 8 (4) Pages: 13-20
JOURNAL ARTICLE

Learning scene-vectors for remote sensing image scene classification

Rajeshreddy Datla, Nazil Perveen, C. Krishna Mohan

Journal: Neurocomputing Year: 2024 Vol: 587 Pages: 127679-127679