JOURNAL ARTICLE

Efficient Inductive Vision Transformer for Oriented Object Detection in Remote Sensing Imagery

Cong Zhang, Jingran Su, Yakun Ju, Kin‐Man Lam, Qi Wang

Year: 2023
Journal: IEEE Transactions on Geoscience and Remote Sensing
Vol: 61
Pages: 1-20
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Object detection is a fundamental task in remote sensing image analysis and scene understanding. Previous remote sensing object detectors are typically based on convolutional neural networks (CNNs), whose performance is significantly limited by the intrinsic locality of convolution operations. The emergence of vision Transformers, which have the capability to be a solid alternative to CNNs, brings potential solutions to this problem. However, three crucial obstacles hinder the application and performance of Transformers in remote sensing object detection: 1) high computational complexity, especially for high-resolution remote sensing images; 2) training and sample inefficiency caused by the lack of inductive bias; and 3) difficulty in learning arbitrary-orientation knowledge of geospatial objects. To address these issues, this paper proposes a novel efficient inductive vision Transformer framework for oriented object detection in remote sensing imagery. The framework follows the hierarchical feature pyramid structure and makes three contributions. 1) Spatial redundancy in remote sensing images is fully explored, and an adaptive multi-grained routing mechanism is proposed to facilitate token sparsity in Transformers, which can dramatically reduce the computational cost without compromising accuracy. 2) A compact dual-path encoding architecture, in which global long-range dependencies and local semantic relations are captured jointly and complementarily, is proposed to enhance inductive bias in Transformers. 3) An angle tokenization technique is proposed to promote the encoding, embedding, and learning of direction knowledge for oriented objects in remote sensing scenarios. These three contributions are instantiated in an advanced Transformer-based object detector, namely EIA-PVT. Comprehensive experiments on two publicly available datasets demonstrate its effectiveness and superiority for oriented object detection in remote sensing images.

Keywords:
Computer science; Artificial intelligence; Object detection; Transformer; Computer vision; Convolutional neural network; Remote sensing application; Inductive bias; Pattern recognition (psychology); Hyperspectral imaging; Multi-task learning; Task (project management); Voltage

Metrics

Cited By: 106
FWCI (Field Weighted Citation Impact): 23.01
Refs: 117
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Remote-Sensing Image Classification
Physical Sciences →  Engineering →  Media Technology
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition