JOURNAL ARTICLE

Fine-Grained Information Supplementation and Value-Guided Learning for Remote Sensing Image-Text Retrieval

Zihui Zhou, Yong Feng, Agen Qiu, Guangyao Duan, Mingliang Zhou

Year: 2024
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Vol: 17
Pages: 19194-19210
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Remote sensing (RS) image-text retrieval is a practical and challenging task that has received considerable attention. Currently, most approaches rely on either convolutional neural networks or Transformers, which cannot effectively extract both global and fine-grained features simultaneously. Furthermore, the high intra-modal similarity typical of the RS domain poses a challenge for feature learning. In addition, most studies neglect the differing characteristics of model training at different stages. To tackle these problems, we propose a fine-grained information supplementation (FGIS) and value-guided learning model that leverages prior knowledge in the RS domain for feature supplementation and employs a value-guided training approach to learn fine-grained, expressive, and robust feature representations. Specifically, we introduce the FGIS module to supplement fine-grained visual features, thereby enhancing perception of both global and local features. Furthermore, we mitigate the problem of high intra-modal similarity by proposing two loss functions: the weighted contrastive loss and the scene-adaptive fine-grained perceptual loss. Finally, we design a value-guided learning framework that focuses on the most important information at each stage of training. Extensive experiments on the Remote Sensing Image Captioning Dataset (RSICD) and the Remote Sensing Image-Text Match Dataset (RSITMD) verify the effectiveness and superiority of our model.
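
The abstract names a weighted contrastive loss but does not give its formula. As a rough illustration of the general idea only, the sketch below implements a bidirectional image-text contrastive loss in which each negative pair is scaled by a weight that grows with its similarity, so that look-alike negatives (the high intra-modal similarity case) contribute more. The function names, the `exp(beta * similarity)` weighting, and the `temperature` and `beta` parameters are assumptions made for this example and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_contrastive_loss(image_embs, text_embs, temperature=0.1, beta=1.0):
    """Illustrative bidirectional contrastive loss with hard-negative weighting.

    Hypothetical weighting (NOT the paper's formula): every negative pair is
    scaled by exp(beta * similarity), so negatives that resemble the query
    count more. Setting beta=0 recovers the plain InfoNCE-style loss.
    Pair i of image_embs is assumed to match pair i of text_embs.
    """
    n = len(image_embs)
    sim = [[cosine(image_embs[i], text_embs[j]) for j in range(n)]
           for i in range(n)]

    def one_direction(get_sim):
        total = 0.0
        for i in range(n):
            pos = math.exp(get_sim(i, i) / temperature)
            neg = 0.0
            for j in range(n):
                if j != i:
                    s = get_sim(i, j)
                    # weight * exponentiated similarity of the negative pair
                    neg += math.exp(beta * s) * math.exp(s / temperature)
            total += -math.log(pos / (pos + neg))
        return total / n

    image_to_text = one_direction(lambda i, j: sim[i][j])
    text_to_image = one_direction(lambda i, j: sim[j][i])
    return 0.5 * (image_to_text + text_to_image)


if __name__ == "__main__":
    # Toy 2-D embeddings: two images and their two matching captions.
    images = [[1.0, 0.0], [0.0, 1.0]]
    captions = [[0.9, 0.1], [0.2, 0.8]]
    print(weighted_contrastive_loss(images, captions))
```

In this sketch a larger `beta` penalizes confusable negatives more strongly, which is one plausible way to counteract high similarity among RS samples; the paper's actual weighting may differ.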

Keywords:
Computer science; Image retrieval; Value (mathematics); Image (mathematics); Remote sensing; Information retrieval; Artificial intelligence; Computer vision; Machine learning; Geography

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 2.12
Refs: 71
Citation Normalized Percentile: 0.81

Topics

Image Retrieval and Classification Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition