JOURNAL ARTICLE

Exploring Difference Semantic Prior Guidance for Remote Sensing Image Change Captioning

Yunpeng Li, Xiangrong Zhang, Guanchun Wang, T. Zhang

Year: 2026   Journal: Remote Sensing   Vol: 18 (2)   Pages: 232-232   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Understanding complex change scenes is a crucial challenge in the remote sensing field. The remote sensing image change captioning (RSICC) task has emerged as a promising approach for translating the changes that appear between bi-temporal remote sensing images into textual descriptions, enabling users to make accurate decisions. Current RSICC methods frequently struggle to maintain contextual consistency and to exploit semantic prior guidance. This study therefore explores a difference semantic prior guidance network that reasons context-rich sentences describing the visual changes that appear. Specifically, a context-aware difference module is introduced to guarantee the consistency of unchanged and changed context features, strengthening multi-level change information to improve the representation of semantic change features. Moreover, to develop the higher-level cognitive ability needed to reason about salient and weak changes, we employ difference comprehending with shallow change information to learn semantic change knowledge. In addition, the designed parallel cross refined attention in the Transformer decoder balances visual difference and semantic knowledge for implicit knowledge distillation, enabling fine-grained perception of changes in semantic details and reducing pseudo-changes. Comparisons with advanced algorithms on the LEVIR-CC and Dubai-CC datasets validate the outstanding performance of the designed model on RSICC tasks. Notably, on the LEVIR-CC dataset, it reaches a CIDEr score of 143.34%, a 3.11% improvement over the most competitive method, SAT-cap.

Keywords:
Closed captioning; Sentence; Consistency (knowledge bases); Perception; Context (archaeology); Transformer; Change detection; Semantic feature

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 28
Citation Normalized Percentile: 0.71

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence