Yunpeng Li, Xiangrong Zhang, Guanchun Wang, T. Zhang
Understanding complex change scenes is a crucial challenge in the remote sensing field. The remote sensing image change captioning (RSICC) task has emerged as a promising approach that translates the changes between bi-temporal remote sensing images into textual descriptions, enabling users to make accurate decisions. Current RSICC methods often struggle to maintain contextual consistency and to exploit semantic prior guidance. This study therefore explores a difference semantic prior guidance network that generates context-rich sentences capturing the observed visual changes. Specifically, a context-aware difference module is introduced to guarantee the consistency of unchanged/changed context features, strengthening multi-level change information to improve the representation of semantic change features. Moreover, to mine higher-level cognitive ability for reasoning about both salient and weak changes, we employ difference comprehension with shallow change information to realize semantic change knowledge learning. In addition, a parallel cross refined attention mechanism designed for the Transformer decoder balances visual differences and semantic knowledge through implicit knowledge distillation, enabling fine-grained perception of semantic details and reducing pseudo-changes. Compared with advanced algorithms on the LEVIR-CC and Dubai-CC datasets, experimental results validate the outstanding performance of the designed model on RSICC tasks. Notably, on the LEVIR-CC dataset, it reaches a CIDEr score of 143.34%, a 3.11% improvement over the most competitive method, SAT-Cap.
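The "parallel cross refined attention" idea described above — decoder queries attending separately to vision-difference features and semantic-prior features before fusing the two streams — can be sketched as follows. This is a minimal illustrative layer, not the authors' exact design; the class name, the gated fusion, and all dimensions are assumptions introduced for illustration.

```python
import torch
import torch.nn as nn

class ParallelCrossAttention(nn.Module):
    """Illustrative sketch: decoder word queries attend in parallel to
    vision-difference tokens and semantic-prior tokens; a learned gate
    then balances the two refined streams (hypothetical design)."""

    def __init__(self, d_model=256, num_heads=8):
        super().__init__()
        # Two cross-attention branches run in parallel over the same queries.
        self.vis_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.sem_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        # Gate decides, per position and channel, how to mix the branches.
        self.gate = nn.Sequential(nn.Linear(2 * d_model, d_model), nn.Sigmoid())
        self.norm = nn.LayerNorm(d_model)

    def forward(self, queries, vis_feats, sem_feats):
        v_out, _ = self.vis_attn(queries, vis_feats, vis_feats)  # vision-difference branch
        s_out, _ = self.sem_attn(queries, sem_feats, sem_feats)  # semantic-prior branch
        g = self.gate(torch.cat([v_out, s_out], dim=-1))
        # Residual connection plus gated fusion of the two streams.
        return self.norm(queries + g * v_out + (1 - g) * s_out)

layer = ParallelCrossAttention()
q = torch.randn(2, 10, 256)    # decoder word queries (batch, words, dim)
vis = torch.randn(2, 49, 256)  # vision-difference tokens, e.g. a 7x7 feature map
sem = torch.randn(2, 49, 256)  # semantic-prior tokens
out = layer(q, vis, sem)
print(out.shape)  # torch.Size([2, 10, 256])
```

The gated fusion is one simple way to "balance" the two knowledge sources; the paper's actual refinement strategy may differ.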