Jiaze Gu, Lulin Fan, Jing Zhao, Xianghai Cao
Conventional remote sensing object detection models are largely constrained by their dependence on closed-set annotations, limiting their ability to detect objects absent from the training dataset. To enable generalization to novel object categories, we propose CoseDet, a novel open-vocabulary object detection framework that integrates the semantic richness of vision-language pretraining models with the precise localization capability of modern detection architectures. Specifically, CoseDet augments a Faster R-CNN detector with a ResNet50-FPN backbone by incorporating RemoteCLIP-based embeddings through a pseudo-word mechanism, which aligns high-dimensional visual features with robust textual semantics. Furthermore, a convolutional block attention module (CBAM) is employed to refine feature representations, and explicit modeling of surrounding regions is used to capture crucial contextual dependencies. Comprehensive experiments on four datasets demonstrate that CoseDet not only outperforms state-of-the-art methods but also provides a robust and generalizable solution for open-vocabulary object detection in complex remote sensing scenarios.
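The pseudo-word mechanism described above can be illustrated with a minimal NumPy sketch: a learned linear projection maps each region's visual feature into the text-embedding space, where cosine similarity against class-name embeddings yields open-vocabulary scores. All dimensions, names, and the random stand-ins for detector RoI features and RemoteCLIP text embeddings are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 1024-d detector RoI features projected into a
# 512-d text-embedding space over 5 class prompts (all assumptions).
VIS_DIM, TXT_DIM, NUM_CLASSES = 1024, 512, 5

# Learned linear projection that turns a region's visual feature into a
# "pseudo-word" vector living in the text-embedding space.
W = rng.normal(scale=0.02, size=(VIS_DIM, TXT_DIM))

def pseudo_word(roi_feat: np.ndarray) -> np.ndarray:
    """Project an RoI feature into the text space and L2-normalize it."""
    z = roi_feat @ W
    return z / np.linalg.norm(z)

def classify(roi_feat: np.ndarray, class_embeds: np.ndarray) -> np.ndarray:
    """Cosine similarity between the pseudo-word and each class-name
    embedding, followed by a softmax to produce open-vocabulary scores."""
    sims = class_embeds @ pseudo_word(roi_feat)
    e = np.exp(sims - sims.max())
    return e / e.sum()

# Stand-ins for (pre-normalized) text embeddings of the class prompts.
class_embeds = rng.normal(size=(NUM_CLASSES, TXT_DIM))
class_embeds /= np.linalg.norm(class_embeds, axis=1, keepdims=True)

scores = classify(rng.normal(size=VIS_DIM), class_embeds)
print(scores.shape)
```

Because classification happens entirely in the text-embedding space, swapping in the embedding of a previously unseen class name is enough to score novel categories, which is the core of the open-vocabulary setting.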