Zefeng Chen, Zhijiang Li, Y. Y. Xue, L. Zhang
ABSTRACT
Camouflaged object detection (COD) aims to identify and segment objects that closely resemble and are seamlessly integrated into their surrounding environments, making it a challenging task in computer vision. COD is constrained by the limited availability of training data and annotated samples, and most carefully designed COD models exhibit diminished performance under low-data conditions. In recent years, there has been increasing interest in leveraging foundation models, which have demonstrated robust general capabilities and superior generalisation performance, to address COD challenges. This work proposes a knowledge-guided domain adaptation (KGDA) approach to tackle the data scarcity problem in COD. The method utilises knowledge descriptions generated by multimodal large language models (MLLMs) for camouflaged images, aiming to enhance the model's comprehension of semantic objects and camouflaged scenes through highly abstract and generalised knowledge representations. To resolve ambiguities and errors in the generated text descriptions, a multi-level knowledge aggregation (MLKG) module is devised, which consolidates consistent semantic knowledge and forms multi-level semantic knowledge features. To incorporate this semantic knowledge into the visual foundation model, we introduce a knowledge-guided semantic enhancement adaptor (KSEA) that integrates the semantic knowledge of camouflaged objects while preserving the original knowledge of the foundation model. Extensive experiments demonstrate that our method surpasses 19 state-of-the-art approaches and exhibits strong generalisation capabilities even with limited annotated data.
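The abstract gives no implementation details for the KSEA adaptor. The following is a minimal sketch, under the assumption of a PyTorch-style frozen vision backbone, of one common way such an adaptor could fuse text-derived knowledge tokens into visual features while a residual path preserves the backbone's original representation. All names here (KnowledgeAdaptor, d_bottleneck, the token shapes) are hypothetical illustrations, not the authors' implementation.

```python
# Hypothetical sketch of a knowledge-guided adaptor; not the paper's code.
import torch
import torch.nn as nn

class KnowledgeAdaptor(nn.Module):
    """Bottleneck adapter that injects text-derived knowledge features
    into a frozen vision backbone block via cross-attention."""
    def __init__(self, d_model: int = 768, d_bottleneck: int = 64, n_heads: int = 8):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)       # compress visual tokens
        self.know_proj = nn.Linear(d_model, d_bottleneck)  # project knowledge tokens
        self.cross_attn = nn.MultiheadAttention(
            d_bottleneck, n_heads, batch_first=True)       # visual queries attend to knowledge
        self.up = nn.Linear(d_bottleneck, d_model)         # restore model dimension
        self.act = nn.GELU()

    def forward(self, vis_tokens: torch.Tensor, know_tokens: torch.Tensor) -> torch.Tensor:
        # vis_tokens:  (B, N, d_model) features from a frozen backbone block
        # know_tokens: (B, K, d_model) aggregated semantic knowledge features
        q = self.act(self.down(vis_tokens))
        kv = self.know_proj(know_tokens)
        fused, _ = self.cross_attn(q, kv, kv)
        # Residual connection keeps the foundation model's original knowledge intact
        return vis_tokens + self.up(fused)

# Example: inject 4 knowledge tokens into 196 visual tokens
adaptor = KnowledgeAdaptor()
vis = torch.randn(2, 196, 768)
know = torch.randn(2, 4, 768)
out = adaptor(vis, know)  # shape: (2, 196, 768)
```

The bottleneck-plus-residual design mirrors standard parameter-efficient adapters: only the small adaptor is trained, which fits the paper's low-data motivation.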