Wujie Zhou, Yuqi Cai, Liting Zhang, Weiqing Yan, Lu Yu
Mirror segmentation, an emerging task in computer vision, involves identifying and marking mirror regions in an image. Current mirror segmentation methods rely on fixed mirror elements as features for object segmentation. However, these methods do not account for the varied quality of feature images obtained under complex real-world conditions, leading to inaccurate segmentation results. To address these limitations, we propose a novel uncertainty-aware transformer localization network (UTLNet) for RGB-D mirror segmentation. Our approach draws inspiration from biomimicry, specifically the behavior pattern of human observation: we explore features from different viewpoints and focus on ambiguous features that are challenging to resolve during the encoding stage. Additionally, we employ graph convolution to construct complementary dual-modal fusion features. Furthermore, we design a multiscale interaction transformer module based on the shifted-window self-attention mechanism to acquire precise position information. In our experiments, the proposed UTLNet surpasses current state-of-the-art mirror segmentation methods as well as related task-specific methods, achieving superior performance across various evaluation scenarios.
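To illustrate the shifted-window self-attention idea mentioned above, here is a minimal sketch in NumPy: attention is computed only among tokens inside each non-overlapping window, and a cyclic shift before partitioning lets information cross window boundaries. The function name, identity Q/K/V projections, and parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, window=4, shift=0):
    """Self-attention restricted to non-overlapping windows of a feature map.

    x: (H, W, C) feature map; H and W must be divisible by `window`.
    shift: cyclic shift applied before partitioning (the "shifted window"
    variant). Identity Q/K/V projections keep the sketch minimal; a real
    module would use learned projections and multiple heads.
    """
    H, W, C = x.shape
    if shift:
        # Cyclically shift so that window boundaries move between layers.
        x = np.roll(x, (-shift, -shift), axis=(0, 1))
    out = np.empty_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            # Flatten one window into a sequence of tokens.
            win = x[i:i + window, j:j + window].reshape(-1, C)
            # Scaled dot-product attention within the window only.
            attn = softmax(win @ win.T / np.sqrt(C))
            out[i:i + window, j:j + window] = (attn @ win).reshape(window, window, C)
    if shift:
        # Undo the shift to restore the original spatial layout.
        out = np.roll(out, (shift, shift), axis=(0, 1))
    return out
```

Alternating `shift=0` and `shift=window // 2` across successive layers is the usual way such windowed attention propagates information globally while keeping the per-window cost constant.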