JOURNAL ARTICLE

Reliability-Guided Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation

Zikun Zhou, Kaige Mao, Wenjie Pei, Hongpeng Wang, Yaowei Wang, Zhenyu He

Year: 2024   Journal: IEEE Transactions on Neural Networks and Learning Systems   Vol: 36 (4)   Pages: 7514-7528   Publisher: Institute of Electrical and Electronics Engineers

Abstract

This article aims to solve the video object segmentation (VOS) task in a scribble-supervised manner, in which VOS models are not only initialized with sparse target scribbles for inference but also trained by sparse scribble annotations. Thus, the annotation burdens for both initialization and training can be substantially lightened. The difficulties of scribble-supervised VOS lie in two aspects: 1) it demands a strong reasoning ability to carefully segment the target given only a sparse initial target scribble and 2) it necessitates learning dense prediction from sparse scribble annotations during training, requiring powerful learning capability. In this work, we propose a reliability-guided hierarchical memory network (RHMNet) for this task, which segments the target in a stepwise expanding strategy w.r.t. the memory reliability level. To be specific, RHMNet maintains a reliability-guided memory bank. It first uses the high-reliability memory to locate the region with high reliability belonging to the target, i.e., highly similar to the initial target scribble. Then, it expands the located high-reliability region to the entire target conditioned on the region itself and all existing memories. In addition, we propose a scribble-supervised learning mechanism to facilitate the model learning for dense prediction. It exploits the pixel-level relations within a single frame and the instance-level variations across multiple frames to take full advantage of the scribble annotations in sequence training samples. The favorable performance on four popular benchmarks demonstrates that our method is promising. Our project is available at: https://github.com/mkg1204/RHMNet-for-SSVOS.
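
The stepwise expanding strategy described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the project repository for that): it assumes per-pixel feature vectors, replaces the learned network with plain cosine-similarity matching, and uses hypothetical names (`stepwise_segment`, `high_mem`, `all_mem`, `tau`) chosen only for illustration.

```python
import numpy as np

def _normalize(x):
    # Row-wise L2 normalization so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

def stepwise_segment(feats, high_mem, all_mem, all_labels, tau=0.9):
    """Toy two-step segmentation (illustrative assumption, not RHMNet itself).

    feats      : (N, D) features of the pixels in the current frame
    high_mem   : (M, D) high-reliability target features (e.g. scribble pixels)
    all_mem    : (K, D) all memory features
    all_labels : (K,)   1 = target, 0 = background, one label per memory entry
    tau        : cosine-similarity threshold for the high-reliability region
    """
    f = _normalize(feats)

    # Step 1: locate pixels highly similar to the high-reliability memory.
    sim_high = f @ _normalize(high_mem).T              # (N, M)
    seed = sim_high.max(axis=1) >= tau                 # (N,) boolean

    # Step 2: expand to the whole target, conditioned on the located
    # region itself (added to the bank as target) and all existing memories.
    bank = np.concatenate([_normalize(all_mem), f[seed]], axis=0)
    labels = np.concatenate(
        [all_labels, np.ones(int(seed.sum()), dtype=all_labels.dtype)]
    )
    nearest = (f @ bank.T).argmax(axis=1)              # nearest bank entry
    mask = labels[nearest].astype(bool) | seed
    return seed, mask

# Four toy "pixels" with 2-D features; the first memory entry is target.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.6, 0.5], [0.0, 1.0]])
high_mem = np.array([[1.0, 0.0]])
all_mem = np.array([[1.0, 0.0], [0.0, 1.0]])
all_labels = np.array([1, 0])
seed, mask = stepwise_segment(feats, high_mem, all_mem, all_labels, tau=0.95)
```

In this toy input the first step keeps only the two pixels closest to the scribble feature, and the second step extends the mask to a third, less similar pixel while leaving the background pixel out.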

Keywords:
Computer science; Initialization; Artificial intelligence; Reliability (semiconductor); Task (project management); Inference; Segmentation; Object (grammar); Supervised learning; Machine learning; Pattern recognition (psychology); Artificial neural network

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 2.12
Refs: 90
Citation Normalized Percentile: 0.79

Topics

Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Scribble-Supervised Video Object Segmentation

Peiliang Huang, Junwei Han, Nian Liu, Jun Ren, Dingwen Zhang

Journal: IEEE/CAA Journal of Automatica Sinica   Year: 2021   Vol: 9 (2)   Pages: 339-353

JOURNAL ARTICLE

Scribble-Supervised Video Object Segmentation via Scribble Enhancement

Xingyu Gao, Zhitian Li, Hailong Shi, Zhenyu Chen, Peilin Zhao

Journal: IEEE Transactions on Circuits and Systems for Video Technology   Year: 2025   Vol: 35 (4)   Pages: 2999-3012

JOURNAL ARTICLE

Hierarchical Memory Matching Network for Video Object Segmentation

Hongje Seong, Seoung Wug Oh, Joon‐Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Journal: 2021 IEEE/CVF International Conference on Computer Vision (ICCV)   Year: 2021   Pages: 12869-12878