Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Ho Kei Cheng; Yu‐Wing Tai; Chi‐Keung Tang

doi:10.1109/cvpr46437.2021.00551

JOURNAL ARTICLE

Year: 2021

Abstract

We present the Modular interactive VOS (MiVOS) framework, which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance. Trained separately, the interaction module converts user interactions to an object mask, which is then temporally propagated by our propagation module using a novel top-k filtering strategy in reading the space-time memory. To effectively take the user's intent into account, a novel difference-aware module is proposed to learn how to properly fuse the masks before and after each interaction, which are aligned with the target frames by employing the space-time memory. We evaluate our method both qualitatively and quantitatively with different forms of user interactions (e.g., scribbles, clicks) on DAVIS to show that our method outperforms current state-of-the-art algorithms while requiring fewer frame interactions, with the additional advantage of generalizing to different types of user interactions. We contribute a large-scale synthetic VOS dataset with pixel-accurate segmentation of 4.8M frames to accompany our source code to facilitate future research.
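
The top-k filtering mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal NumPy version that assumes memory and query features have already been flattened to one column per spatial location, uses a plain dot-product affinity, and introduces the hypothetical name `topk_memory_read`. For each query location, only the k highest-affinity memory locations are kept, the softmax is taken over those k entries alone, and the memory values are averaged with the resulting weights.

```python
import numpy as np

def topk_memory_read(mem_keys, mem_values, query_keys, k):
    """Illustrative top-k memory read (hypothetical helper, not the paper's code).

    mem_keys:   (Ck, N) key features for N memory locations
    mem_values: (Cv, N) value features for the same N locations
    query_keys: (Ck, M) key features for M query locations
    Returns:    (Cv, M) values read from memory for each query location
    """
    # Assumption: dot-product affinity between memory and query locations.
    affinity = mem_keys.T @ query_keys                        # (N, M)
    k = max(1, min(k, affinity.shape[0]))

    # Indices and values of the k largest affinities per query column.
    top_idx = np.argpartition(-affinity, k - 1, axis=0)[:k]   # (k, M)
    top_val = np.take_along_axis(affinity, top_idx, axis=0)   # (k, M)

    # Softmax restricted to the retained entries (shifted for stability).
    top_val = top_val - top_val.max(axis=0, keepdims=True)
    weights = np.exp(top_val)
    weights /= weights.sum(axis=0, keepdims=True)

    # Scatter the weights back; every filtered-out location gets weight 0.
    sparse = np.zeros_like(affinity, dtype=float)
    np.put_along_axis(sparse, top_idx, weights, axis=0)

    return mem_values @ sparse                                # (Cv, M)

# Toy usage: 3 memory locations, 2 query locations, 1 value channel.
mem_keys = np.eye(3)
mem_values = np.array([[10.0, 20.0, 30.0]])
query_keys = np.array([[0.0, 1.0],
                       [1.0, 0.0],
                       [0.0, 0.0]])
read_k1 = topk_memory_read(mem_keys, mem_values, query_keys, k=1)
read_all = topk_memory_read(mem_keys, mem_values, query_keys, k=3)
```

With k equal to the number of memory locations, the sketch reduces to an ordinary softmax-weighted read; smaller k discards low-affinity locations before normalization, so they contribute nothing to the result.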

Keywords:

Computer science; Modular design; Fuse (electrical); Object (grammar); Segmentation; Interaction technique; Generalizability theory; Frame (networking); Computer vision; Artificial intelligence; Human–computer interaction; Gesture

Metrics

Cited By: 174

FWCI (Field Weighted Citation Impact): 14.11

Citation Normalized Percentile: 0.99

Topics

Visual Attention and Saliency Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Vision and Imaging

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Regional Video Object Segmentation by Efficient Motion-Aware Mask Propagation

Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation

Video Object Segmentation with Joint Re-identification and Attention-Aware Mask Propagation

Attention-Guided Mask Propagation for Video Object Segmentation