WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

Peidong Liu; Zibin He; Xiyu Yan; Yong Jiang; Shu‐Tao Xia; Feng Zheng; Maowei Hu

doi:10.1145/3474085.3475217


JOURNAL ARTICLE


Year: 2021 Pages: 2995-3004


Abstract

Compared with tedious per-pixel mask annotation, annotating data with clicks is much easier and costs only several seconds per image. However, using clicks to train a video semantic segmentation model has not been explored before. In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, which saves laborious annotation effort by segmenting an instance of a semantic class with only a single click. Since clicks do not capture detailed semantic information, training directly with click labels leads to poor segmentation predictions. To mitigate this problem, we design a novel memory flow knowledge distillation strategy that exploits temporal information (named memory flow) in abundant unlabeled video frames by distilling the neighboring predictions to the target frame via estimated motion. Moreover, we adopt vanilla knowledge distillation for model compression. In this way, WeClick learns compact video semantic segmentation models from low-cost click annotations during training, and the resulting models are accurate and run in real time at inference. Experimental results on Cityscapes and CamVid show that WeClick outperforms the state-of-the-art methods, improves on the baseline by 10.24% mIoU, and achieves real-time execution.
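The abstract's central idea, distilling a neighboring frame's predictions to the target frame via estimated motion, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the flow convention (per-pixel x and y offsets from the target frame into the neighboring frame), the nearest-neighbor warping, and the KL-divergence loss are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def warp_nearest(probs, flow):
    """Backward-warp class probabilities (C, H, W) into the target frame.

    Assumed convention: flow[0] and flow[1] hold the x and y displacement
    from each target pixel to its matching location in the neighboring frame.
    Out-of-range locations are clamped to the image border.
    """
    _, h, w = probs.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return probs[:, src_y, src_x]


def distillation_loss(target_logits, neighbor_logits, flow, eps=1e-8):
    """Mean per-pixel KL(warped neighbor prediction || target prediction)."""
    teacher = warp_nearest(softmax(neighbor_logits), flow)
    student = softmax(target_logits)
    kl = (teacher * (np.log(teacher + eps) - np.log(student + eps))).sum(axis=0)
    return float(kl.mean())
```

With zero flow and identical logits for both frames the loss is zero; it grows as the warped neighboring prediction and the target prediction disagree. In practice the flow would come from an optical flow estimator and the warp would typically be bilinear, details this sketch leaves out.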

Keywords:

Computer science; Segmentation; Pipeline (software); Artificial intelligence; Inference; Exploit; Natural language processing; Machine learning; Pattern recognition (psychology); Computer vision; Programming language

Metrics

FWCI (Field Weighted Citation Impact): 0.82

Citation Normalized Percentile: 0.74

Topics

Visual Attention and Saliency Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition


Related Documents

Weakly-Supervised Ultrasound Video Segmentation with Minimal Annotations

Seminar Learning for Click-Level Weakly Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation with Extremely Sparse Annotations for Land Cover Mapping

Weakly Supervised Semantic Segmentation Learning on UAV Video Sequences

Weakly-Supervised Medical Image Segmentation with Gaze Annotations