Video Object Segmentation with 3D Convolution Network

Huiyun Tang; Pin Tao; Rui Ma; Yuanchun Shi

doi:10.1145/3341016.3341031

JOURNAL ARTICLE

Year: 2019 Vol: 2 Pages: 28-32

Abstract

We explore a novel method for semi-supervised video object segmentation built around a dedicated spatiotemporal feature extraction structure. Because a 3D convolutional network convolves over a volume of consecutive frames, it offers a direct way to capture spatial and temporal information together. Our network is composed of three parts: the visual module, the motion module, and the decoder module. The visual module learns the appearance features of the object in the first frame so that the network can detect that specific object in the following frames. The motion module uses a 3D convolutional network to extract spatiotemporal information from the image sequence, learning how the object's appearance and location vary over time. The decoder module produces the foreground object mask from the outputs of the visual and motion modules using a concatenation and upsampling structure. We evaluate our model on the DAVIS segmentation dataset [15]. Thanks to the visual module, our model does not need online training, unlike most detection-based methods. As a result, it takes 0.14 seconds per frame to produce a mask, which is 71 times faster than the state-of-the-art method OSVOS [2]. Our model also outperforms most methods proposed in recent years, and its mean IoU is comparable with state-of-the-art methods.
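
To illustrate why a 3D convolution captures temporal as well as spatial information, here is a minimal sketch in plain Python. It is not the authors' motion module: the function names, the toy clip, and the hand-written temporal-difference kernel are all assumptions made for illustration, whereas a trained network would learn its kernels. As in most deep learning frameworks, the "convolution" is implemented as a cross-correlation.

```python
def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation over a (T, H, W) volume of nested lists."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over time (dt) as well as space (dy, dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_clip(num_frames, size):
    """Hypothetical toy clip: one bright pixel moving one step right per frame."""
    clip = []
    for t in range(num_frames):
        frame = [[0.0] * size for _ in range(size)]
        frame[1][t] = 1.0
        clip.append(frame)
    return clip


# Hypothetical 2x1x1 kernel: next frame minus current frame (temporal difference).
temporal_diff = [[[-1.0]], [[1.0]]]

moving = make_clip(num_frames=3, size=4)
static = [moving[0]] * 3  # the same frame repeated, so nothing moves

moving_response = conv3d_valid(moving, temporal_diff)
static_response = conv3d_valid(static, temporal_diff)
```

In this toy setup the static clip yields an all-zero response, while the moving clip yields a negative value where the pixel was and a positive value where it arrived. A purely 2D kernel applied frame by frame could not distinguish the two clips in this way.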

Keywords:

Artificial intelligence; Computer science; Upsampling; Computer vision; Concatenation (mathematics); Convolution (computer science); Object (grammar); Feature (linguistics); Segmentation; Frame (networking); Dimension (graph theory); Object detection; Motion (physics); Feature extraction; Image segmentation; Pattern recognition (psychology); Image (mathematics); Artificial neural network; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.08

Topics

Visual Attention and Saliency Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

SpVOS: Efficient Video Object Segmentation With Triple Sparse Convolution

ACCLVOS: Atrous Convolution with Spatial-Temporal ConvLSTM for Video Object Segmentation

Siamese Network with Interactive Transformer for Video Object Segmentation

Semi-supervised Video Object Segmentation with Recurrent Neural Network

Guided Co-Segmentation Network for Fast Video Object Segmentation