JOURNAL ARTICLE

Video Object Segmentation with 3D Convolution Network

Abstract

We explore a novel method for semi-supervised video object segmentation built on a dedicated spatiotemporal feature extraction structure. Because a 3D convolutional network convolves over a volume of consecutive frames, it captures spatial and temporal information at the same time. Our network has three parts: a visual module, a motion module, and a decoder module. The visual module learns appearance features of the object in the first frame, so that the network can detect that specific object in the following frames. The motion module uses a 3D convolutional network to extract spatiotemporal information from the image sequence, learning how the object's appearance and location vary over time. The decoder module combines the outputs of the visual and motion modules through concatenation and upsampling to produce the foreground object mask. We evaluate our model on the DAVIS segmentation dataset [15]. Thanks to the visual module, our model needs no online training, unlike most detection-based methods. As a result, it takes 0.14 seconds per frame to produce a mask, which is 71 times faster than the state-of-the-art method OSVOS [2]. Our model also outperforms most methods proposed in recent years, and its mean IoU is comparable to that of state-of-the-art methods.
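The idea that a 3D convolution sees both space and time can be illustrated with a small sketch. The code below is not the authors' network: it is a naive single-channel 3D convolution in plain Python, and the helper names (`conv3d_valid`, `make_clip`) and the temporal-difference kernel are assumptions chosen only for illustration. It shows that a kernel spanning two frames responds to a moving pixel but stays silent on a static one.

```python
def conv3d_valid(volume, kernel):
    """Naive single-channel 3D cross-correlation with 'valid' padding.

    volume: nested list indexed [t][y][x] (frames, rows, columns)
    kernel: nested list indexed [dt][dy][dx]
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kT + 1):
        plane = []
        for y in range(H - kH + 1):
            row = []
            for x in range(W - kW + 1):
                acc = 0.0
                # Sum over the kernel's temporal and spatial extent.
                for dt in range(kT):
                    for dy in range(kH):
                        for dx in range(kW):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_clip(positions, size=4):
    """Build a toy clip (hypothetical data): one bright pixel per frame at (y, x)."""
    clip = []
    for (py, px) in positions:
        frame = [[0.0] * size for _ in range(size)]
        frame[py][px] = 1.0
        clip.append(frame)
    return clip


# Illustrative 2x1x1 kernel: next frame minus current frame at each pixel.
temporal_diff = [[[-1.0]], [[1.0]]]

static_clip = make_clip([(1, 1), (1, 1), (1, 1)])  # pixel never moves
moving_clip = make_clip([(1, 1), (1, 2), (1, 3)])  # pixel shifts right each frame

static_response = conv3d_valid(static_clip, temporal_diff)
moving_response = conv3d_valid(moving_clip, temporal_diff)
```

In this toy setting `static_response` is zero everywhere, while `moving_response` is negative where the pixel left and positive where it arrived. A learned 3D network would use many multi-channel kernels with spatial extent as well, but the indexing over `[t][y][x]` is the same.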

Keywords:
Artificial intelligence; Computer science; Upsampling; Computer vision; Concatenation (mathematics); Convolution (computer science); Object (grammar); Feature (linguistics); Segmentation; Frame (networking); Dimension (graph theory); Object detection; Motion (physics); Feature extraction; Image segmentation; Pattern recognition (psychology); Image (mathematics); Artificial neural network; Mathematics
