Video semantic segmentation using deep multi-view representation learning

Akrem Sellami; Salvatore Tabbone

doi:10.1109/icpr48806.2021.9413239

Year: 2021; Pages: 1-7

Abstract

In this paper, we propose a deep learning model based on deep multi-view representation learning to address the video object segmentation task. The proposed model emphasizes the inherent correlation between video frames and incorporates multi-view representation learning based on deep canonically correlated autoencoders. Multi-view representation learning provides an efficient mechanism for capturing these correlations by jointly extracting useful features and learning a better representation in a joint feature space, i.e., a shared representation. To increase the training data and the learning capacity, we train the proposed model with pairs of video frames, i.e., Fa and Fb. During the segmentation phase, the deep canonically correlated autoencoder model encodes useful features by processing multiple reference frames together, and these features are used to detect frequently reappearing objects. Our model improves on state-of-the-art deep learning-based methods, which mainly focus on learning discriminative foreground representations over appearance and motion. Experimental results on two large benchmarks show that the proposed method outperforms competitive approaches and achieves good semantic segmentation performance.
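
The abstract does not give the training objective, so the following is only a minimal sketch of the generic deep canonically correlated autoencoder loss (negative canonical correlation between the two encoded views plus a weighted reconstruction error), written with NumPy. The function names, the regularizer `reg`, the weight `lam`, and the linear encoder/decoder in the demo are all hypothetical and are not taken from the paper; only the idea of pairing two frames Fa and Fb comes from the abstract.

```python
import numpy as np


def _inv_sqrt(mat):
    # Inverse square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def total_correlation(h_a, h_b, reg=1e-4):
    """Sum of canonical correlations between two feature matrices (N x d)."""
    n = h_a.shape[0]
    h_a = h_a - h_a.mean(axis=0)
    h_b = h_b - h_b.mean(axis=0)
    # Regularized covariances keep the inverse square roots well defined.
    s_aa = h_a.T @ h_a / (n - 1) + reg * np.eye(h_a.shape[1])
    s_bb = h_b.T @ h_b / (n - 1) + reg * np.eye(h_b.shape[1])
    s_ab = h_a.T @ h_b / (n - 1)
    t = _inv_sqrt(s_aa) @ s_ab @ _inv_sqrt(s_bb)
    # The singular values of t are the canonical correlations.
    return np.linalg.svd(t, compute_uv=False).sum()


def dccae_loss(x_a, x_b, h_a, h_b, rec_a, rec_b, lam=1.0):
    """Negative correlation of the codes plus weighted reconstruction error."""
    rec_err = (np.mean(np.sum((x_a - rec_a) ** 2, axis=1))
               + np.mean(np.sum((x_b - rec_b) ** 2, axis=1)))
    return -total_correlation(h_a, h_b) + lam * rec_err


# Demo with hypothetical data: 64 flattened patches from two frames, Fa and Fb.
rng = np.random.default_rng(0)
f_a = rng.normal(size=(64, 16))
f_b = f_a + 0.1 * rng.normal(size=(64, 16))   # Fb is a perturbed copy of Fa
w_enc = rng.normal(size=(16, 4))              # stand-in linear encoder
w_dec = rng.normal(size=(4, 16))              # stand-in linear decoder
code_a, code_b = f_a @ w_enc, f_b @ w_enc
loss = dccae_loss(f_a, f_b, code_a, code_b, code_a @ w_dec, code_b @ w_dec)
print("loss:", loss)
```

In a real model the encoders and decoders would be deep networks trained by gradient descent on this loss; the random linear maps here only exercise the arithmetic.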

Keywords:

Computer science; Artificial intelligence; Feature learning; Segmentation; Deep learning; Discriminative model; Representation (politics); Focus (optics); Pattern recognition (psychology); Encoder; Feature (linguistics); Autoencoder; Object (grammar); Multi-task learning; Frame (networking); Computer vision; Machine learning; Task (project management)

Metrics

FWCI (Field Weighted Citation Impact): 0.51

Citation Normalized Percentile: 0.64

Topics

Visual Attention and Saliency Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Multi-view semantic temporal video segmentation

Deep Multi-view Representation Learning for Video Anomaly Detection Using Spatiotemporal Autoencoders

Improving deep learning based semantic segmentation with multi view outlier correction

Multi-Level Representation Learning with Semantic Alignment for Referring Video Object Segmentation