Statistical Audio-Visual Data Fusion for Video Scene Segmentation

V. M. Parshin; Liming Chen

doi:10.4018/978-1-59904-370-8.ch004


Year: 2011; IGI Global eBooks; Pages: 68-89; Publisher: IGI Global


Abstract

Automatic segmentation of video into semantic units is important for organizing effective content-based access to long videos. In this work we focus on segmenting video into narrative units called scenes, which are aggregates of shots unified by a common dramatic event or locale. We derive a statistical video scene segmentation approach that detects scene boundaries in one pass, fusing multi-modal audio-visual features in a symmetrical and scalable manner. The approach deals properly with the variability of real-valued features and models their conditional dependence on the context. It also integrates prior information concerning the duration of scenes. Two kinds of features, extracted in the visual and audio domains, are proposed. The results of experimental evaluations carried out on ground-truth video are reported. They show that our approach effectively fuses multiple modalities, with higher performance than an alternative rule-based fusion technique.

Keywords:

Computer science; Artificial intelligence; Segmentation; Computer vision; Ground truth; Audio visual; Context (archaeology); Modal; Fuse (electrical); Pattern recognition (psychology); Multimedia; Geography

Metrics

FWCI (Field Weighted Citation Impact): 1.53

Citation Normalized Percentile: 0.83

Topics

Video Analysis and Summarization

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

