Active post-refined multimodality video semantic concept detection with tensor representation

Yanan Liu; Fei Wu; Yueting Zhuang; Jun Xiao

doi:10.1145/1459359.1459372

JOURNAL ARTICLE

Year: 2008 Pages: 91-100

Abstract

In this paper, we address the problem of multi-modality video representation and semantic concept detection. The interaction and integration of the multiple modalities in video, such as visual, audio, and textual data, are essential to video semantic analysis. Traditionally, videos are represented as vectors in Euclidean space, and learning algorithms are then applied to these vectors in a high-dimensional space for dimension reduction, classification, clustering, and so on. However, the modalities in video not only have their own properties but are also correlated with one another, whereas a simple vector representation weakens the power of these relatively independent modalities and even ignores their relations to some extent. We introduce a higher-order tensor framework for video analysis, in which the image, audio, and text modalities of each video shot are represented as a data point in the form of a 3rd-order tensor, called a tensorshot. We propose a novel dimension reduction method that explicitly considers the manifold structure of the tensor space arising from the temporally associated co-occurrence of multimodal media data, and we then detect video semantic concepts with classifiers that take tensors as input. Our algorithm preserves the intrinsic structure of the submanifold from which tensorshots are sampled and can also map out-of-sample data points directly. Moreover, we apply an active-learning-based contextual and temporal post-refining strategy to enhance detection accuracy. Experimental results show that our method improves the performance of video semantic concept detection.
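The contrast the abstract draws between a flat vector and a 3rd-order tensor can be illustrated with a minimal sketch. The abstract does not say how a tensorshot is built or how the projection is learned, so everything here is an assumption made for illustration: the outer-product construction, the function names, the toy feature values, and the fixed matrix `U` (which stands in for a learned projection and is not the paper's manifold-based method).

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]
Tensor3 = List[List[List[float]]]


def concat_representation(image: Vector, audio: Vector, text: Vector) -> Vector:
    """Traditional flat representation: the modalities are concatenated."""
    return image + audio + text


def tensor_representation(image: Vector, audio: Vector, text: Vector) -> Tensor3:
    """Hypothetical tensorshot built as a 3rd-order outer product.

    T[i][j][k] = image[i] * audio[j] * text[k], so each modality keeps
    its own axis (mode) instead of being flattened into one vector.
    """
    return [[[a * b * c for c in text] for b in audio] for a in image]


def mode1_project(tensor: Tensor3, U: Matrix) -> Tensor3:
    """Mode-1 product: Y[p][j][k] = sum_i U[p][i] * T[i][j][k].

    Reduces the size of the first mode only; the other modes are untouched.
    """
    n_j = len(tensor[0])
    n_k = len(tensor[0][0])
    return [
        [
            [sum(row[i] * tensor[i][j][k] for i in range(len(tensor)))
             for k in range(n_k)]
            for j in range(n_j)
        ]
        for row in U
    ]


# Toy per-shot features (made-up values, one vector per modality).
image = [1.0, 2.0, 3.0]
audio = [0.5, 1.5]
text = [2.0, 0.0, 1.0, 4.0]

flat = concat_representation(image, audio, text)   # length 3 + 2 + 4
shot = tensor_representation(image, audio, text)   # shape 3 x 2 x 4

# Arbitrary 2 x 3 matrix standing in for a learned projection (assumption).
U = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
reduced = mode1_project(shot, U)                   # shape 2 x 2 x 4
```

Because the projection in this sketch is a fixed linear map per mode, it can be applied to a previously unseen shot without retraining, which is one way to read the abstract's remark about mapping out-of-sample data points directly.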

Keywords:

Computer science; Tensor (intrinsic definition); Artificial intelligence; Representation (politics); Modality (human–computer interaction); Structure tensor; Dimensionality reduction; Pattern recognition (psychology); Dimension (graph theory); Modalities; Feature vector; Computer vision; Image (mathematics); Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 1.77

Citation Normalized Percentile: 0.88

Topics

Video Analysis and Summarization

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Human Pose and Action Recognition

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Tensor-Based Transductive Learning for Multimodality Video Semantic Concept Detection

Transductive Multi-Modality Video Semantic Concept Detection with Tensor Representation

Semantic Concept Detection for User-Generated Video Content Using a Refined Image Folksonomy

Improving Automatic Video Retrieval with Semantic Concept Detection

Video semantic concept detection using ontology