JOURNAL ARTICLE

VASD: Video Action Scene Detector Using Audio Visual Data

Abstract

This paper presents a method that integrates audio and visual information for human action scene analysis. The approach is top-down: it determines and extracts action scenes in video by analyzing both audio and video data. We propose a framework for recognizing actions by measuring image- and action-based information from video, with the following characteristics: feature extraction is done automatically; the method deals with both visual and auditory information and captures both spatial and temporal characteristics; and the extracted features are natural, in the sense that they are closely related to human perceptual processing. Our aim was to implement action identification by extracting syntactic properties of a video, such as edge features, colour distribution, audio, and motion vectors. We present a simple method for human activity recognition based on hidden Markov models (HMMs) for sensing, learning, and training the actions. In addition, we use audio-visual features to distinguish human actions and to reach a decision. We describe the use of a model that diagnoses the states of a human activity based on events from video. We review the model, present an implementation, and report on experiments that demonstrate the robustness of the framework.
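
To make the HMM-based decision step concrete, the following is a minimal sketch, not the authors' implementation. It assumes each video frame has already been quantized into a discrete audio-visual symbol (here 0 for low motion and quiet audio, 1 for high motion and loud audio), trains nothing, and simply scores a symbol sequence against one HMM per class with the forward algorithm. The class names, the two-state topology, and every probability in `MODELS` are hypothetical values chosen for illustration; the abstract does not specify them.

```python
import math


def safe_log(p):
    """Return log(p), mapping zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space: log P(obs | model) for a discrete HMM.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state models (illustrative numbers, not from the paper).
# Each entry is (start, trans, emit) over the symbol alphabet {0, 1}.
MODELS = {
    "action": ([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.2, 0.8], [0.1, 0.9]]),
    "non_action": ([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.9, 0.1], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(classify([1, 1, 0, 1, 1], MODELS))
    print(classify([0, 0, 0, 1, 0], MODELS))
```

In a fuller system, the model parameters would be learned from labelled scenes (for example with Baum-Welch) and the symbols would come from a codebook over the edge, colour, audio, and motion features; both steps are outside the scope of this sketch.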

Keywords:
Computer science; Hidden Markov model; Artificial intelligence; Feature extraction; Robustness (evolution); Computer vision; Perception; Pattern recognition (psychology); Speech recognition

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 15
Citation Normalized Percentile: 0.11

Topics

Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing