JOURNAL ARTICLE

Acoustic super models for large scale video event detection

Abstract

Given the exponential growth of videos published on the Internet, mechanisms for clustering, searching, and browsing large numbers of videos have become a major research area. More importantly, there is a demand for event detectors that go beyond simply finding objects and instead detect more abstract concepts, such as "feeding an animal" or a "wedding ceremony". This article presents an approach to event classification that enables searching for arbitrary events, including more abstract concepts, in found video collections based on analysis of the audio track. The approach does not rely on speech processing and is language-independent. Instead, it generates models for a set of example query videos using a mixture of two types of audio features: Linear-Frequency Cepstral Coefficients and Modulation Spectrogram Features. The approach can complement video analysis and requires no domain-specific tagging. Applied to the TRECVid MED 2011 development set, which consists of more than 4000 random "wild" videos from the Internet, the approach achieved a detection accuracy of 64%, including those videos that do not contain an audio track.
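
The abstract names Linear-Frequency Cepstral Coefficients (LFCCs) as one of the two feature types but gives no extraction details. As a rough illustration only, and not the authors' implementation, the sketch below computes LFCCs for a single audio frame in the textbook way: power spectrum, triangular filters spaced evenly in frequency (rather than on the mel scale), log energies, then a DCT-II. The function names, the Hamming window, the frame length, and the filter and coefficient counts are all assumptions chosen for readability.

```python
import math

def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append(re * re + im * im)
    return spec

def linear_filterbank(num_filters, num_bins):
    """Triangular filters with centres evenly spaced over the frequency bins."""
    edges = [i * (num_bins - 1) / (num_filters + 1) for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def lfcc(frame, num_filters=8, num_coeffs=5):
    """LFCCs of one frame; parameter values are illustrative assumptions."""
    spec = power_spectrum(frame)
    bank = linear_filterbank(num_filters, len(spec))
    # Small constant avoids log(0) for silent bands.
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    # Unnormalised DCT-II over the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for c in range(num_coeffs)]

# Example: a 64-sample frame containing two sinusoids.
frame = [math.sin(0.9 * i) + 0.5 * math.sin(2.3 * i) for i in range(64)]
coeffs = lfcc(frame)
```

In practice such coefficients would be computed over many overlapping frames and with an FFT rather than the quadratic-time DFT used here for self-containment. How the resulting features are combined with Modulation Spectrogram Features and turned into models is not specified in the abstract and is left out of the sketch.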

Keywords:
Computer science; Spectrogram; Event (particle physics); Set (abstract data type); Cluster analysis; Speech recognition; Complement (music); Mel-frequency cepstrum; The Internet; Detector; Artificial intelligence; Pattern recognition (psychology); Feature extraction

Metrics

Cited By: 20
FWCI (Field Weighted Citation Impact): 3.73
Refs: 12
Citation Normalized Percentile: 0.94

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Video Analysis and Summarization
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Large-Scale Video Event Detection

Guangnan Ye

Journal: Columbia Academic Commons (Columbia University) Year: 2015

BOOK-CHAPTER

Large-Scale Video Event Detection Using Deep Neural Networks

Guangnan Ye

Auerbach Publications eBooks Year: 2018 Pages: 1-23