JOURNAL ARTICLE

Automatic speech recognition using audio visual cues

Abstract

Automatic speech recognition (ASR) systems have gained popularity because many multimedia applications require robust speech recognition algorithms. Using both audio and visual information in speaker-independent continuous speech recognition improves performance over audio-only systems: recognition rates increase markedly when visual data supplements the audio, because video is less susceptible to ambient noise than audio. This paper presents a robust audio-visual speech recognition (AVSR) system that uses a coupled hidden Markov model (CHMM) to fuse the audio and video modalities. The application records the input data and recognizes isolated words in the input file over a wide range of signal-to-noise ratios (SNR). Experimental results show an increase of about 10% in recognition rate for the AVSR system compared to audio-only ASR, and about 20% compared to video-only ASR, at an SNR of 5 dB.

Keywords:
Speech recognition, Computer science, Audio mining, Hidden Markov model, Speech coding, Speaker recognition, Acoustic model, Voice activity detection, Noise (video), Artificial intelligence, Speech processing, Image (mathematics)

Metrics

Cited by: 13
FWCI (Field-Weighted Citation Impact): 0.65
References: 9
Citation Normalized Percentile: 0.72


Topics

Speech and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Multisensory perception and integration (Social Sciences → Psychology → Experimental and Cognitive Psychology)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)