Twin-HMM-based audio-visual speech enhancement

Ahmed Hussen Abdelaziz; Steffen Zeiler; Dorothea Kolossa

doi:10.1109/icassp.2013.6638354

JOURNAL ARTICLE

Year: 2013 Pages: 3726-3730

Abstract

Most approaches to speech signal processing rely solely on acoustic input, so spectrum estimation becomes exceedingly difficult when the signal-to-noise ratio drops to values near 0 dB. However, alternative sources of information are becoming widely available with the increasing use of multimedia data in everyday communication. In this paper, we propose using video input as an auxiliary modality for speech processing by applying a new statistical model, the twin hidden Markov model. The resulting enhancement algorithm for audiovisual data greatly outperforms the standard audio-only log-MMSE estimator on all considered instrumental speech quality measures, covering spectral and perceptual quality.

Keywords:

Hidden Markov model; Computer science; Speech recognition; Estimator; Speech processing; Speech enhancement; Modality (human–computer interaction); Noise (video); Audio signal processing; SIGNAL (programming language); Markov model; Sound quality; Signal-to-noise ratio (imaging); Speech coding; Audio signal; Artificial intelligence; Markov chain; Noise reduction; Machine learning; Mathematics; Statistics; Image (mathematics); Telecommunications

Metrics

FWCI (Field Weighted Citation Impact): 2.36

Citation Normalized Percentile: 0.90

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Advanced Adaptive Filtering Techniques

Physical Sciences → Engineering → Computational Mechanics

Related Documents

Introducing the Turbo-Twin-HMM for Audio-Visual Speech Enhancement

Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition

HMM-based text-to-audio-visual speech synthesis

Inventory-based audio-visual speech enhancement

HMM modeling for audio-visual speech recognition