Discriminative stream‐weight training for Mandarin audio‐visual speech recognition

Guanyong Wu; Jie Zhu

doi:10.1080/02533839.2010.9671667

Year: 2010; Journal: Journal of the Chinese Institute of Engineers; Vol: 33 (5); Pages: 775-780; Publisher: Taylor & Francis

Abstract

For large‐vocabulary audio‐visual speech recognition, two discriminative stream‐weight training methods are proposed to improve system robustness and reduce the word error rate. State‐dependent stream weights are trained by lattice rescoring under the minimum phone error (MPE) and boosted maximum mutual information (BMMI) criteria, respectively, using the extended Baum‐Welch algorithm. Experimental results show considerable error reductions over systems that use global stream weights. The proposed methods also give better results than stream‐weight training based on minimum classification error (MCE).
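To make the distinction between global and state‐dependent stream weights concrete, the following is a minimal sketch of how a multi‐stream HMM typically combines audio and visual scores: each state's log‐likelihood is a weighted sum of the per‐stream log‐likelihoods. The function name `combine_stream_scores`, the constraint that the two weights sum to one, and all numeric values are illustrative assumptions, not taken from the article, and the sketch does not implement the MPE or BMMI training itself.

```python
import numpy as np


def combine_stream_scores(log_lik_audio, log_lik_visual, audio_weight):
    """Weighted sum of per-state audio and visual log-likelihoods.

    audio_weight is either a scalar (one global weight shared by all
    states) or an array with one entry per state (state-dependent).
    Assumption for this sketch: visual weight = 1 - audio weight.
    """
    log_lik_audio = np.asarray(log_lik_audio, dtype=float)
    log_lik_visual = np.asarray(log_lik_visual, dtype=float)
    audio_weight = np.asarray(audio_weight, dtype=float)
    if np.any(audio_weight < 0.0) or np.any(audio_weight > 1.0):
        raise ValueError("stream weights must lie in [0, 1]")
    return audio_weight * log_lik_audio + (1.0 - audio_weight) * log_lik_visual


# Hypothetical log-likelihoods for three HMM states on one frame.
audio = [-10.0, -12.0, -8.0]
visual = [-6.0, -15.0, -9.0]

# One global weight applied to every state.
global_scores = combine_stream_scores(audio, visual, 0.5)

# A separate (hypothetical) weight for each state.
state_scores = combine_stream_scores(audio, visual, [0.9, 0.5, 0.2])

print(global_scores, int(np.argmax(global_scores)))
print(state_scores, int(np.argmax(state_scores)))
```

With these made‐up numbers the best‐scoring state differs between the two weightings, which is the kind of effect that discriminative training of the weights would aim to exploit.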

Keywords:

Discriminative model; Computer science; Speech recognition; Word error rate; Robustness (evolution); Mandarin Chinese; Vocabulary; Artificial intelligence; Training (meteorology); Pattern recognition (psychology)

Metrics

FWCI (Field Weighted Citation Impact): 0.33

Citation Normalized Percentile: 0.62

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Advanced Adaptive Filtering Techniques

Physical Sciences → Engineering → Computational Mechanics

Blind Source Separation Techniques

Physical Sciences → Computer Science → Signal Processing

Related Documents

Minimum phone error based stream weight training for Mandarin audio-visual speech recognition

Discriminative training of HMM stream exponents for audio-visual speech recognition

Combined discriminative training for multi-stream HMM-based audio-visual speech recognition

Dynamic Stream Weight Modeling for Audio-Visual Speech Recognition

Discriminative HMM stream model for Mandarin digit string speech recognition