High-performance robust speech recognition using stereo training data

Li Deng; Alex Acero; Lì Jiāng; Jasha Droppo; Xuedong Huang

doi:10.1109/icassp.2001.940827

JOURNAL ARTICLE

Year: 2002 Vol: 1 Pages: 301-304

Abstract

We describe SPLICE (Stereo-based Piecewise Linear Compensation for Environments), a novel technique for high-performance robust speech recognition. It is an efficient noise reduction and channel distortion compensation technique that makes effective use of stereo training data. We present a new version of SPLICE using the minimum-mean-square-error (MMSE) decision, and describe an extension that trains clusters of hidden Markov models (HMMs) with SPLICE processing. Comprehensive results on a Wall Street Journal large-vocabulary recognition task with a wide range of noise types show that SPLICE outperforms recognition under noisy matched conditions (19% word error rate reduction). The new technique is also shown to consistently outperform the spectral-subtraction noise reduction technique, and is currently being integrated into the Microsoft MiPad, a new-generation PDA prototype.
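
The abstract does not give the estimation formulas, so the following is only a minimal sketch of one commonly described form of stereo-based piecewise linear compensation, not the authors' implementation. It assumes the noisy feature space is partitioned by a diagonal-covariance Gaussian mixture whose parameters are already known, that each mixture region `s` has a single additive correction vector `r[s]` learned from paired (stereo) clean and noisy frames, and that the minimum-mean-square-error estimate of the clean frame is the noisy frame plus the posterior-weighted sum of those corrections. All names (`posteriors`, `train_corrections`, `enhance`) and the synthetic data are hypothetical.

```python
import numpy as np

def posteriors(Y, weights, means, variances):
    """Return p(s | y) for each noisy frame in Y (N, D) under a diagonal GMM."""
    diff = Y[:, None, :] - means[None, :, :]                      # (N, S, D)
    log_lik = -0.5 * np.sum(diff ** 2 / variances
                            + np.log(2.0 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_lik                          # (N, S)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def train_corrections(X, Y, weights, means, variances):
    """Learn one correction vector per region from stereo pairs (clean X, noisy Y)."""
    post = posteriors(Y, weights, means, variances)               # (N, S)
    num = post.T @ (X - Y)                                        # (S, D)
    den = post.sum(axis=0)[:, None]                               # (S, 1)
    return num / np.maximum(den, 1e-12)

def enhance(Y, r, weights, means, variances):
    """Estimate clean frames as y + sum_s p(s | y) * r[s]."""
    post = posteriors(Y, weights, means, variances)
    return Y + post @ r

# Synthetic stereo data (an assumption for illustration, not data from the paper):
# two well-separated noisy regions, each with its own constant distortion.
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[-5.0, -5.0], [5.0, 5.0]])
variances = np.ones((2, 2))
labels = rng.integers(0, 2, size=400)
Y = means[labels] + rng.normal(size=(400, 2))                     # noisy channel
offsets = np.array([[1.0, 1.0], [-2.0, -2.0]])                    # clean minus noisy
X = Y + offsets[labels]                                           # clean channel

r = train_corrections(X, Y, weights, means, variances)
X_hat = enhance(Y, r, weights, means, variances)
print("noisy MSE:   ", np.mean((Y - X) ** 2))
print("enhanced MSE:", np.mean((X_hat - X) ** 2))
```

In this toy setting the learned corrections recover the per-region offsets because the regions barely overlap; with real speech features the posteriors are softer and the estimate blends several corrections per frame.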

Keywords: Computer science, Speech recognition, Training (meteorology), Training set, Artificial intelligence, Pattern recognition (psychology), Computer vision

Metrics

Cited by: 124

FWCI (Field Weighted Citation Impact): 6.28

Citation Normalized Percentile: 0.97

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Data Compression Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Multi-style training of HMMs with stereo data for reverberation-robust speech recognition

Limited Training Data Robust Speech Recognition Using Kernel-Based Acoustic Models

Stereo-based stochastic mapping with discriminative training for noise robust speech recognition

Cepstral Vector Normalization Based on Stereo Data for Robust Speech Recognition

Stereo-Based Stochastic Mapping for Robust Speech Recognition