JOURNAL ARTICLE

High-performance robust speech recognition using stereo training data

Abstract

We describe SPLICE (Stereo-based Piecewise Linear Compensation for Environments), a novel technique for high-performance robust speech recognition. It is an efficient noise reduction and channel distortion compensation technique that makes effective use of stereo training data. We present a new version of SPLICE based on the minimum-mean-square-error (MMSE) decision, and describe an extension in which clusters of hidden Markov models (HMMs) are trained with SPLICE processing. Comprehensive results on a Wall Street Journal large-vocabulary recognition task, covering a wide range of noise types, show that SPLICE outperforms recognition under noisy matched conditions, with a 19% word error rate reduction. The new technique is also shown to consistently outperform spectral-subtraction noise reduction, and is currently being integrated into the Microsoft MiPad, a new-generation PDA prototype.
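
The abstract gives no formulas, so the following is only a minimal sketch of how an MMSE-style piecewise linear compensation could be trained from stereo (clean, noisy) feature pairs. It assumes the formulation commonly described for SPLICE: a Gaussian mixture model over the noisy features, one additive correction vector per mixture component learned as a posterior-weighted average of clean-minus-noisy differences, and an enhanced feature equal to the noisy feature plus the posterior-weighted sum of corrections. All function names, the diagonal-covariance assumption, and the toy data are hypothetical and not taken from the article.

```python
import math

def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector y."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v)
        for yi, m, v in zip(y, mean, var)
    )

def posteriors(y, weights, means, variances):
    """Posterior p(k | y) of each component k of the noisy-feature GMM."""
    logs = [math.log(w) + log_gauss_diag(y, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def train_corrections(clean, noisy, weights, means, variances):
    """Per-component correction: posterior-weighted mean of (clean - noisy)."""
    n_comp, dim = len(weights), len(noisy[0])
    num = [[0.0] * dim for _ in range(n_comp)]
    den = [0.0] * n_comp
    for x, y in zip(clean, noisy):
        post = posteriors(y, weights, means, variances)
        for k, p in enumerate(post):
            den[k] += p
            for d in range(dim):
                num[k][d] += p * (x[d] - y[d])
    return [[n / den[k] if den[k] > 0.0 else 0.0 for n in num[k]]
            for k in range(n_comp)]

def enhance(y, corrections, weights, means, variances):
    """MMSE-style estimate: y plus the posterior-weighted sum of corrections."""
    post = posteriors(y, weights, means, variances)
    return [yi + sum(p * r[d] for p, r in zip(post, corrections))
            for d, yi in enumerate(y)]

# Toy 1-D stereo data (assumed): distortion is +1 near 0 and +3 near 10.
weights = [0.5, 0.5]
means = [[0.0], [10.0]]
variances = [[1.0], [1.0]]
noisy = [[-0.5], [0.5], [9.5], [10.5]]
clean = [[-1.5], [-0.5], [6.5], [7.5]]

corrections = train_corrections(clean, noisy, weights, means, variances)
low = enhance([0.0], corrections, weights, means, variances)
high = enhance([10.0], corrections, weights, means, variances)
print(corrections, low, high)
```

Because the correction depends on which region of the noisy feature space a frame falls in, the overall mapping is piecewise linear even though each component only adds a constant offset. In a real system the GMM would be trained on noisy cepstral features rather than set by hand as in this toy example.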

Keywords:
Computer science; Speech recognition; Training (meteorology); Training set; Artificial intelligence; Pattern recognition (psychology); Computer vision

Metrics

Cited By: 124
FWCI (Field Weighted Citation Impact): 6.28
Refs: 11
Citation Normalized Percentile: 0.97

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Data Compression Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition