Li Deng, Alex Acero, Lì Jiāng, Jasha Droppo, Xuedong Huang
We describe SPLICE (Stereo-based Piecewise Linear Compensation for Environments), a novel technique for high-performance robust speech recognition. It is an efficient noise reduction and channel distortion compensation technique that makes effective use of stereo training data. We present a new version of SPLICE based on the minimum-mean-square-error (MMSE) decision, and describe an extension in which clusters of hidden Markov models (HMMs) are trained with SPLICE processing. Comprehensive results on a Wall Street Journal large-vocabulary recognition task, covering a wide range of noise types, show that SPLICE performs better than recognition under noisy matched conditions (19% word error rate reduction). The new technique is also shown to consistently outperform spectral-subtraction noise reduction, and is currently being integrated into the Microsoft MiPad, a new-generation PDA prototype.
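The abstract does not spell out the estimator, so the following is only a minimal sketch of how an MMSE-style, stereo-trained piecewise linear compensation could look. It assumes a bias-only correction per mixture component, a diagonal-covariance Gaussian mixture over the noisy features that is already trained, and frame-aligned clean/noisy (stereo) feature pairs. All function and parameter names (`posteriors`, `train_corrections`, `enhance_mmse`, `means`, `variances`, `weights`) are hypothetical and not taken from the paper.

```python
import numpy as np

def posteriors(y, means, variances, weights):
    """Return p(k | y) under a diagonal-covariance Gaussian mixture (assumed given)."""
    # y: (N, D) noisy frames; means, variances: (K, D); weights: (K,)
    diff = y[:, None, :] - means[None, :, :]                      # (N, K, D)
    log_lik = -0.5 * np.sum(diff**2 / variances
                            + np.log(2.0 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_lik                          # (N, K)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def train_corrections(clean, noisy, means, variances, weights):
    """Estimate one correction vector per component from stereo (clean, noisy) pairs."""
    post = posteriors(noisy, means, variances, weights)           # (N, K)
    num = post.T @ (clean - noisy)                                # (K, D) weighted sum of offsets
    den = post.sum(axis=0)[:, None]                               # (K, 1) soft counts
    return num / np.maximum(den, 1e-12)

def enhance_mmse(noisy, corrections, means, variances, weights):
    """Soft estimate: noisy frame plus the posterior-weighted sum of corrections."""
    post = posteriors(noisy, means, variances, weights)
    return noisy + post @ corrections
```

Under these assumptions, each noisy frame is shifted by a posterior-weighted average of the per-component corrections rather than by the correction of the single most likely component, which is one way to read "minimum-mean-square-error decision".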
Armin Sehr, Christian Hofmann, Roland Maas, Walter Kellermann
Martin Schafföner, Sven E. Krüger, Edin Andelic, Marcel Katz, Andreas Wendemuth
Xiaodong Cui, Mohamed Afify, Yuqing Gao
Luís Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega, Óscar Saz
Mohamed Afify, Xiaodong Cui, Yuqing Gao