JOURNAL ARTICLE

Compensation for speech recognition in degraded acoustical environments

Richard M. Stern, Pedro J. Moreno, Bhiksha Raj

Year: 1996
Journal: The Journal of the Acoustical Society of America
Vol: 100 (4_Supplement)
Pages: 2792
Publisher: Acoustical Society of America

Abstract

The accuracy of speech recognition systems degrades when they are operated in adverse acoustical environments. This paper discusses two ways in which more detailed mathematical descriptions of the effects of environmental degradation can improve speech recognition accuracy, using both "data-driven" and "model-based" compensation strategies. Data-driven methods learn environmental characteristics through direct comparisons of speech recorded in the noisy environment with the same speech recorded under optimal conditions. Model-based methods use a mathematical model of the environment and attempt to use samples of the degraded speech to estimate model parameters. Two approaches to data-driven compensation, RATZ and STAR, are described, as well as a new approach to model-based compensation, referred to as the vector Taylor series (VTS) algorithm. Compensation algorithms are evaluated in a series of experiments measuring recognition accuracy for speech from the ARPA Wall Street Journal database that is corrupted by artificially added noise at various signal-to-noise ratios (SNRs). For any particular SNR, the greatest recognition accuracy obtained using a practical compensation algorithm is observed when that system is trained using noisy data at that SNR. The RATZ, VTS, and STAR algorithms achieve this bound at global SNRs as low as 15, 10, and 5 dB, respectively. [Work supported by ARPA.]
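The evaluation described above relies on corrupting clean speech with artificially added noise at a chosen SNR. The abstract does not give the exact procedure, so the following is only a minimal sketch under stated assumptions: "global SNR" is taken to mean the ratio of average speech power to average noise power over the whole utterance, and the function names (`mean_power`, `snr_db`, `add_noise_at_snr`) and the test signals are hypothetical, chosen for illustration.

```python
import math
import random


def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """Global SNR in dB, assumed here to be 10 * log10(speech power / noise power)."""
    return 10.0 * math.log10(mean_power(speech) / mean_power(noise))


def add_noise_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the global SNR equals `target_snr_db`, then add it.

    Returns (noisy, scaled_noise). Both inputs must have the same length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # Required noise power follows from SNR = 10 * log10(Ps / Pn).
    target_noise_power = mean_power(speech) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in signals (assumptions): a 440 Hz tone at 16 kHz in place of
    # speech, and white Gaussian noise in place of the unspecified noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for target in (15, 10, 5):
        _, scaled = add_noise_at_snr(speech, noise, target)
        print(target, round(snr_db(speech, scaled), 6))
```

Scaling the noise rather than the speech keeps the clean signal identical across SNR conditions, which makes it easier to compare recognition accuracy between conditions. Whether the original experiments did this is not stated in the abstract.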

Keywords:
Compensation (psychology), Computer science, Noise (video), Speech recognition, Degradation (telecommunications), SIGNAL (programming language), Environmental noise, Series (stratigraphy), Artificial intelligence, Pattern recognition (psychology), Acoustics, Telecommunications, Sound (geography)

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.14

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence