Compensation for speech recognition in degraded acoustical environments

Richard M. Stern; Pedro J. Moreno; Bhiksha Raj

doi:10.1121/1.416497

JOURNAL ARTICLE

Year: 1996; Journal: The Journal of the Acoustical Society of America; Vol: 100 (4_Supplement); Pages: 2792; Publisher: Acoustical Society of America

Abstract

The accuracy of speech recognition systems degrades when they are operated in adverse acoustical environments. This paper discusses two ways in which more detailed mathematical descriptions of the effects of environmental degradation can improve speech recognition accuracy, using both "data-driven" and "model-based" compensation strategies. Data-driven methods learn environmental characteristics through direct comparisons of speech recorded in the noisy environment with the same speech recorded under optimal conditions. Model-based methods use a mathematical model of the environment and attempt to use samples of the degraded speech to estimate model parameters. Two approaches to data-driven compensation, RATZ and STAR, are described, as well as a new approach to model-based compensation, referred to as the vector Taylor series (VTS) algorithm. Compensation algorithms are evaluated in a series of experiments measuring recognition accuracy for speech from the ARPA Wall Street Journal database that is corrupted by artificially added noise at various signal-to-noise ratios (SNRs). For any particular SNR, the greatest recognition accuracy obtained using a practical compensation algorithm is observed when that system is trained using noisy data at that SNR. The RATZ, VTS, and STAR algorithms achieve this bound at global SNRs as low as 15, 10, and 5 dB, respectively. [Work supported by ARPA.]
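The evaluation described in the abstract relies on speech corrupted by artificially added noise at chosen global SNRs. The following is a minimal sketch of one common way to build such a mixture. It is not the authors' procedure. It assumes that "global SNR" means the ratio of average speech power to average noise power over the whole utterance. The function names `mean_power`, `add_noise_at_snr`, and `measure_snr_db` are hypothetical, and the sine tone stands in for real speech samples.

```python
import math
import random


def mean_power(signal):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def add_noise_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so that speech power / noise power matches the target SNR.

    Assumption: SNR is measured globally, over the full utterance.
    Returns the noisy mixture and the scaled noise that was added.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Noise power required to reach the requested SNR.
    target_noise_power = p_speech / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(speech, noise):
    """Global SNR in dB between a clean signal and an additive noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in signals: a 220 Hz tone at 8 kHz sampling, and white Gaussian noise.
    speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
    for target in (15, 10, 5):
        noisy, scaled = add_noise_at_snr(speech, noise, target)
        print(target, round(measure_snr_db(speech, scaled), 3))
```

Scaling the noise rather than the speech keeps the clean reference unchanged across SNR conditions, which makes it straightforward to generate matched training and test sets at each SNR.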

Keywords:

Compensation (psychology); Computer science; Noise (video); Speech recognition; Degradation (telecommunications); SIGNAL (programming language); Environmental noise; Series (stratigraphy); Artificial intelligence; Pattern recognition (psychology); Acoustics; Telecommunications; Sound (geography)

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence
