JOURNAL ARTICLE

Unsupervised utterance-wise beamformer estimation with speech recognition-level criterion

Abstract

In this paper, we perform beamforming with a speech recognition-level criterion. A beamformer is usually designed by optimizing a signal-level criterion, e.g., by minimizing the beamformer output covariance or by maximizing the signal-to-noise ratio (SNR). Such signal-level criteria do not guarantee that the optimized beamformer is the best one for noise-robust automatic speech recognition. Recently, a few approaches have been proposed for performing beamforming with a speech recognition-level criterion. These approaches train the beamformer jointly with an acoustic model, using multichannel training data and a parallel corpus of noisy and clean data. This paper proposes a novel approach that estimates the beamformer for every test utterance with a speech recognition-level criterion. We use an unsupervised acoustic model adaptation scheme to optimize our beamformer. Specifically, we first obtain decoding results with an initialized beamformer, and then optimize the beamformer by backpropagation to minimize the cross-entropy between the first-pass decoding results and the actual network outputs. With this approach, the beamformer can be trained to discriminate hidden Markov model states more clearly for every test utterance. Experimental results show that our beamformer outperforms a beamformer designed with a signal-level criterion.
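The two-pass procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: the beamformer is a real-valued filter-and-sum over two channels (practical beamformers operate on complex STFT coefficients), the "acoustic model" is a frozen one-input softmax classifier with hypothetical parameters `a` and `b`, frame-wise argmax stands in for first-pass decoding, and the gradient is written out by hand instead of using a backpropagation framework.

```python
import math

def beamform(frames, w):
    """Filter-and-sum over channels: y_t = sum_m w[m] * x_t[m]."""
    return [sum(wm * xm for wm, xm in zip(w, x)) for x in frames]

def posteriors(y, a, b):
    """Toy stand-in for an acoustic model: softmax of logits a[k] * y + b[k]."""
    logits = [ak * y + bk for ak, bk in zip(a, b)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(frames, w, labels, a, b):
    """Mean cross-entropy between fixed state labels and model posteriors."""
    ys = beamform(frames, w)
    return -sum(math.log(posteriors(y, a, b)[l])
                for y, l in zip(ys, labels)) / len(frames)

def adapt_beamformer(frames, w_init, a, b, lr=0.1, steps=50):
    """Per-utterance adaptation of the beamformer weights only (a, b stay frozen)."""
    # First pass: state labels from the initial beamformer.
    # Assumption: frame-wise argmax replaces a real decoder here.
    labels = []
    for y in beamform(frames, w_init):
        p = posteriors(y, a, b)
        labels.append(p.index(max(p)))

    # Second pass: gradient descent on the cross-entropy with respect to w.
    w = list(w_init)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y, l in zip(frames, beamform(frames, w), labels):
            p = posteriors(y, a, b)
            # dL/dy = sum_k (p_k - onehot_k) * a_k  (softmax + cross-entropy)
            dy = sum((pk - (1.0 if k == l else 0.0)) * ak
                     for k, (pk, ak) in enumerate(zip(p, a)))
            for m, xm in enumerate(x):
                grad[m] += dy * xm / len(frames)  # chain rule: dy/dw_m = x_m
        w = [wm - lr * gm for wm, gm in zip(w, grad)]
    return labels, w

# Hypothetical two-channel "utterance" of six frames and a three-state model.
frames = [[1.0, 0.6], [0.8, 1.1], [-0.9, -0.7],
          [-1.2, -0.5], [0.1, -0.2], [0.9, 0.4]]
a, b = [-2.0, 0.0, 2.0], [0.0, 0.5, 0.0]
w_init = [1.0, 0.0]  # start from the first microphone only

pseudo_labels, w_adapted = adapt_beamformer(frames, w_init, a, b)
loss_before = cross_entropy(frames, w_init, pseudo_labels, a, b)
loss_after = cross_entropy(frames, w_adapted, pseudo_labels, a, b)
print(pseudo_labels, w_adapted, loss_before, loss_after)
```

Because the labels come from the model's own first pass, lowering the cross-entropy sharpens the posteriors around the states the first pass already chose. This is the sense in which the adapted beamformer discriminates states more clearly, and it also means that errors in the first-pass labels are reinforced rather than corrected.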

Keywords:
Computer science, Speech recognition, Adaptive beamformer, Beamforming, Hidden Markov model, Decoding methods, Pattern recognition (psychology), Artificial intelligence, Noise (video), Algorithm, Telecommunications

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 24
Citation Normalized Percentile: 0.07

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing