JOURNAL ARTICLE

Speaker adaptation using multiple reference speakers

Abstract

We introduce a new technique for using the speech of multiple reference speakers as a basis for speaker adaptation in large vocabulary continuous speech recognition. In contrast to other methods that use a pooled reference model, this technique normalizes the training speech from multiple reference speakers to a single common feature space before pooling it. The normalized and pooled speech can then be treated as if it came from a single reference speaker for training the reference hidden Markov model (HMM). Our usual probabilistic spectrum transformation can be applied to the reference HMM to model a new (target) speaker. In this paper, we describe our baseline (single reference speaker) speaker-adaptation system and give current performance results from a recent formal evaluation of the system. We also describe our proposal for adapting from multiple reference speakers and report on recent preliminary experimental results in support of the proposed technique.
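
The normalize-then-pool idea can be illustrated with a minimal sketch. It assumes each reference speaker's speech is already available as a list of fixed-length feature frames, and it uses simple per-speaker mean and variance normalization as a stand-in for the mapping to a common feature space; the abstract does not specify that mapping, so this is not the paper's probabilistic spectrum transformation. The names `normalize_speaker`, `pool_normalized`, `ref_a`, and `ref_b` are hypothetical.

```python
from statistics import fmean, pstdev


def normalize_speaker(frames):
    """Shift and scale one speaker's frames to zero mean, unit variance per dimension.

    Assumption: this stands in for the paper's normalization to a common
    feature space, which the abstract does not describe in detail.
    """
    dims = list(zip(*frames))  # transpose: one tuple of values per feature dimension
    means = [fmean(d) for d in dims]
    # Guard against constant dimensions, where the standard deviation is zero.
    stds = [pstdev(d) or 1.0 for d in dims]
    return [[(x - m) / s for x, m, s in zip(frame, means, stds)] for frame in frames]


def pool_normalized(speakers):
    """Normalize each reference speaker separately, then pool all frames."""
    pooled = []
    for frames in speakers.values():
        pooled.extend(normalize_speaker(frames))
    return pooled


# Toy data: two reference speakers whose raw features sit in very different ranges.
speakers = {
    "ref_a": [[1.0, 10.0], [3.0, 14.0]],
    "ref_b": [[101.0, -5.0], [103.0, -1.0]],
}
pooled = pool_normalized(speakers)
```

After normalization the two speakers' frames occupy the same range, so the pooled set could be handed to a single reference-model trainer as if it came from one speaker. Pooling the raw frames instead would mix speaker offsets into the model.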

Keywords:
Computer science; Hidden Markov model; Speech recognition; Speaker diarisation; Pooling; Vocabulary; Adaptation (eye); Reference model; Speaker recognition; Artificial intelligence; Pattern recognition (psychology); Linguistics

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.38
Refs: 5
Citation Normalized Percentile: 0.72

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Time Series Analysis and Forecasting (Physical Sciences → Computer Science → Signal Processing)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)