JOURNAL ARTICLE

Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting

Brian Mak, Roger Hsiao, Simon Ho, James T. Kwok

Year: 2006 Journal: IEEE Transactions on Audio, Speech, and Language Processing Vol: 14 (4) Pages: 1267-1280 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Recently, we proposed an improvement to the conventional eigenvoice (EV) speaker adaptation using kernel methods. In our novel kernel eigenvoice (KEV) speaker adaptation, speaker supervectors are mapped to a kernel-induced high-dimensional feature space, where eigenvoices are computed using kernel principal component analysis. A new speaker model is then constructed as a linear combination of the leading eigenvoices in the kernel-induced feature space. KEV adaptation was shown to outperform EV, MAP, and MLLR adaptation in a TIDIGITS task with less than 10 s of adaptation speech. Nonetheless, due to many kernel evaluations, both adaptation and subsequent recognition in KEV adaptation are considerably slower than in conventional EV adaptation. In this paper, we solve the efficiency problem and eliminate all kernel evaluations involving adaptation or testing observations by finding an approximate pre-image of the implicit adapted model found by KEV adaptation in the feature space; we call our new method embedded kernel eigenvoice (eKEV) adaptation. eKEV adaptation is faster than KEV adaptation, and subsequent recognition runs as fast as normal HMM decoding. eKEV adaptation makes use of the multidimensional scaling technique so that the resulting adapted model lies in the span of a subset of carefully chosen training speakers. It is related to the reference speaker weighting (RSW) adaptation method, which is based on speaker clustering. Our experimental results on Wall Street Journal show that eKEV adaptation continues to outperform EV, MAP, MLLR, and the original RSW method. However, by adopting the way we choose the subset of reference speakers for eKEV adaptation, we may also improve RSW adaptation so that it performs as well as our eKEV adaptation.
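
As a rough illustration of two ingredients the abstract mentions, the sketch below builds a centered Gaussian kernel (Gram) matrix over speaker supervectors, which is the usual starting point for kernel principal component analysis, and forms a model as a weighted combination of reference speaker supervectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian kernel, the value of sigma, the toy three-dimensional supervectors, the fixed weights, and all function names are hypothetical. The eigen-decomposition, the pre-image computation, multidimensional scaling, and the estimation of weights from adaptation speech are all omitted.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(supervectors, sigma):
    """Kernel matrix K[i][j] = k(x_i, x_j) over all training speakers."""
    n = len(supervectors)
    return [
        [gaussian_kernel(supervectors[i], supervectors[j], sigma) for j in range(n)]
        for i in range(n)
    ]


def center_gram(K):
    """Center K in feature space: K - 1K - K1 + 1K1, where 1 = ones(n, n) / n."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]  # K is symmetric, so also the column means
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def combine(reference_supervectors, weights):
    """Weighted combination of reference speaker supervectors."""
    dim = len(reference_supervectors[0])
    return [
        sum(w * sv[d] for w, sv in zip(weights, reference_supervectors))
        for d in range(dim)
    ]


# Toy data (hypothetical): four speakers, three-dimensional supervectors.
supervectors = [
    [0.0, 1.0, 2.0],
    [0.5, 1.5, 1.0],
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]

K = gram_matrix(supervectors, sigma=1.0)
K_centered = center_gram(K)

# Fixed illustrative weights; in practice they would be estimated from adaptation speech.
adapted = combine(supervectors[:2], [0.5, 0.5])
```

Kernel PCA would next take the leading eigenvectors of `K_centered`; a model built by `combine` stays in the ordinary supervector space, so using it for recognition requires no kernel evaluations.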

Keywords:
Adaptation (eye), Weighting, Speech recognition, Computer science, Kernel (algebra), Pattern recognition (psychology), Speaker diarisation, Artificial intelligence, Feature (linguistics), Speaker recognition, Mathematics

Metrics

Cited By: 19
FWCI (Field Weighted Citation Impact): 3.14
Refs: 44
Citation Normalized Percentile: 0.92

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Kernel eigenvoice speaker adaptation

Brian Mak, James T. Kwok, Simon Ho

Journal: IEEE Transactions on Speech and Audio Processing Year: 2005 Vol: 13 (5) Pages: 984-992
BOOK-CHAPTER

Unsupervised Speaker Adaptation Using Reference Speaker Weighting

Tsz-Chung Lai, Brian Mak

Lecture notes in computer science Year: 2006 Pages: 380-389