JOURNAL ARTICLE

A comparison of supervised and unsupervised cross-lingual speaker adaptation approaches for HMM-based speech synthesis

Abstract

The EMIME project aims to build a personalized speech-to-speech translator, so that a user's spoken input in one language is used to produce spoken output in another language that still sounds like the user's voice. This goal makes unsupervised cross-lingual speaker adaptation one of the keys to the project's success. So far, the unsupervised and cross-lingual cases have been studied separately, by means of decision tree marginalization and HMM state mapping respectively. In this paper we combine the two techniques to perform unsupervised cross-lingual speaker adaptation. The performance of eight speaker adaptation systems (supervised vs. unsupervised, intra-lingual vs. cross-lingual) is compared using objective and subjective evaluations. Experimental results show that the performance of unsupervised cross-lingual speaker adaptation is comparable to that of the supervised case in terms of spectrum adaptation in the EMIME scenario, even though the automatically obtained transcriptions have a very high phoneme error rate.
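The abstract names HMM state mapping as the cross-lingual component: each state of the output-language voice model is associated with a state of the input-language model, typically by minimum Kullback-Leibler divergence (KLD) between the state output distributions. The sketch below illustrates that idea for states modeled as single diagonal-Gaussian distributions; the function names, data layout, and toy values are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def gauss_kld(mean_p, var_p, mean_q, var_q):
    """KL divergence KL(p || q) between two diagonal Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )

def build_state_mapping(states_out, states_in):
    """Map each output-language HMM state to the input-language state
    with the smallest symmetric KLD between their Gaussians."""
    mapping = {}
    for name_o, (m_o, v_o) in states_out.items():
        best_name, best_d = None, np.inf
        for name_i, (m_i, v_i) in states_in.items():
            d = (gauss_kld(m_o, v_o, m_i, v_i)
                 + gauss_kld(m_i, v_i, m_o, v_o))
            if d < best_d:
                best_name, best_d = name_i, d
        mapping[name_o] = best_name
    return mapping

# Toy example: two states per language, 2-dimensional features.
states_out = {
    "out_s1": (np.zeros(2), np.ones(2)),
    "out_s2": (np.full(2, 5.0), np.ones(2)),
}
states_in = {
    "in_t1": (np.full(2, 4.8), np.ones(2)),
    "in_t2": (np.full(2, 0.1), np.ones(2)),
}
mapping = build_state_mapping(states_out, states_in)
print(mapping)  # → {'out_s1': 'in_t2', 'out_s2': 'in_t1'}
```

Once such a mapping exists, adaptation transforms estimated on the input-language states can be applied to the mapped output-language states, which is what allows a voice built in one language to be adapted with speech from another.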

Keywords:
Hidden Markov model, Speech recognition, Speaker adaptation, Speaker diarisation, Artificial intelligence, Natural language processing, Word error rate, Decision tree, Unsupervised learning, Speaker recognition

Metrics

Cited by: 30
FWCI (Field-Weighted Citation Impact): 7.21
References: 8
Citation Normalized Percentile: 0.97 (in top 1% and top 10%)

Topics

Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence
Speech and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Speech and dialogue systems
Physical Sciences → Computer Science → Artificial Intelligence