JOURNAL ARTICLE

Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions

Abstract

This article presents a method for adapting a GMM-based acoustic-articulatory inversion model, trained on a reference speaker, to another speaker. The goal is to estimate articulatory trajectories in the geometrical space of the reference speaker from the speech audio signal of another speaker. The method is developed in the context of a visual biofeedback system for pronunciation training, which gives speakers visual information about their own articulation via a 3D orofacial clone. In previous work, we proposed to use GMM-based voice conversion for speaker adaptation. Acoustic-articulatory mapping was achieved in two consecutive steps: 1) converting the spectral trajectories of the target speaker (i.e. the system user) into spectral trajectories of the reference speaker (voice conversion), and 2) estimating the most likely articulatory trajectories of the reference speaker from the converted spectral features (acoustic-articulatory inversion). In this work, we propose to combine these two steps into a single statistical mapping framework by fusing multiple regressions based on trajectory GMMs and a maximum likelihood estimation (MLE) criterion. The proposed technique is compared with two standard speaker adaptation techniques, based respectively on maximum a posteriori (MAP) adaptation and maximum likelihood linear regression (MLLR).
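The two-step mapping described above as previous work can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain frame-wise Gaussian mixture regression (the conditional mean under a joint GMM) rather than trajectory GMMs with an MLE criterion, it treats each feature as a scalar for brevity, and it does not show the fused single-framework mapping that the article proposes. The names Component, gmr and cascade, as well as all parameter values, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """One Gaussian of a joint GMM over a scalar input x and scalar output y."""
    weight: float  # mixture weight
    mu_x: float    # mean of the input feature
    mu_y: float    # mean of the output feature
    var_x: float   # variance of the input feature
    cov_xy: float  # covariance between input and output


def gmr(x, components):
    """Frame-wise estimate E[y | x] under a joint GMM (scalar case)."""
    # Log of weight_k * N(x; mu_x_k, var_x_k), kept in the log domain for stability.
    logs = [
        math.log(c.weight)
        - 0.5 * (math.log(2 * math.pi * c.var_x) + (x - c.mu_x) ** 2 / c.var_x)
        for c in components
    ]
    top = max(logs)
    resp = [math.exp(value - top) for value in logs]
    total = sum(resp)
    # Blend the per-component linear regressions by their responsibilities.
    return sum(
        r / total * (c.mu_y + c.cov_xy / c.var_x * (x - c.mu_x))
        for r, c in zip(resp, components)
    )


def cascade(x_target, conversion_gmm, inversion_gmm):
    """Step 1: target spectrum -> reference spectrum. Step 2: -> articulation."""
    x_reference = gmr(x_target, conversion_gmm)
    return gmr(x_reference, inversion_gmm)


# Toy models with made-up parameters (assumptions, not values from the article).
conversion_gmm = [Component(1.0, 0.0, 1.0, 1.0, 2.0)]
inversion_gmm = [
    Component(0.5, -1.0, 0.0, 1.0, 0.5),
    Component(0.5, 3.0, 4.0, 1.0, 0.5),
]

articulatory_estimate = cascade(1.0, conversion_gmm, inversion_gmm)
```

In this sketch the output of the first regression is fed into the second as a point estimate, so any conversion error propagates into the inversion step. That limitation of chaining two separately trained mappings is one plausible motivation for combining them in a single statistical framework, as the abstract proposes.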

Keywords:
Computer science; Speech recognition; Mixture model; Hidden Markov model; Speaker recognition; Speaker diarisation; Formant; Maximum a posteriori estimation; Pronunciation; Acoustic space; Inversion (geology); Context (archaeology); Pattern recognition (psychology); Artificial intelligence; Maximum likelihood; Vowel; Mathematics; Acoustics

Metrics

Cited by: 19
FWCI (Field-Weighted Citation Impact): 1.77
References: 22
Citation normalized percentile: 0.85

Topics

Speech and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Speech Recognition and Synthesis: Physical Sciences → Computer Science → Artificial Intelligence
Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing

Related Documents

JOURNAL ARTICLE

Speaker-Adaptive Acoustic-Articulatory Inversion Using Cascaded Gaussian Mixture Regression

Thomas Hueber, Laurent Girin, Xavier Alameda-Pineda, Gérard Bailly

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Year: 2015, Vol: 23 (12), Pages: 2246-2259
JOURNAL ARTICLE

Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping

Laurent Girin, Thomas Hueber, Xavier Alameda-Pineda

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Year: 2017, Vol: 25 (3), Pages: 662-673
JOURNAL ARTICLE

Unsupervised speaker adaptation for speaker independent acoustic to articulatory speech inversion

Ganesh Sivaraman, Vikramjit Mitra, Hosung Nam, Mark Tiede, Carol Espy-Wilson

Journal: The Journal of the Acoustical Society of America, Year: 2019, Vol: 146 (1), Pages: 316-329
JOURNAL ARTICLE

On smoothing articulatory trajectories obtained from Gaussian mixture model based acoustic-to-articulatory inversion

Prasanta Ghosh, Shrikanth Narayanan

Journal: The Journal of the Acoustical Society of America, Year: 2013, Vol: 134 (2), Pages: EL258-EL264