Speaker-Independent Acoustic-to-Articulatory Speech Inversion

Peter Wu; Liwei Chen; Cheol Jun Cho; Shinji Watanabe; Louis Goldstein; Alan W. Black; Gopala K. Anumanchipalli

doi:10.1109/icassp49357.2023.10096796

JOURNAL ARTICLE

Year: 2023 Pages: 1-5

Abstract

To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space. The articulatory space is a promising inversion target, since this space captures the mechanics of speech production. To this end, we build an acoustic-to-articulatory inversion (AAI) model that leverages autoregression, adversarial training, and self-supervision to generalize to unseen speakers. Our approach obtains 0.784 correlation on an electromagnetic articulography (EMA) dataset, improving on the state of the art by 12.5%. Additionally, we show the interpretability of these representations by directly comparing the behavior of estimated representations with speech production behavior. Finally, we propose a resynthesis-based AAI evaluation metric that does not rely on articulatory labels, demonstrating its efficacy with an 18-speaker dataset.
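The correlation figure above is the kind of score that can be computed by comparing predicted and measured articulator trajectories. The following is a minimal sketch of one common way to do this, assuming a Pearson correlation computed per EMA channel and then averaged across channels; the abstract does not state the exact averaging protocol, and the function names `pearson` and `mean_channel_correlation` are hypothetical, not taken from the paper.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length trajectories."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("trajectories must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant trajectory")
    return cov / math.sqrt(var_x * var_y)


def mean_channel_correlation(
    pred: Sequence[Sequence[float]], target: Sequence[Sequence[float]]
) -> float:
    """Average the per-channel Pearson correlation.

    `pred` and `target` are sequences of frames; each frame holds one value
    per articulatory channel (assumed layout, e.g. x/y sensor coordinates).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    n_channels = len(target[0])
    scores = [
        pearson([frame[c] for frame in pred], [frame[c] for frame in target])
        for c in range(n_channels)
    ]
    return sum(scores) / n_channels


# Toy data: 4 frames, 2 channels. The prediction is a scaled and shifted
# copy of the target, so each channel is perfectly correlated.
target = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
pred = [[2.0 * a + 1.0, 0.5 * b - 3.0] for a, b in target]
score = mean_channel_correlation(pred, target)
```

Because Pearson correlation is invariant to positive scaling and offset, a score like this rewards matching the shape of each trajectory rather than its absolute position, which is one reason it is a natural fit for comparing articulator movements across speakers.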

Keywords:

Interpretability; Speech production; Computer science; Speech recognition; Inversion (geology); Speech processing; Metric (unit); Autoregressive model; Acoustic space; Artificial intelligence; Natural language processing; Mathematics; Acoustics

Metrics

FWCI (Field Weighted Citation Impact): 4.09

Citation Normalized Percentile: 0.93

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Phonetics and Phonology Research

Social Sciences → Psychology → Experimental and Cognitive Psychology

Related Documents

Unsupervised speaker adaptation for speaker independent acoustic to articulatory speech inversion

Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion

Autoregressive Articulatory WaveNet Flow for Speaker-Independent Acoustic-to-Articulatory Inversion

Reference speaker selection for kinematic-independent acoustic-to-articulatory-inversion

Two-Stream Joint-Training for Speaker Independent Acoustic-to-Articulatory Inversion