Listeners can perceive and use a wide array of fine-grained phonetic details, including the detailed coarticulatory influences of adjacent sounds, when perceiving speech. Details such as the anticipatory nasalization in can, for example, potentially provide the listener with a rich network of informative cues and are key to understanding listeners' ability to disambiguate speech sounds from seemingly ambiguous input. Unfortunately, these coarticulatory cues are generally missing or contradictory in the output of speech synthesis systems. Such systems work by concatenating variable-length sound units chosen from a large database of recorded speech. Units are chosen to minimize two functions: the cost of aligning a particular unit with the desired speech output (target cost) and the cost of adjoining the next sound to the most recently selected unit (join cost). Generally, these costs are calculated from features that can be automatically extracted from the acoustic speech signal. In this work, a unit selection database is created, automatically segmented, and automatically labeled with nasal and oral airflow feature vectors. These aerodynamic features are used as a proxy for articulatory information in the calculation of the target and join cost functions. Listeners' mean opinion scores are obtained on output from this system and from a baseline acoustic system for comparison.
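The target/join cost framework summarized above can be illustrated with a small sketch: unit selection is typically cast as a search over a lattice of candidate units, picking the sequence that minimizes the summed target and join costs. The code below is not the system described here; the feature representation, the Euclidean cost definitions, and all names (Unit, target_cost, join_cost, select_units) are illustrative assumptions used only to show the dynamic-programming structure.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class Unit:
    """A candidate database unit. 'features' stands in for whatever feature
    vector the system uses (acoustic, or aerodynamic nasal/oral airflow)."""
    features: Sequence[float]

def target_cost(target: Sequence[float], unit: Unit) -> float:
    # Cost of aligning a candidate unit with the desired target specification.
    return math.dist(target, unit.features)

def join_cost(prev: Unit, nxt: Unit) -> float:
    # Cost of adjoining a unit to the previously selected unit (boundary mismatch).
    return math.dist(prev.features, nxt.features)

def select_units(targets: List[Sequence[float]],
                 candidates: List[List[Unit]]) -> List[Unit]:
    """Viterbi-style dynamic program: choose one candidate per target position
    so that the total target cost plus join cost along the path is minimal."""
    n = len(targets)
    # best[i][j] = (cheapest path cost ending at candidates[i][j], backpointer)
    best = [[(target_cost(targets[0], u), -1) for u in candidates[0]]]
    for i in range(1, n):
        row = []
        for u in candidates[i]:
            tc = target_cost(targets[i], u)
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, u) + tc, k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost, back))
        best.append(row)
    # Trace back the cheapest path through the candidate lattice.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(n - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return list(reversed(path))
```

Under this sketch, swapping the acoustic feature vectors for aerodynamic (nasal/oral airflow) vectors changes only what target_cost and join_cost measure, not the search itself.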
Abhinav Sethy, Shrikanth Narayanan