Text-to-audio-visual speech synthesis based on parameter generation from HMM

Bogdan Sabac; Inge Gavăt; Takashi Masuko; Takao Kobayashi

doi:10.21437/eurospeech.1999-247

JOURNAL ARTICLE

Year: 1999 Pages: 959-962

Abstract

We present a new self-organizing neural network that performs unsupervised learning and can be used for vector quantization. Its main advantage over existing approaches, e.g., the Kohonen feature map, is that the model automatically finds a suitable network structure and size. This is achieved through a controlled growth process that also includes occasional removal of units. The algorithm is evaluated on a database of 25 speakers, each recorded in 12 different sessions. The overall performance was 99.5%; that is, in 99.5% of the trials the true speaker was correctly accepted and the impostor correctly rejected.

Keywords:

Computer science; Vector quantization; Learning vector quantization; Self-organizing map; Artificial neural network; Quantization (signal processing); Artificial intelligence; Feature (linguistics); Speech recognition; Pattern recognition (psychology); Process (computing); Speaker verification; Feature vector; Speaker recognition; Unsupervised learning; Computer vision

Metrics

FWCI (Field Weighted Citation Impact): 0.36

Citation Normalized Percentile: 0.53

Topics

Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis: Physical Sciences → Computer Science → Artificial Intelligence

Neural Networks and Applications: Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Text-to-visual speech synthesis based on parameter generation from HMM

HMM-based text-to-audio-visual speech synthesis

Coarticulation method for audio-visual text-to-speech synthesis

Emotion Audio-Visual Text-To-Speech

Humanoid Audio–Visual Avatar With Emotive Text-to-Speech Synthesis