JOURNAL ARTICLE

EVASS: Emotional Variational End-to-End Speech Synthesis with Semi-Supervised and Adversarial Learning

Mohamed Osman

Year: 2022
Journal: 2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC)
Vol: 30
Pages: 97-103

Abstract

Communicating one's inner state, including emotions and feelings, is one of the core principles of human social communication and behavior. Emotion is an important component of speech, and its inclusion in synthetic speech will allow for breakthroughs in applications like human-machine interfacing, e-book reading, and voice acting. However, modelling emotions in speech in an end-to-end manner has so far remained an under-explored topic of research. To address this, we experiment with novel methods in global emotional modelling in unsupervised, semi-supervised and adversarial contexts using an end-to-end text-to-speech (TTS) architecture. We condition the latent space, duration prediction and audio generation on novel hybrid labels based on ground truth data (14 emotion labels, 64 sentiment analysis labels, and speaker labels), which may be inferred from input text during inference. Experiments on conditional discriminators were also performed. The final proposed model produces high-quality expressive results comparable to the state of the art.
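As a rough illustration of the hybrid labels described in the abstract, the sketch below builds a single conditioning vector from an emotion label, a sentiment vector, and a speaker label. Only the sizes 14 and 64 come from the abstract. The function names, the one-hot encoding, the treatment of the 64 sentiment labels as a score vector, the `num_speakers` parameter, and the use of plain concatenation are all assumptions made for illustration; the abstract does not say how the paper combines these labels.

```python
from typing import List, Sequence

NUM_EMOTIONS = 14    # number of emotion labels stated in the abstract
NUM_SENTIMENTS = 64  # number of sentiment analysis labels stated in the abstract


def one_hot(index: int, size: int) -> List[float]:
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def hybrid_condition(emotion_id: int,
                     sentiment_scores: Sequence[float],
                     speaker_id: int,
                     num_speakers: int) -> List[float]:
    """Concatenate emotion, sentiment, and speaker information.

    Hypothetical helper: plain concatenation is an assumption, not the
    paper's confirmed method.
    """
    if len(sentiment_scores) != NUM_SENTIMENTS:
        raise ValueError(f"expected {NUM_SENTIMENTS} sentiment scores")
    return (one_hot(emotion_id, NUM_EMOTIONS)
            + list(sentiment_scores)
            + one_hot(speaker_id, num_speakers))


# Example: emotion 3, all-zero sentiment scores, speaker 1 of 4 (made-up values).
cond = hybrid_condition(3, [0.0] * NUM_SENTIMENTS, 1, num_speakers=4)
```

In a setup like this, the same vector could be passed to each conditioned component (latent space, duration predictor, audio generator), and at inference time the emotion and sentiment parts could be predicted from the input text rather than read from annotations.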

Keywords:
Computer science, End-to-end principle, Speech recognition, Inference, Component (thermodynamics), Speech synthesis, Artificial intelligence, Reading (process), Interfacing, Natural language processing, Linguistics

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.12
Refs: 57
Citation Normalized Percentile: 0.25

Topics

Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence
Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Music and Audio Processing
Physical Sciences → Computer Science → Signal Processing