JOURNAL ARTICLE

Duration modeling for text to speech synthesis system using festival speech engine developed for Malayalam language

Abstract

This paper describes duration modeling in text-to-speech (TTS) synthesis for the Malayalam language using the open-source Festival TTS engine. A data-driven phoneme duration model based on Classification and Regression Trees (CART) is presented. A number of features are extracted for predicting phoneme durations. An objective evaluation was conducted to assess the intelligibility of the synthesized speech using the root mean squared error (RMSE) and the correlation between actual and predicted durations. The model achieved an RMSE of 0.1188 and a correlation of 0.9918.
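The two objective measures named in the abstract can be illustrated with a minimal sketch. It assumes the actual and predicted phoneme durations are available as equal-length lists of numbers and that the correlation is Pearson's; the abstract does not state the correlation type, the unit, or any normalization of the durations. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient (assumed here; the abstract does not name the type)."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


if __name__ == "__main__":
    # Hypothetical durations for illustration only, not data from the paper.
    actual_durations = [0.08, 0.12, 0.10, 0.15]
    predicted_durations = [0.09, 0.11, 0.10, 0.14]
    print("RMSE:", rmse(actual_durations, predicted_durations))
    print("Correlation:", pearson_correlation(actual_durations, predicted_durations))
```

A lower RMSE and a correlation closer to 1 both indicate that predicted durations track the reference durations more closely. The two measures are complementary: correlation is insensitive to a constant offset or scale in the predictions, while RMSE is not.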

Keywords:
Malayalam; Computer science; Speech synthesis; Duration (music); Speech recognition; Mean squared error; Intelligibility (philosophy); Language model; Correlation; Artificial intelligence; Natural language processing; Statistics; Mathematics; Acoustics

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.31
Refs: 21
Citation Normalized Percentile: 0.79

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing