JOURNAL ARTICLE

Kurdish end-to-end speech synthesis using deep neural networks

Sabat Salih Muhamad, Hadi Veisi, Aso Mahmudi, Abdulhady Abas Abdullah, Farhad Rahimi

Year: 2024 Journal: Natural Language Processing Journal Vol: 8 Pages: 100096 Publisher: Elsevier BV

Abstract

This article introduces an end-to-end text-to-speech (TTS) system for Central Kurdish (CK, also known as Sorani), a low-resource language, and addresses the challenges of limited data availability. We compiled a dataset suitable for end-to-end TTS that contains 21 hours of CK speech from a female voice, paired with the corresponding texts. To identify the best-performing system, we used Tacotron2, an end-to-end deep neural network for speech synthesis, in three training experiments: one model was trained starting from a pre-trained English system, and two models were trained from scratch, one on the full dataset and one on an intonationally balanced dataset. We evaluated the models using Mean Opinion Score (MOS), a subjective evaluation metric. The model trained from scratch on the full CK dataset outperformed both the model trained on the intonationally balanced dataset and the model initialized from the pre-trained English model in naturalness and intelligibility, achieving a MOS of 4.78 out of 5.
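As a rough illustration of the evaluation metric named in the abstract, MOS is the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below is not the authors' evaluation code. The function name `mean_opinion_score`, the normal-approximation confidence interval, and the example ratings are all assumptions made here for illustration.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` is a sequence of listener scores on the 1-5 scale.
    The interval uses a normal approximation, which is an assumption of
    this sketch and not something stated in the abstract.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mos = statistics.fmean(ratings)
    if len(ratings) < 2:
        # A single rating has no sample standard deviation.
        return mos, 0.0
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings for illustration only; these are NOT data from the paper.
example_ratings = [5, 5, 4, 5, 5, 4, 5, 5]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean helps when comparing systems whose scores are close, since a small listener panel can produce wide intervals.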

Keywords:
End-to-end principle; Naturalness; Computer science; Mean opinion score; Artificial neural network; Speech synthesis; Scratch; Deep neural networks; Speech recognition; Intelligibility (philosophy); Metric (unit); Artificial intelligence; Language model; Pronunciation; Natural language processing; Linguistics; Engineering

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.92
Refs: 51
Citation Normalized Percentile: 0.83

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

DISSERTATION

Deep Neural Networks for End-to-End Optimized Speech Coding

Srihari Kankanahalli

University: University Libraries (University of Maryland) Year: 2017
JOURNAL ARTICLE

Towards an end-to-end speech recognizer for Portuguese using deep neural networks

Igor Quintanilha, Luiz W. P. Biscainho, Sérgio L. Netto

Journal: Anais de XXXV Simpósio Brasileiro de Telecomunicações e Processamento de Sinais Year: 2017
JOURNAL ARTICLE

End-to-End Kurdish Speech Synthesis Based on Transfer Learning

Sabat Salih Muhamad, Hadi Veisi

Journal: passer Year: 2022 Vol: 4 (2) Pages: 150-160
JOURNAL ARTICLE

End-to-End Kurdish Speech Synthesis Based on Transfer Learning

Sabat Salih Muhamad, Hadi Veisi

Journal: SSRN Electronic Journal Year: 2024