JOURNAL ARTICLE

End-to-End Text-to-Speech for Minangkabau Pariaman Dialect Using Variational Autoencoder with Adversarial Learning (VITS)

Abstract

Language is a medium of human communication for conveying ideas, emotions, and information, both orally and in writing. Each language has a vocabulary and grammar shaped by its local culture. Minangkabau is one of the regional languages that enrich Indonesian, the national language. It has four main dialects: Tanah Datar, Lima Puluh Kota, Agam, and Pesisir. The Pesisir dialect in turn has several variants, including Padang Kota, Padang Luar Kota, Painan, Tapan, and Pariaman. This study applies Text-to-Speech (TTS) technology to the Pariaman dialect of Minangkabau using the Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech (VITS) method. The dialect needs to be preserved to prevent its extinction and supported through technological development that broadens its use. VITS was chosen because it is capable of producing natural, high-quality speech. The research stages comprise voice data collection and recording, VITS model training, and speech quality evaluation using the Mean Opinion Score (MOS). The model achieves a MOS of 4.72 out of 5, indicating that the generated speech closely resembles the natural utterances of native speakers. This TTS technology is expected to support the preservation and development of the Pariaman dialect of Minangkabau and to improve information accessibility for its speakers.
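
The Mean Opinion Score mentioned in the abstract is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale. A minimal sketch of that calculation follows, assuming a flat list of integer ratings; the function name `mean_opinion_score`, the normal-approximation confidence interval, and the example ratings are illustrative assumptions and are not taken from the study.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for 1-5 listener ratings.

    Hypothetical helper for illustration; the study's own evaluation
    procedure is not described in the abstract.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")

    # MOS is the plain arithmetic mean of all ratings.
    mos = statistics.mean(ratings)

    # With a single rating there is no spread to estimate.
    if len(ratings) < 2:
        return mos, 0.0

    # Normal-approximation 95% interval: 1.96 * s / sqrt(n).
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Made-up ratings for demonstration only, NOT data from the study.
example_ratings = [5, 5, 4, 5, 5, 4, 5, 5]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting an interval alongside the mean is a common choice because a MOS from a small listener panel can shift noticeably with a few ratings.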

