JOURNAL ARTICLE

Towards End-to-End Speech Recognition System for Pashto Language Using Transformer Model

Munazza Sher, Nasir Ahmad, Madiha Sher

Year: 2024
Journal: International Journal of Innovations in Science and Technology
Pages: 115-131

Abstract

Conventional speech recognition systems built on Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) are difficult to set up and inefficient. This paper adopts the Transformer model for Pashto continuous speech recognition, offering an End-to-End (E2E) system that maps acoustic signals directly to the label sequence, which simplifies implementation. The Transformer is chosen for its state-of-the-art capabilities, including parallelization and self-attention, and for its proficiency in handling the limited data available for Pashto. The objective is to develop an accurate Pashto speech recognition system. Using 200 hours of conversational data, the study achieves a Word Error Rate (WER) of up to 51% and a Character Error Rate (CER) of up to 29%. Fine-tuning the model's parameters and increasing the dataset size led to significant improvements. The results demonstrate the Transformer's effectiveness in limited-data scenarios and affirm its ability to recognize Pashto speech. The study establishes the Transformer as a robust choice for Pashto speech recognition, fills a gap in automatic speech recognition (ASR) research for the Pashto language, and contributes to the advancement of speech recognition technology for under-resourced languages. The findings highlight the potential for further improvement with more training data and underscore the importance of fine-tuning and dataset augmentation in improving model performance and reducing error rates.
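The abstract reports results as WER and CER but does not say how they were computed. The sketch below assumes the standard definition: the edit distance (substitutions, deletions, and insertions) between the reference and the recognized text, divided by the reference length, counted over words for WER and over characters for CER. The function names and the English example strings are illustrative only and do not come from the paper; counting spaces as characters in CER is also an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length.

    Assumption: spaces count as characters; some toolkits strip them first.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical reference and hypothesis, not data from the study.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

Because insertions are counted against the reference length, both rates can exceed 100% when the hypothesis is much longer than the reference. CER is usually lower than WER for the same output, since a word with one wrong character counts as a full word error but only one character error.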

Keywords:
Hidden Markov model, Transformer, Word error rate, Language model, Acoustic model, Training set, Adaptability, Mixture model

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.61

Topics

Geochemistry and Geologic Mapping
Physical Sciences → Computer Science → Artificial Intelligence
Geological Modeling and Analysis
Physical Sciences → Earth and Planetary Sciences → Geochemistry and Petrology
Electrical and Electromagnetic Research
Physical Sciences → Physics and Astronomy → Atomic and Molecular Physics, and Optics