JOURNAL ARTICLE

Mispronunciation Detection Using Deep Convolutional Neural Network Features and Transfer Learning-Based Model for Arabic Phonemes

Faria Nazir, Muhammad Nadeem Majeed, Mustansar Ali Ghazanfar, Muazzam Maqsood

Year: 2019; Journal: IEEE Access; Vol: 7; Pages: 52589-52608; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Computer-assisted language learning (CALL) systems provide an automated framework to identify mispronunciation and give useful feedback. Traditionally, handcrafted acoustic-phonetic features are used to detect mispronunciation. Building on this line of research, this paper investigates the use of deep convolutional neural networks (CNNs) for mispronunciation detection of Arabic phonemes. We propose two methods: a convolutional neural network features (CNN_Features)-based technique and a transfer learning-based technique. In the first method, we extract deep features from different layers of the CNN (layer4 to layer7) and use them to train k-nearest neighbor (KNN), support vector machine (SVM), and neural network (NN) classifiers. In the transfer learning-based method, we train the CNN using transfer learning to detect mispronunciation. To evaluate the system, we compare both methods against a baseline handcrafted features-based method on 28 Arabic phonemes; the baseline uses the same classifiers (KNN, SVM, and NN). The experimental results show that the handcrafted_features, CNN_Features, and transfer learning-based methods achieve accuracies of 82%, 91.7%, and 92.2%, respectively. The transfer learning-based method thus outperforms both the handcrafted_features and CNN_Features-based methods, and it also outperforms state-of-the-art techniques in terms of accuracy.
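
The CNN_Features idea described in the abstract (fixed-length deep feature vectors fed to a conventional classifier such as KNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors below are made-up stand-ins for CNN layer activations, and the function name knn_predict, the value k=3, and the Euclidean distance are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_features, train_labels)
    ]
    # Keep the k closest neighbours and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 3-dimensional "deep features" (real CNN layers yield far
# larger vectors); the values are invented and do not come from the paper.
train_features = [
    [0.90, 0.10, 0.80], [0.80, 0.20, 0.90], [0.85, 0.15, 0.70],  # correct
    [0.10, 0.90, 0.20], [0.20, 0.80, 0.10], [0.15, 0.70, 0.30],  # mispronounced
]
train_labels = ["correct"] * 3 + ["mispronounced"] * 3

# Classify one unseen utterance represented by its feature vector.
print(knn_predict(train_features, train_labels, [0.80, 0.20, 0.80]))
```

In a real pipeline the toy vectors would be replaced by activations taken from a chosen CNN layer, and the same feature matrix could be passed to an SVM or NN classifier for comparison.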

Keywords:
Computer science, Transfer of learning, Convolutional neural network, Artificial intelligence, Support vector machine, Speech recognition, Deep learning, Pattern recognition (psychology), Artificial neural network, Machine learning

Metrics

Cited By: 49
FWCI (Field Weighted Citation Impact): 5.09
Refs: 57
Citation Normalized Percentile: 0.96

Topics

Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing