Mispronunciation Detection Using Deep Convolutional Neural Network Features and Transfer Learning-Based Model for Arabic Phonemes

Faria Nazir; Muhammad Nadeem Majeed; Mustansar Ali Ghazanfar; Muazzam Maqsood

doi:10.1109/access.2019.2912648

JOURNAL ARTICLE

Year: 2019 Journal: IEEE Access Vol: 7 Pages: 52589-52608 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Computer-assisted language learning (CALL) systems provide an automated framework to identify mispronunciation and give useful feedback. Traditionally, handcrafted acoustic-phonetic features are used to detect mispronunciation. Within this line of research, this paper investigates the use of a deep convolutional neural network (CNN) for mispronunciation detection of Arabic phonemes. We propose two methods: a convolutional neural network features (CNN_features)-based technique and a transfer learning-based technique. In the first method, we use deep CNN features to detect mispronunciation: we extract features from different layers of the CNN (layer4 to layer7) and use them to train k-nearest neighbor (KNN), support vector machine (SVM), and neural network (NN) classifiers. In the transfer learning-based method, we train the CNN using transfer learning to detect mispronunciation. To evaluate the performance of the system, we compare the results of these methods with a baseline handcrafted features-based method for 28 Arabic phonemes. The baseline method uses the same classifiers (KNN, SVM, and NN) to detect mispronunciation. The experimental results show that the handcrafted_features, CNN_features, and transfer learning-based methods achieve accuracies of 82%, 91.7%, and 92.2%, respectively. The performance analysis shows that the transfer learning-based method outperforms the handcrafted_features and CNN_features-based methods, achieving an accuracy of 92.2%. The proposed transfer learning-based method also outperforms state-of-the-art techniques in terms of accuracy.
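The classifier stage of the CNN_features method can be illustrated with a minimal sketch. It assumes each phoneme recording has already been turned into a fixed-length feature vector. The `toy_features` helper is a hypothetical stand-in that draws random vectors in place of real CNN-layer activations, and the labels `correct` and `mispronounced`, the vector length, and `k=3` are illustrative choices, not values from the paper. Only the KNN step is shown; this is not the authors' implementation.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


def toy_features(center, n, dim, rng):
    """Hypothetical stand-in for CNN-layer activations: a Gaussian cloud."""
    return [[rng.gauss(center, 0.5) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim = 8                 # assumed feature length, far smaller than a real CNN layer

train_x = toy_features(0.0, 20, dim, rng) + toy_features(3.0, 20, dim, rng)
train_y = ["correct"] * 20 + ["mispronounced"] * 20
test_x = toy_features(0.0, 5, dim, rng) + toy_features(3.0, 5, dim, rng)
test_y = ["correct"] * 5 + ["mispronounced"] * 5

acc = accuracy(train_x, train_y, test_x, test_y)
print(f"toy accuracy: {acc:.2f}")
```

In a real pipeline, the vectors would come from a chosen CNN layer (the abstract mentions layer4 to layer7), and the same features could be fed to an SVM or NN classifier for comparison.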

Keywords: Computer science; Transfer of learning; Convolutional neural network; Artificial intelligence; Support vector machine; Speech recognition; Deep learning; Pattern recognition (psychology); Artificial neural network; Machine learning

Metrics

FWCI (Field Weighted Citation Impact): 5.09

Citation Normalized Percentile: 0.96

Topics

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Improving Mispronunciation Detection of Arabic Words for Non-Native Learners Using Deep Convolutional Neural Network Features

Arabic Phonemes Recognition Using Convolutional Neural Network

One-Class Convolutional Neural Network for Arabic Mispronunciation Detection

Real-Time Facemask Detection Using Deep Convolutional Neural Network-Based Transfer Learning

Deep Learning-Based Arrhythmia Detection Using Convolutional Neural Network