JOURNAL ARTICLE

Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network

Mohammed Sidi Yakoub, Sid-Ahmed Selouani, Brahim-Fares Zaidi, Asma Bouchair

Year: 2020
Journal: EURASIP Journal on Audio, Speech, and Music Processing
Vol: 2020 (1)
Publisher: Springer Nature

Abstract

In this paper, we use empirical mode decomposition with Hurst-based mode selection (EMDH), together with a deep learning architecture based on a convolutional neural network (CNN), to improve the recognition of dysarthric speech. The EMDH speech enhancement technique serves as a preprocessing step that improves the quality of the dysarthric speech. Mel-frequency cepstral coefficients are then extracted from the EMDH-processed speech and used as input features to a CNN-based recognizer. The effectiveness of the proposed EMDH-CNN approach is demonstrated by results obtained on the Nemours corpus of dysarthric speech. Compared to two baseline systems, one using hidden Markov models with Gaussian mixture models (HMM-GMMs) and one using a CNN without an enhancement module, the EMDH-CNN system increases the overall accuracy by 20.72% and 9.95%, respectively, in a k-fold cross-validation experimental setup.
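The k-fold cross-validation setup mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state the value of k or how utterances were assigned to folds, so the function names, the contiguous fold layout, and the example numbers below are assumptions chosen only for illustration.

```python
from typing import Iterator, List, Tuple


def k_fold_splits(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Hypothetical helper: the fold layout is an assumption, not the paper's.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must satisfy 2 <= k <= n_items")
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one more item, so fold sizes
        # differ by at most one.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def mean_accuracy(fold_accuracies: List[float]) -> float:
    """Average per-fold accuracies into a single overall accuracy."""
    return sum(fold_accuracies) / len(fold_accuracies)


if __name__ == "__main__":
    # Illustrative numbers only: 10 utterances, 3 folds.
    for train_idx, test_idx in k_fold_splits(10, 3):
        print("train:", train_idx, "test:", test_idx)
```

Each utterance appears in exactly one test fold, so every item is used for both training and evaluation across the k runs, and the reported overall accuracy would be the mean over folds.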

Keywords:
Speech recognition, Computer science, Convolutional neural network, Hidden Markov model, Artificial intelligence, Mixture model, Preprocessor, Pattern recognition (psychology), Mel-frequency cepstrum, Feature extraction

Metrics

Cited By: 58
FWCI (Field Weighted Citation Impact): 5.47
Refs: 17
Citation Normalized Percentile: 0.97

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Voice and Speech Disorders
Health Sciences →  Medicine →  Physiology