JOURNAL ARTICLE

Improve mispronunciation detection with Tandem feature

Abstract

This paper presents a method to improve mispronunciation detection performance for low-resource acoustic models. One hour of speech data is randomly selected from CU-CHLOE to simulate a low-resource non-native English setting. The Tandem feature, derived from an articulatory-based multi-layer perceptron (MLP), is used in place of the traditional spectral feature (e.g. PLP). Further, motivated by the similar pronunciation characteristics of Mandarin and English spoken by Chinese speakers, Mandarin speech data is used to assist in training multilingual articulatory MLPs. The Tandem feature is also combined with PLP to improve performance further. With the proposed method, phone recognition correctness (CORR) improves by 3.84% and diagnosis accuracy (DA) improves by 2.25%.
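
The abstract does not say how the Tandem feature is computed or how it is combined with PLP. As a rough illustration only, a common recipe in the Tandem literature takes the log of the MLP output posteriors for each frame and appends the result to the spectral feature vector. The sketch below assumes that recipe; the function names `log_posteriors` and `combine_with_plp`, the flooring constant, and all input values are hypothetical, and the decorrelation step (typically PCA) that usually follows the log is omitted for brevity.

```python
import math


def log_posteriors(posteriors, floor=1e-10):
    """Take the log of each MLP posterior, flooring values to avoid log(0)."""
    return [[math.log(max(p, floor)) for p in frame] for frame in posteriors]


def combine_with_plp(tandem, plp):
    """Append the tandem vector to the PLP vector, frame by frame."""
    if len(tandem) != len(plp):
        raise ValueError("tandem and PLP streams must have the same number of frames")
    return [list(p) + list(t) for p, t in zip(plp, tandem)]


# Toy input (made-up values): 2 frames, 3 articulatory classes, 2 PLP coefficients.
posteriors = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
plp = [[1.0, -0.5], [0.3, 0.2]]

tandem = log_posteriors(posteriors)
combined = combine_with_plp(tandem, plp)  # each frame: PLP values, then log posteriors
```

In this toy case each combined frame has five dimensions (two PLP coefficients followed by three log posteriors); a real system would feed such vectors to the acoustic model in place of PLP alone.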

Keywords:
Computer science, Mandarin Chinese, Speech recognition, Pronunciation, Feature (linguistics), Phone, Correctness, Tandem, Artificial intelligence, Feature extraction, Pattern recognition (psychology), Engineering, Linguistics

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.76
Refs: 17
Citation Normalized Percentile: 0.78

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Phonetics and Phonology Research
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
