JOURNAL ARTICLE

Systematic review of feature-based approaches to mispronunciation detection

Abstract

Accurate pronunciation is essential for successful communication in a second language (L2) because it strongly influences communicative effectiveness and perceived fluency. Mispronunciations frequently arise from the influence of the learner’s first language (L1) and pose barriers to effective spoken communication. Pronunciation error detection (PED) has therefore emerged as a critical research area within Computer-Assisted Language Learning (CALL) and Computer-Assisted Pronunciation Training (CAPT). Although numerous PED systems have been developed over recent decades, existing survey papers have mainly compared modeling methodologies or learning paradigms and have often neglected the role of feature representation. To address this gap, this survey introduces a feature-based taxonomy that groups PED methodologies into four primary categories: Acoustic-based, Acoustic-Phonetic, Linguistic-based, and Hybrid approaches. Each category is systematically reviewed, summarizing more than two decades of research on feature extraction techniques, modeling approaches, evaluation metrics, and the nature and quality of the instructional feedback provided to learners. A detailed comparative analysis highlights significant trade-offs among these categories in detection accuracy, interpretability, resource demands, and applicability in real-time or low-resource contexts. The survey also discusses recent and emerging trends in PED research, including self-supervised learning frameworks, multimodal feature fusion, and the integration of phonological knowledge with modern deep learning architectures. By synthesizing existing knowledge and identifying gaps in current methodologies, this paper aims to provide clear insights and directions for future advances in PED systems.

Keywords:
Pronunciation, Feature (linguistics), Resource, Taxonomy, Quality, Feature extraction, Spoken language

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.89

Topics

Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence
Phonetics and Phonology Research
Social Sciences → Psychology → Experimental and Cognitive Psychology
Voice and Speech Disorders
Health Sciences → Medicine → Physiology
