Multi-Task Based Mispronunciation Detection of Children Speech Using Multi-Lingual Information

Linxuan Wei; Wenwei Dong; Binghuai Lin; Jinsong Zhang

doi:10.1109/apsipaasc47483.2019.9023351


JOURNAL ARTICLE


Year: 2019



Abstract

In developing a Computer-Aided Pronunciation Training (CAPT) system for Chinese children learning English as a Second Language (ESL), we suffered from insufficient task-specific data. To address this issue, we propose to utilize first-language (L1) and second-language (L2) knowledge from both adult and children's data through multi-task-based transfer learning, following the Speech Learning Model (SLM). The experimental setup uses TDNN acoustic modelling with the following training data: 70 hours of English speech by American Children (AC), 100 hours by American Adults (AA), 5 hours of Chinese speech by Chinese Children (CC), and 89 hours by Chinese Adults (CA). The test data comprise 2 hours of ESL speech by Chinese children. Experimental results showed that including the AA data brought about a 13% relative reduction in Detection Error Rate (DER) compared with using AC data only. Further including the CC and CA data through L1 transfer learning brought about a total relative DER improvement of 21%. These results suggest that the proposed method is effective in mitigating the insufficient-data problem.
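The relative figures in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the abstract reports no absolute DER values, so the AC-only DER used here (`der_ac = 0.30`) is a hypothetical placeholder, and the code assumes that both the 13% and the 21% figures are measured against the same AC-only baseline. The function and variable names are not from the paper.

```python
def relative_reduction(baseline: float, system: float) -> float:
    """Relative error reduction of `system` with respect to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - system) / baseline


# Training hours as stated in the abstract.
TRAIN_HOURS = {"AC": 70, "AA": 100, "CC": 5, "CA": 89}

# Hypothetical AC-only DER, chosen only to make the arithmetic concrete.
der_ac = 0.30
# Assumption: both relative figures use the AC-only system as the baseline.
der_ac_aa = der_ac * (1 - 0.13)  # after adding American adult data
der_all = der_ac * (1 - 0.21)    # after also adding CC and CA data

total_hours = sum(TRAIN_HOURS.values())
child_hours = TRAIN_HOURS["AC"] + TRAIN_HOURS["CC"]
# Additional gain of the L1 (Chinese) data, measured against the AC+AA system.
extra_gain = relative_reduction(der_ac_aa, der_all)

print(f"total training hours: {total_hours}")
print(f"children's share of training data: {child_hours / total_hours:.1%}")
print(f"gain of adding CC+CA over AC+AA: {extra_gain:.1%}")
```

Because the placeholder DER cancels out, the last figure depends only on the two reported percentages. Under the shared-baseline assumption, adding the Chinese L1 data corresponds to roughly a 9% relative reduction over the AC+AA system. This number is derived here and is not reported in the abstract.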

Keywords:

Pronunciation; Computer science; Task (project management); Speech recognition; Transfer of learning; Training set; Set (abstract data type); Task analysis; Speech processing; Artificial intelligence; Natural language processing; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 0.15

Citation Normalized Percentile: 0.61

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Phonetics and Phonology Research

Social Sciences → Psychology → Experimental and Cognitive Psychology


Related Documents

Multi-Lingual Multi-Task Speech Emotion Recognition Using wav2vec 2.0

Multi-Task Learning for Mispronunciation Detection on Singapore Children’s Mandarin Speech

Multi-View Multi-Task Representation Learning for Mispronunciation Detection

Phonological Feature Based Mispronunciation Detection and Diagnosis Using Multi-Task DNNs and Active Learning

Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning