Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation

Hermann, Enno; Magimai.-Doss, Mathew

doi:10.5281/zenodo.8092572

JOURNAL ARTICLE

Year: 2023

Journal: Zenodo (CERN European Organization for Nuclear Research)

Publisher: European Organization for Nuclear Research

Abstract

Speakers with dysarthria could particularly benefit from assistive speech technology, but are underserved by current automatic speech recognition (ASR) systems. The distinctive characteristics of dysarthric speech pose challenges, while recording large amounts of training data can be exhausting for patients. In this paper, we synthesise dysarthric speech with a FastSpeech 2-based multi-speaker text-to-speech (TTS) system for ASR data augmentation. We evaluate its few-shot capability by generating dysarthric speech with as few as 5 words from an unseen target speaker and then using it to train speaker-dependent ASR systems. The results indicate that, while the TTS output is not yet of sufficient quality, this approach could allow easy development of personalised acoustic models for new dysarthric speakers and domains in the future.

Keywords:

Dysarthria; Training set; Speech processing; Articulation (sociology); Phonetics; Voice activity detection

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.39

Topics

Voice and Speech Disorders

Health Sciences → Medicine → Physiology

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Phonocardiography and Auscultation Techniques

Health Sciences → Medicine → Pulmonary and Respiratory Medicine

Related Documents

Training Data Augmentation for Dysarthric Automatic Speech Recognition by Text-to-Dysarthric-Speech Synthesis

Data Augmentation for Dysarthric Speech Recognition Based on Text-to-Speech Synthesis

Data Augmentation Using Healthy Speech for Dysarthric Speech Recognition

Pathology-Aware Speech Encoding and Data Augmentation for Dysarthric Speech Recognition