DISSERTATION

Low-resource speech recognition using pre-trained speech representation models

Abstract

Difficulties in eliciting substantial spoken data from speaker populations of interest and producing the accompanying transcripts result in low-resource scenarios in which the development of robust automatic speech recognition (ASR) systems may be hindered. With the aid of a large volume of unlabeled audio data, self-supervised speech representation learning may address this limitation by learning a model-based feature extractor via a proxy task in advance, thus offering pre-trained representations transferable to the ASR task for fine-tuning. This dissertation reviews current self-supervised speech representation learning methodologies and investigates the application of wav2vec 2.0 ASR on a developing corpus named CU-MARVEL in order to provide automatic transcripts for streamlining it...

Keywords:
Computer science; Speech recognition; Extractor; Task (project management); Representation (politics); Feature learning; Artificial intelligence; Natural language processing; Feature (linguistics); Speech analytics; Labeled data; Acoustic model; Feature extraction; Speaker recognition; Speech processing; Engineering; Linguistics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence