Pseudo-Labeling for Massively Multilingual Speech Recognition

Loren Lugosch; Tatiana Likhomanenko; Gabriel Synnaeve; Ronan Collobert

doi:10.1109/icassp43922.2022.9746832

JOURNAL ARTICLE

Year: 2022
Journal: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pages: 7687-7691

Abstract

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning. Experiments on the labeled Common Voice and unlabeled VoxPopuli datasets show that our recipe can yield a model with better performance for many languages that also transfers well to LibriSpeech.
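The four steps of the recipe in the abstract can be sketched as control flow. This is only an illustrative outline, not the authors' implementation: the function name `pseudo_label_recipe`, the `train` and `finetune_semi_supervised` callables, and the string stand-in for audio are all hypothetical. It shows the "from scratch" variant of the final step, and it assumes the final model is trained on labeled data together with the pseudo-labels, which the abstract does not spell out.

```python
from typing import Callable, Dict, List, Tuple

Utterance = str                          # stand-in for an audio clip (assumption)
Labeled = List[Tuple[Utterance, str]]    # (audio, transcript) pairs
Model = Callable[[Utterance], str]       # maps audio to a transcript


def pseudo_label_recipe(
    labeled: Dict[str, Labeled],
    unlabeled: Dict[str, List[Utterance]],
    train: Callable[[Dict[str, Labeled]], Model],
    finetune_semi_supervised: Callable[[Model, Labeled, List[Utterance]], Model],
) -> Model:
    """Hypothetical outline of the recipe; both callables are placeholders."""
    # Step 1: train a supervised multilingual model on all labeled data.
    base = train(labeled)

    pseudo: Dict[str, Labeled] = {}
    for lang, audio in unlabeled.items():
        # Step 2: fine-tune the multilingual model on one target language
        # with semi-supervised learning.
        lang_model = finetune_semi_supervised(base, labeled.get(lang, []), audio)
        # Step 3: generate pseudo-labels for that language.
        pseudo[lang] = [(utt, lang_model(utt)) for utt in audio]

    # Step 4: train a final model using pseudo-labels for all languages
    # (here from scratch; mixing in the labeled data is an assumption).
    combined = {
        lang: labeled.get(lang, []) + pseudo.get(lang, [])
        for lang in set(labeled) | set(pseudo)
    }
    return train(combined)
```

The fine-tuning variant of the last step would start from the Step 1 model instead of calling `train` on the combined data from scratch.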

Keywords:

Computer science; Natural language processing; Speech recognition; Artificial intelligence; Language model

Metrics

FWCI (Field Weighted Citation Impact): 1.76

Citation Normalized Percentile: 0.85

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Iterative Pseudo-Labeling for Speech Recognition

Iterative pseudo-labeling methods for improving speech recognition

Unsupervised Speech Recognition via Utterance-wise Pseudo-labeling

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition