Recently, pre-trained models (PTMs) have been extensively applied in speaker verification (SV) and have greatly boosted system performance. However, mainstream PTMs currently concentrate on frame-level universal representations. In this paper, we propose a novel pre-training framework that jointly models speaker information, Speaker-Related HuBERT (SR-HuBERT). This framework aims to further exploit the speaker-related information inherent in universal speech representations. The proposed SR-HuBERT utilizes an unsupervised clustering algorithm based on graph structures to generate speaker pseudo-labels, and promotes the learning of segment-level speaker-related representations through a multi-task pre-training framework. Experimental results on the VoxCeleb1 test set demonstrate the effectiveness of the proposed SR-HuBERT. Even with limited fine-tuning data, SR-HuBERT outperforms other existing PTMs on SV tasks. Additionally, SR-HuBERT also performs well on the speaker-related tasks of the SUPERB benchmark.
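The abstract does not specify the graph-based clustering algorithm used to derive speaker pseudo-labels. As a purely illustrative sketch of the general idea, one simple variant builds a similarity graph over utterance-level speaker embeddings (nodes connected when cosine similarity exceeds a threshold) and assigns each connected component a pseudo-label; the function name and threshold below are hypothetical, not from the paper.

```python
import numpy as np

def graph_cluster_pseudo_labels(embeddings, threshold=0.75):
    """Assign speaker pseudo-labels via connected components of a
    cosine-similarity graph (illustrative sketch, not the paper's method)."""
    # L2-normalize so the dot product equals cosine similarity
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    adj = (x @ x.T) >= threshold  # adjacency: edge if similarity >= threshold
    n = len(x)
    labels = -np.ones(n, dtype=int)
    current = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        # Depth-first search to label one connected component
        stack = [i]
        labels[i] = current
        while stack:
            j = stack.pop()
            for k in np.nonzero(adj[j])[0]:
                if labels[k] < 0:
                    labels[k] = current
                    stack.append(k)
        current += 1
    return labels

# Two well-separated directions yield two pseudo-speaker clusters
emb = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
labels = graph_cluster_pseudo_labels(emb)
```

In a pre-training pipeline, such pseudo-labels would supply the segment-level speaker targets for the multi-task objective alongside HuBERT's frame-level targets.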
Yishuang Li, Wenhao Guan, Hukai Huang, Shiyu Miao, Qi Su, Lin Li, Qingyang Hong