JOURNAL ARTICLE

Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction

Abstract

Recently, fine-tuning pre-trained models has emerged as a promising paradigm for speech-processing tasks. In this study, we present a novel strategy for unsupervised speaker verification using a sub-structure of a pre-trained model (Sub-PTM), which consists of a CNN-based feature extractor and several Transformer blocks. To obtain the initial pseudo labels, we use Infomap to cluster the representations extracted from the Sub-PTM. The generated pseudo labels are then used to train a speaker verification model comprising a Sub-PTM and a downstream network. We also propose an Online and Offline Label Correction (OAO-LC) method to mitigate the effects of incorrect pseudo labels. With these techniques, our system achieves results competitive with the supervised baseline.
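
The pseudo-labelling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no graph-construction details, so the k-nearest-neighbour graph, the `k` and `min_sim` parameters, and the function names are all assumptions made for illustration. To keep the example dependency-free, connected components (via union-find) stand in for Infomap community detection, and short hand-written vectors stand in for Sub-PTM representations.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pseudo_labels(embeddings, k=2, min_sim=0.8):
    """Assign a cluster id to each embedding.

    Assumption: each utterance is linked to its k most similar
    neighbours when the similarity is at least min_sim, and every
    connected component of that graph becomes one pseudo speaker.
    This is a simple stand-in for Infomap, not Infomap itself.
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        # Rank all other utterances by similarity to utterance i.
        sims = sorted(
            ((cosine(embeddings[i], embeddings[j]), j)
             for j in range(n) if j != i),
            reverse=True,
        )
        for sim, j in sims[:k]:
            if sim >= min_sim:
                parent[find(i)] = find(j)  # merge the two components

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

In a full pipeline along the lines the abstract describes, the returned labels would serve as training targets for the speaker verification model and would then be revised by a label-correction step; neither stage is sketched here because the abstract does not specify them.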

Keywords:
Computer science, Artificial intelligence, Cluster analysis, Transformer, Feature extraction, Speech recognition, Pattern recognition (psychology), Feature (linguistics), Speaker verification, Extractor, Speaker recognition

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 2.30
Refs: 30
Citation Normalized Percentile: 0.87

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing