JOURNAL ARTICLE

Text-Dependent Speaker Recognition With Random Digit Strings

Themos Stafylakis, Md. Jahangir Alam, Patrick Kenny

Year: 2016
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Vol: 24 (7)
Pages: 1194-1203
Publisher: Institute of Electrical and Electronics Engineers

Abstract

In this paper, we explore joint factor analysis (JFA) for text-dependent speaker recognition with random digit strings. The core of the proposed method is a JFA model from which we extract features. These features can represent either whole utterances or individual digits, and are fed into a trainable backend to estimate likelihood ratios. Within this framework, several extensions are proposed. The first is a logistic regression method for combining log-likelihood ratios that correspond to individual mixture components. The second is the extraction of phonetically aware Baum-Welch statistics, using forced alignment instead of the typical posterior probabilities derived from the universal background model. We also explore a digit-string-dependent way to apply score normalization, which yields a notable improvement over the standard approach. By fusing six JFA features, we attained equal error rates of 2.01% and 3.19% for male and female speakers, respectively, on the challenging RSR2015 (part III) dataset.
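The logistic regression combination of per-component log-likelihood ratios and the equal error rate metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scores are synthetic Gaussian draws, the number of components is arbitrary, and the helper names (`train_fusion`, `fuse`, `equal_error_rate`) are hypothetical. It only shows the general idea of learning one weight per component score and measuring the EER of the fused score.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fuse(w, b, x):
    # Fused score: weighted sum of per-component LLRs plus a bias.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_fusion(llrs, labels, lr=0.1, epochs=300):
    """Fit logistic regression weights by batch gradient descent (assumed setup)."""
    dim, n = len(llrs[0]), len(llrs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(llrs, labels):
            err = sigmoid(fuse(w, b, x)) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: sweep thresholds, return the point where the
    false-accept and false-reject rates are closest."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(target_scores + nontarget_scores):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Synthetic trials (an assumption for illustration, not RSR2015 data):
# each trial has one noisy LLR per mixture component.
rng = random.Random(0)
NUM_COMPONENTS = 4
labels = [1] * 200 + [0] * 200
trials = [[rng.gauss(1.0 if y else -1.0, 2.0) for _ in range(NUM_COMPONENTS)]
          for y in labels]

w, b = train_fusion(trials, labels)
fused = [fuse(w, b, x) for x in trials]

eer_fused = equal_error_rate(
    [s for s, y in zip(fused, labels) if y == 1],
    [s for s, y in zip(fused, labels) if y == 0],
)
eer_single = equal_error_rate(
    [x[0] for x, y in zip(trials, labels) if y == 1],
    [x[0] for x, y in zip(trials, labels) if y == 0],
)
print(f"single-component EER: {eer_single:.3f}, fused EER: {eer_fused:.3f}")
```

In this toy setting the fused score separates the two classes better than any single component, because the weighted sum averages out the independent noise in each component score. For brevity the EER is measured on the training trials; a real evaluation would use held-out trials.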

Keywords:
Normalization (sociology); Speech recognition; Computer science; String (physics); Pattern recognition (psychology); Logistic regression; Numerical digit; Speaker recognition; Random forest; Digit recognition; Artificial intelligence; Statistics; Mathematics; Machine learning; Arithmetic; Artificial neural network

Metrics

Cited By: 27
FWCI (Field Weighted Citation Impact): 4.51
Refs: 34
Citation Normalized Percentile: 0.97

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing