Text-Dependent Speaker Recognition With Random Digit Strings

Themos Stafylakis; Md. Jahangir Alam; Patrick Kenny

doi:10.1109/taslp.2016.2546458

JOURNAL ARTICLE

Year: 2016

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Vol: 24 (7)

Pages: 1194-1203

Publisher: Institute of Electrical and Electronics Engineers

Abstract

In this paper, we explore joint factor analysis (JFA) for text-dependent speaker recognition with random digit strings. The core of the proposed method is a JFA model by which we extract features. These features can either represent overall utterances or individual digits, and are fed into a trainable backend to estimate likelihood ratios. Within this framework, several extensions are proposed. First is a logistic regression method for combining log-likelihood ratios that correspond to individual mixture components. Second is the extraction of phonetically aware Baum-Welch statistics, by using forced alignment instead of the typical posterior probabilities that are derived by the universal background model. We also explore a digit-string-dependent way to apply score normalization that exhibits a notable improvement compared to the standard one. By fusing six JFA features, we attained 2.01% and 3.19% equal error rates on male and female, respectively, on the challenging RSR2015 (part III) dataset.
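
As a rough illustration of two ideas named in the abstract, the sketch below fuses several per-component log-likelihood ratios with logistic regression and then measures the equal error rate (EER) of the fused scores. It is not the authors' implementation: the function names `train_fusion` and `equal_error_rate`, the plain gradient-descent optimizer, the learning rate, and the synthetic Gaussian scores are all assumptions made for the example, and the fusion is trained and evaluated on the same trials purely for brevity.

```python
import numpy as np

def train_fusion(llrs, labels, lr=0.1, epochs=500):
    """Fit weights w and bias b so that sigmoid(llrs @ w + b) predicts target trials.

    llrs:   (n_trials, n_components) array of per-component log-likelihood ratios
    labels: (n_trials,) array with 1 for target trials and 0 for non-target trials
    Hypothetical helper: plain batch gradient descent on the logistic loss.
    """
    n_trials, n_components = llrs.shape
    w = np.zeros(n_components)
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(llrs @ w + b)))  # predicted target probability
        err = p - labels                           # gradient of the loss w.r.t. the logit
        w -= lr * (llrs.T @ err) / n_trials
        b -= lr * err.mean()
    return w, b

def equal_error_rate(target_scores, nontarget_scores):
    """Return the error rate at the threshold where false rejections and
    false acceptances are closest to equal."""
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    i = np.argmin(np.abs(frr - far))
    return (frr[i] + far[i]) / 2.0

# Synthetic scores (assumption): 4 noisy components, 500 trials per class.
rng = np.random.default_rng(0)
n, k = 500, 4
target = rng.normal(1.0, 2.0, size=(n, k))
nontarget = rng.normal(-1.0, 2.0, size=(n, k))
llrs = np.vstack([target, nontarget])
labels = np.concatenate([np.ones(n), np.zeros(n)])

w, b = train_fusion(llrs, labels)
fused = llrs @ w + b                       # one fused score per trial
fused_eer = equal_error_rate(fused[:n], fused[n:])
single_eer = equal_error_rate(llrs[:n, 0], llrs[n:, 0])
print(f"single-component EER: {single_eer:.3f}, fused EER: {fused_eer:.3f}")
```

On data like this, where every component carries independent evidence, the fused score should separate the two classes better than any single component. A real system would fit the fusion weights on a held-out development set.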

Keywords:

Normalization (sociology); Speech recognition; Computer science; String (physics); Pattern recognition (psychology); Logistic regression; Numerical digit; Speaker recognition; Random forest; Digit recognition; Artificial intelligence; Statistics; Mathematics; Machine learning; Arithmetic; Artificial neural network

Metrics

FWCI (Field Weighted Citation Impact): 4.51

Citation Normalized Percentile: 0.97

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

JFA for speaker recognition with random digit strings

Speaker-Phonetic I-Vector Modeling for Text-Dependent Speaker Verification with Random Digit Strings

Adversarially Learned Total Variability Embedding for Speaker Recognition with Random Digit Strings

Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors

Digit-dependent local i-vector for text-prompted speaker verification with random digit sequences