Speaker-Phonetic I-Vector Modeling for Text-Dependent Speaker Verification with Random Digit Strings

Shengyu Yao; Ruohua Zhou; Pengyuan Zhang

doi:10.1587/transinf.2018edp7310

JOURNAL ARTICLE

Year: 2019; Journal: IEICE Transactions on Information and Systems; Vol: E102.D (2); Pages: 346-354; Publisher: Institute of Electronics, Information and Communication Engineers

Abstract

This paper proposes a speaker-phonetic i-vector modeling method for text-dependent speaker verification with random digit strings, in which the enrollment and test utterances do not share the same phrase. The core of the proposed method is the use of digit alignment information within the i-vector framework. By utilizing forced-alignment information, verification scores for the test trials can be computed as in the fixed-phrase situation, in which the speech segments compared between the enrollment and test utterances have the same phonetic content. Specifically, utterances are segmented into digits, and a unique phonetically-constrained i-vector extractor is then applied to obtain a representation of speaker and channel variability for every digit segment. Probabilistic linear discriminant analysis (PLDA) and s-norm are subsequently used for channel compensation and score normalization, respectively. The final score is obtained by combining the digit scores, which are computed by scoring individual digit segments of the test utterance against the corresponding segments of the enrollment utterance. Experimental results on Part 3 of the Robust Speaker Recognition (RSR2015) database demonstrate that the proposed approach significantly outperforms GMM-UBM, with relative equal error rate (EER) reductions of 52.3% and 53.5% for male and female speakers, respectively.
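The digit-wise scoring flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: cosine similarity stands in for the PLDA score, the digit scores are combined by a plain average (the abstract does not state the combination rule), s-norm is taken in its common symmetric form (the mean of two z-normalized scores), and the names `cosine`, `s_norm`, `digit_trial_score`, and the dictionary layout of per-digit vectors are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity; an assumed stand-in for the PLDA score."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def s_norm(raw, enroll_cohort_scores, test_cohort_scores):
    """Symmetric normalization: mean of the z-scores of `raw` against the
    enrollment-side and test-side cohort score distributions."""
    z_e = (raw - np.mean(enroll_cohort_scores)) / np.std(enroll_cohort_scores)
    z_t = (raw - np.mean(test_cohort_scores)) / np.std(test_cohort_scores)
    return 0.5 * (z_e + z_t)


def digit_trial_score(enroll, test, cohort):
    """Score one trial digit by digit.

    enroll, test: dict mapping digit label -> vector (hypothetical per-digit
        i-vectors obtained after forced alignment).
    cohort: dict mapping digit label -> 2-D array of impostor vectors.
    Returns the average normalized score over digits shared by both
    utterances (averaging is an assumption), or None if none are shared.
    """
    scores = []
    for digit, t_vec in test.items():
        if digit not in enroll or digit not in cohort:
            continue  # only compare segments with the same phonetic content
        e_vec = enroll[digit]
        raw = cosine(e_vec, t_vec)
        e_coh = [cosine(e_vec, c) for c in cohort[digit]]
        t_coh = [cosine(t_vec, c) for c in cohort[digit]]
        scores.append(s_norm(raw, e_coh, t_coh))
    return float(np.mean(scores)) if scores else None
```

Restricting the comparison to digits present in both utterances is what brings a random-digit trial back to the fixed-phrase situation: each partial score compares segments with matching phonetic content.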

Keywords:

Speaker verification; Computer science; Speech recognition; Numerical digit; Speaker diarisation; Speaker recognition; Artificial intelligence; Natural language processing; Arithmetic; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.46

Citation Normalized Percentile: 0.69

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Text-Dependent Speaker Recognition With Random Digit Strings

Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings

Digit-dependent local i-vector for text-prompted speaker verification with random digit sequences

Speaker Vector-Based Speaker Recognition with Phonetic Modeling

JFA for speaker recognition with random digit strings