M.A. Kohler, W.D. Andrews, Joseph P. Campbell, J. Hernández-Cordero
This paper introduces a novel language-independent speaker-recognition system based on differences between speakers in the dynamic realization of phonetic features (i.e., pronunciation) rather than on spectral differences in voice quality. The system exploits phonetic information from six languages to perform text-independent speaker recognition. All experiments were performed on the NIST 2001 Speaker Recognition Evaluation Extended Data Task. Recognition results are provided for unigram, bigram, and trigram models. Performance for each of the three models is examined for phones from each individual language and for the final multilanguage fused system. Additional fusion experiments demonstrate that speaker recognition capability is maintained even without phonetic information in the language of the speaker.
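The abstract does not spell out how the n-gram models are scored or fused, so the following is only a minimal illustrative sketch, not the authors' implementation. It assumes that each recognizer outputs a phone sequence, that a speaker is scored by the average log-likelihood ratio of the test n-grams under a speaker model versus a background model, and that per-language scores are fused by a plain mean. The add-one smoothing, the function names, and the toy phone strings are all hypothetical.

```python
import math
from collections import Counter


def ngrams(phones, n):
    """Return the list of n-grams (as tuples) in a phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def train(sequences, n):
    """Count n-grams over a list of phone sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ngrams(seq, n))
    return counts


def prob(counts, gram, vocab_size, alpha=1.0):
    """Additively smoothed n-gram probability (smoothing is an assumption)."""
    total = sum(counts.values())
    return (counts[gram] + alpha) / (total + alpha * vocab_size)


def llr_score(test_seq, speaker_counts, background_counts, n, vocab_size):
    """Average log-likelihood ratio of the test n-grams, speaker vs. background."""
    grams = ngrams(test_seq, n)
    if not grams:
        return 0.0
    total = sum(
        math.log(prob(speaker_counts, g, vocab_size))
        - math.log(prob(background_counts, g, vocab_size))
        for g in grams
    )
    return total / len(grams)


def fuse(scores):
    """Fuse per-language scores with an unweighted mean (an assumption)."""
    return sum(scores) / len(scores)


# Toy data: a two-phone inventory, bigram models (n = 2, so 2**2 possible bigrams).
n, vocab_size = 2, 4
speaker = train([["a", "b", "a", "b", "a", "b"]], n)
background = train(
    [["a", "b", "a", "b", "a", "b"], ["a", "a", "b", "b", "a", "a"]], n
)

target = llr_score(["a", "b", "a", "b"], speaker, background, n, vocab_size)
impostor = llr_score(["a", "a", "b", "b"], speaker, background, n, vocab_size)
```

In this toy setup the test sequence that matches the speaker's alternating pattern receives a positive score and the mismatched one a negative score. In a multilanguage system of the kind described, `llr_score` would be computed once per phone recognizer and the results passed to `fuse`.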
W.D. Andrews, M.A. Kohler, Joseph P. Campbell
Tetsuo Kosaka, Tatsuya Akatsu, Masaharu Kato, Masaki Koh
Piero Cosi, Emanuela Magno Caldognetto, Franco Ferrero, M. Dugatto, K. Vagges