CNN-based Text-independent Automatic Speaker Identification Using Short Utterances

Mandana Fasounaki; Emirhan Burak Yüce; Serkan Oncul; Gökhan İnce

doi:10.1109/ubmk52708.2021.9559031

Year: 2021

Journal: 2021 6th International Conference on Computer Science and Engineering (UBMK)

Pages: 413-418

Abstract

With the widespread use of voice-controlled services and devices, research on robust and fast systems for automatic speaker identification has accelerated. In this paper, we present a Convolutional Neural Network (CNN) architecture for text-independent automatic speaker identification. The primary purpose is to identify a speaker, among many others, using a short speech segment. Most current research focuses on deep CNNs that were originally designed for computer vision tasks. Moreover, most existing speaker identification methods require audio samples longer than 3 seconds in the query phase to achieve high accuracy. We designed a CNN architecture suited to voice and speech-related classification tasks. The proposed model achieves 99.5% accuracy on LibriSpeech and 90% accuracy on the VoxCeleb 1 dataset using only 1-second test utterances in our experiments.

Keywords:

Computer science; Convolutional neural network; Speech recognition; Focus (optics); Identification (biology); Speaker recognition; Speaker identification; Speaker diarisation; Artificial intelligence

Metrics

FWCI (Field Weighted Citation Impact): 1.35

Citation Normalized Percentile: 0.85

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Text-independent speaker identification from short utterances based on piecewise discriminant analysis

Text-independent speaker recognition with short utterances

An End-to-End Text-Independent Speaker Identification System on Short Utterances

Text independent speaker identification using automatic acoustic segmentation

Speaker identification using utterances correspond to speaker-specific-text