Mel-frequency cepstral coefficients (MFCCs) are popular features extracted from speech data for speaker identification. The speech signal is divided into frames, and the MFCC features extracted from the frames show some temporal redundancy, which forms the basis of the fuzzy classifier proposed in this paper. We propose a fuzzy nearest neighbor classifier that defines a frame prototype for each training audio sample as a weighted mean of its frames, with probability values as the weights; the class label of each test sample is then decided from fuzzy membership functions involving the frame prototypes. Classification results on audio samples from the VidTIMIT database show that the proposed classifier outperforms the nearest neighbor classifier, Gaussian mixture models (GMM), hidden Markov models (HMM), and multilayer perceptron (MLP) neural networks. The execution time of the fuzzy classifier is a very small fraction of the time taken by the HMM and neural network classifiers, and the training database is significantly smaller because frame prototypes are stored instead of the actual frames.
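The prototype-and-membership idea can be illustrated with a minimal sketch. The abstract does not give the exact weighting or membership formulas, so everything specific here is an assumption: the probability weights are taken from a diagonal Gaussian fitted to each sample's own frames, and the membership function follows the standard inverse-distance fuzzy nearest neighbor form with fuzzifier `m`. The function names `frame_prototype`, `fuzzy_memberships`, and `classify` are hypothetical and are not taken from the paper.

```python
import numpy as np

def frame_prototype(frames):
    """Collapse (n_frames, n_coeffs) MFCC frames into one weighted-mean prototype."""
    frames = np.asarray(frames, dtype=float)
    mu = frames.mean(axis=0)
    var = frames.var(axis=0) + 1e-8  # small floor avoids division by zero
    # ASSUMPTION: weight each frame by its (unnormalised) diagonal-Gaussian
    # density under the sample's own frame distribution.
    log_p = -0.5 * np.sum((frames - mu) ** 2 / var, axis=1)
    w = np.exp(log_p - log_p.max())  # subtract the max for numerical stability
    w /= w.sum()                     # weights now sum to 1, like probabilities
    return w @ frames                # weighted mean, shape (n_coeffs,)

def fuzzy_memberships(query, prototypes, labels, m=2.0):
    """Inverse-distance fuzzy memberships of `query` in each class (ASSUMED form)."""
    prototypes = np.asarray(prototypes, dtype=float)
    labels = np.asarray(labels)
    d = np.linalg.norm(prototypes - query, axis=1)
    inv = 1.0 / np.maximum(d, 1e-12) ** (2.0 / (m - 1.0))
    classes = np.unique(labels)
    scores = np.array([inv[labels == c].sum() for c in classes])
    return classes, scores / scores.sum()  # memberships sum to 1

def classify(test_frames, prototypes, labels, m=2.0):
    """Label a test sample by the class with the highest fuzzy membership."""
    query = frame_prototype(test_frames)
    classes, u = fuzzy_memberships(query, prototypes, labels, m)
    class_list = classes.tolist()
    return class_list[int(np.argmax(u))], dict(zip(class_list, u.tolist()))
```

In this sketch, each training sample contributes a single prototype vector rather than all of its frames, which is the source of the reduction in training-set size, and classification only requires distances to those prototypes.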
Samet Memiş, Serdar Enginoğlu, Uğur Erkan
Bakhtiar Affendi Rosdi, Haryati Jaafar, Dzati Athiar Ramli
Yonglei Zhou, Changshui Zhang, Jingchun Wang
Muhammad Arif, Muhammad Usman Akram, Fayyaz Minhas