JOURNAL ARTICLE

Feature Extraction Techniques for Deep Learning based Speech Classification

Abstract

Audio classification is the task of assigning audio data to one of several classes or categories. Speaker recognition, one application of audio classification, aims to identify a person from the characteristics of their speech; the broader phrase "voice recognition" covers both speaker recognition and speech recognition. Speaker verification systems have recently grown popular for a variety of uses, such as security measures and personalized assistance: a computer trained to recognize individual voices can quickly transcribe speech or confirm a speaker's identity as part of a security procedure. Speaker recognition draws on four decades of research and rests on acoustic characteristics of speech that differ from person to person. Some systems match the voice of a person seeking entry against a database, much as fingerprint sensors match input fingerprint patterns or photographic attendance systems map inputs to a database. Personal assistants such as Google Home, for example, are designed to restrict access to authorized users, and these systems must identify or verify the speaker correctly even under difficult conditions. This research proposes a robust deep learning-based speaker recognition solution for audio classification. We augment the data using four key noise aberration strategies to improve the system's performance, and we conduct a comparative study of the effectiveness of several audio feature extractors. The objective is a speaker identification system that is highly accurate and applicable in practical situations.
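The abstract refers to four noise aberration strategies for self-augmenting the training data but does not enumerate them on this page. As a hedged illustration only, a minimal NumPy sketch of four common waveform perturbations (white-noise injection, time shift, random gain, and speed change — these are assumptions, not the paper's confirmed choices) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_white_noise(y, snr_db=20.0):
    """Additive Gaussian noise at a target signal-to-noise ratio (dB)."""
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

def time_shift(y, max_frac=0.1):
    """Circularly shift the waveform by up to max_frac of its length."""
    limit = int(len(y) * max_frac)
    return np.roll(y, rng.integers(-limit, limit + 1))

def random_gain(y, low_db=-6.0, high_db=6.0):
    """Scale the amplitude by a random gain drawn in dB."""
    gain_db = rng.uniform(low_db, high_db)
    return y * (10 ** (gain_db / 20))

def speed_change(y, rate=1.1):
    """Resample by linear interpolation to speed playback up or down."""
    idx = np.arange(0, len(y), rate)
    return np.interp(idx, np.arange(len(y)), y)

# Example: augment a 1-second 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
augmented = [add_white_noise(tone), time_shift(tone),
             random_gain(tone), speed_change(tone, rate=0.9)]
```

Each perturbation yields a new training example from the same labeled utterance, which is what "self-augmenting" the data amounts to in practice.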

Keywords:
Computer science, Speech recognition, Speaker recognition, Audio mining, Speaker diarisation, Categorization, Feature (linguistics), Identification (biology), Feature extraction, Process (computing), Fingerprint (computing), Artificial intelligence, Speech processing, Acoustic model
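The abstract compares several audio feature extractors without naming them on this page. As one common choice in speech pipelines, a minimal log-mel filterbank extractor can be written in plain NumPy; all parameters below (FFT size, hop, number of mel bands) are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters mapping an FFT power spectrum to n_mels bands."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):      # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def log_mel_features(y, sr=16000, n_fft=512, hop=160, n_mels=40):
    """Frame, window, FFT, apply the mel filterbank, then take the log."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)              # shape: (n_frames, n_mels)

# Example: features for a 1-second 440 Hz tone at 16 kHz.
sr = 16000
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
feats = log_mel_features(y, sr)
```

The resulting (frames × mel-bands) matrix is the kind of 2-D representation typically fed to a deep classifier; taking a discrete cosine transform over the mel axis would yield MFCCs, another extractor commonly included in such comparisons.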

Metrics

Cited by: 3
FWCI (Field-Weighted Citation Impact): 0.81
References: 18
Citation Normalized Percentile: 0.68


Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE
Speech/Music Classification Using Wavelet Based Feature Extraction Techniques
Thiruvengatanadhan Ramalingam, P. Dhanalakshmi
Journal of Computer Science, 2014, Vol. 10 (1), pp. 34-44

JOURNAL ARTICLE
Speech feature extraction and emotion recognition using deep learning techniques
Anil Kumar Pagidirayi, B. Anuradha
i-manager's Journal on Digital Signal Processing, 2024, Vol. 12 (2), pp. 1-1

JOURNAL ARTICLE
Deep Learning based Feature Extraction for Texture Classification
Philomina Simon, V. Uma
Procedia Computer Science, 2020, Vol. 171, pp. 1680-1687

JOURNAL ARTICLE
Deep Learning-Based Feature Extraction for Speech Emotion Recognition
Dharmendra Kumar Roy, Naga Venkata Gopi Kumbha, Harender Sankhla, Gaurav Raj, Bashetty Akhilesh
International Journal of Engineering Technology and Management Sciences, 2024, Vol. 8 (3), pp. 166-174