JOURNAL ARTICLE

Artificial bandwidth extension to improve automatic emotion recognition from narrow-band coded speech

Abstract

Narrow-band speech coding has previously been found to reduce the accuracy of automatic Speech Emotion Recognition (SER), as well as speech and speaker recognition rates. In this study, Artificial Bandwidth Extension (ABE) based on spectral folding and spectral envelope estimation was applied to compressed narrow-band speech to test whether SER accuracy could be improved. Speech was modelled and classified with a benchmark approach based on a Gaussian Mixture Model (GMM) classifier and a set of acoustic speech parameters including Mel-frequency cepstral coefficients (MFCCs), Teager Energy Operator (TEO) parameters and glottal parameters. The tests used the Berlin Emotional Speech database. In general, ABE improved SER accuracy; however, the size of the improvement varied with the feature set, speaker gender and speech compression rate. In all cases, SER accuracy with ABE remained at least 10% lower than for uncompressed speech.
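
As a rough illustration of the spectral folding step named in the abstract, the sketch below upsamples a narrow-band signal by inserting a zero after every sample, which mirrors the original spectrum around the old Nyquist frequency. It is a minimal sketch under stated assumptions, not the authors' implementation: the 8 kHz input rate, the 1 kHz test tone and the helper names spectral_fold and dft_magnitude are hypothetical, and the spectral envelope estimation that would normally shape the folded upper band is not shown.

```python
import cmath
import math


def spectral_fold(narrowband):
    """Upsample by 2 via zero insertion (spectral folding).

    Inserting a zero after every sample doubles the sampling rate and
    mirrors the original spectrum around the old Nyquist frequency.
    """
    wideband = []
    for sample in narrowband:
        wideband.append(sample)
        wideband.append(0.0)
    return wideband


def dft_magnitude(signal, k):
    """Magnitude of DFT bin k (naive evaluation, for illustration only)."""
    n = len(signal)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(signal)))


# Assumed input: a 1 kHz tone sampled at 8 kHz (a typical narrow-band rate).
fs_nb = 8000
tone = [math.sin(2 * math.pi * 1000 * i / fs_nb) for i in range(160)]
folded = spectral_fold(tone)  # nominally sampled at 16 kHz after folding

# Frequency resolution of the folded signal's DFT, in Hz per bin.
bin_hz = 2 * fs_nb / len(folded)
peak_1k = dft_magnitude(folded, round(1000 / bin_hz))   # original component
peak_7k = dft_magnitude(folded, round(7000 / bin_hz))   # mirrored image
quiet_3k = dft_magnitude(folded, round(3000 / bin_hz))  # no energy expected
```

With these assumptions the folded signal carries the original 1 kHz component and an equally strong mirror image at 7 kHz (8 kHz minus 1 kHz). A practical ABE system would then attenuate and shape that upper band using an estimated wide-band spectral envelope.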

Keywords:
Computer science, Speech recognition, Spectral envelope, Speech coding, Linear predictive coding, Voice activity detection, Uncompressed video, Narrowband, Bandwidth (computing), Classifier (UML), Acoustic model, Speech processing, Artificial intelligence, Pattern recognition (psychology)

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.28
Refs: 36
Citation Normalized Percentile: 0.84

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Advanced Data Compression Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition