Advancing Speech Emotion Recognition with Interpretable Neural Networks and Self-Supervised Paralinguistic Representations

Vu, Linh Ngoc

doi:10.26180/29117351

JOURNAL ARTICLE

Year: 2025

Journal: Monash University

Abstract

This research focuses on novel approaches for speech-based emotion recognition (SER). SER technologies can be applied in various contexts, such as assessing customer satisfaction in call centers, tracking personal moods, and monitoring emotions in healthcare settings. Numerous machine learning methods have been proposed, ranging from traditional feature-based models to end-to-end interpretable neural networks and self-supervised learning techniques. These methods have produced explainable representations and identifiable features related to vocal cues, which we refer to as paralinguistic representations (i.e., beyond linguistics). By incorporating a pre-trained paralinguistic representation, our method achieved accuracy comparable to state-of-the-art techniques while maintaining high efficiency. A detailed analysis of errors and metadata indicated that our proposed method reduces gender bias and generalizes well to unseen speakers and spontaneous emotions, extending beyond recordings of scripted utterances.

Keywords:

Paralanguage; Artificial neural network; Metadata; Emotion recognition; Representation (politics); Emotion detection; Deep learning

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.56

Topics

Digital Filter Design and Implementation

Physical Sciences → Computer Science → Signal Processing

Blind Source Separation Techniques

Physical Sciences → Computer Science → Signal Processing

Numerical Methods and Algorithms

Physical Sciences → Computer Science → Computational Theory and Mathematics

Related Documents

Evaluating Self-Supervised Speech Representations for Speech Emotion Recognition

Universal Paralinguistic Speech Representations Using Self-Supervised Conformers

Towards Paralinguistic-Only Speech Representations for End-to-End Speech Emotion Recognition

Noise-Robust Speech Emotion Recognition Using Shared Self-Supervised Representations with Integrated Speech Enhancement