Speech Emotion Recognition Using Transfer Learning and Self-Supervised Speech Representation Learning

Babak Nasersharif; Marziye Azad

doi:10.1109/icee59167.2023.10334799

JOURNAL ARTICLE

Year: 2023 Vol: 33 Pages: 684-689

Abstract

Self-supervised speech representation learning (S3RL) models such as wav2vec 2.0, Hidden-unit BERT (HuBERT), and WavLM are trained on large amounts of speech data and yield a general-purpose speech representation, which must then be fine-tuned for individual speech processing tasks such as automatic speech recognition (ASR). Despite their good performance, these models have large architectures and a great number of parameters, which makes fine-tuning them impractical for low-resource tasks such as speech emotion recognition. This paper introduces a small model for speech emotion recognition based on HuBERT: the HuBERT convolutional feature encoder is transferred, and all of its transformer layers are replaced with a simple conformer block. This small model is then trained on emotional speech signals. The experimental results indicate that the proposed model performs comparably to other well-performing S3RL models.
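To give a sense of scale for the part that is kept and the part that is replaced, the following is a rough sketch only. The abstract lists no layer sizes, so the kernel widths, strides, and dimensions below are assumptions taken from the commonly published "base" configuration of wav2vec 2.0 and HuBERT, and the function names are illustrative. The sketch counts how many feature frames the convolutional encoder emits for a waveform and compares the encoder's weight count with that of a 12-layer transformer stack.

```python
# Assumed values: the commonly published "base" configuration of
# wav2vec 2.0 / HuBERT. The abstract itself does not list them.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
CONV_CHANNELS = 512
EMBED_DIM = 768
FFN_DIM = 3072
NUM_TRANSFORMER_LAYERS = 12


def encoder_frames(num_samples: int) -> int:
    """Frames emitted by the unpadded conv stack for a waveform of num_samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        length = (length - kernel) // stride + 1
    return length


def conv_encoder_weights() -> int:
    """Convolution weights only; biases and normalisation layers are ignored."""
    total, in_channels = 0, 1  # the raw waveform has a single channel
    for kernel in CONV_KERNELS:
        total += in_channels * CONV_CHANNELS * kernel
        in_channels = CONV_CHANNELS
    return total


def transformer_layer_params() -> int:
    """Parameters of one standard transformer encoder layer."""
    attention = 4 * (EMBED_DIM * EMBED_DIM + EMBED_DIM)  # Q, K, V, output
    feed_forward = 2 * EMBED_DIM * FFN_DIM + FFN_DIM + EMBED_DIM
    layer_norms = 2 * 2 * EMBED_DIM  # two norms, each with scale and shift
    return attention + feed_forward + layer_norms


if __name__ == "__main__":
    print("frames for 1 s at 16 kHz:", encoder_frames(16_000))
    print("conv encoder weights:", conv_encoder_weights())
    print("transformer stack params:",
          NUM_TRANSFORMER_LAYERS * transformer_layer_params())
```

Under these assumed sizes the transformer stack holds roughly twenty times as many weights as the convolutional encoder, which is consistent with the abstract's motivation for keeping the encoder and swapping out the transformers. The size of the replacement conformer block is not given in the abstract, so it is deliberately left out of the comparison.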

Keywords:

Computer science; Speech recognition; Encoder; Feature learning; Speech processing; Transfer of learning; Artificial intelligence; Natural language processing; Language model; Speech analytics; Transformer; Emotion recognition; Representation (politics); Feature (linguistics); Convolutional neural network; Acoustic model

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.17

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Speech Emotion Recognition Using Transfer Learning

Adapting a Self-Supervised Speech Representation for Noisy Speech Emotion Recognition by Using Contrastive Teacher-Student Learning

Representation Learning for Speech Emotion Recognition

Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge

Speech Emotion Recognition Using Self-Supervised Features