JOURNAL ARTICLE

Speech emotion recognition method based on time-aware bidirectional multi-scale network

Liyan Zhang, Jiaxin Du, Jiayan Li, Xinyu Wang

Year: 2024   Journal: Journal of Physics: Conference Series   Vol: 2816 (1)   Pages: 012102-012102   Publisher: IOP Publishing

Abstract

Traditional speech emotion recognition models have difficulty capturing long-distance dependencies in speech signals and are sensitive to variation in a speaker's speaking rate and pause duration. To address this, the paper proposes a new temporal emotion modeling method, the Time Perceived Bidirectional Multi-scale Network (TIM-Net), which learns multi-scale contextual emotion representations across different time scales. TIM-Net first acquires temporal emotional representations using time-aware blocks. It then combines information from different time points to enhance contextual understanding of emotional expression. Finally, it consolidates features from the various time scales to better accommodate emotional fluctuations. Experiments show that the network can concentrate on the useful information in the features, and that the WAR and UAR of TIM-Net are significantly better than those of the other models compared on the RAVDESS, EMO-DB, and EMOVO datasets.
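The abstract reports WAR and UAR without defining them. In the speech emotion recognition literature these usually denote weighted average recall (equivalent to overall accuracy) and unweighted average recall (the mean of per-class recalls, which is less dominated by frequent classes). The sketch below computes both under that assumption; the function name `war_uar` and the toy labels are illustrative only and are not taken from the paper.

```python
from collections import Counter


def war_uar(y_true, y_pred):
    """Return (WAR, UAR) for two equal-length sequences of class labels.

    Assumed definitions (not stated in the abstract):
      WAR = correct predictions / total utterances (overall accuracy)
      UAR = mean over classes of (correct in class / utterances in class)
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    support = Counter(y_true)  # number of utterances per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class

    war = sum(hits.values()) / len(y_true)
    uar = sum(hits[c] / n for c, n in support.items()) / len(support)
    return war, uar


# Toy, imbalanced example: 6 "neutral" utterances and 2 "angry" ones.
y_true = ["neutral"] * 6 + ["angry"] * 2
y_pred = ["neutral"] * 7 + ["angry"]  # one "angry" utterance is misclassified
print(war_uar(y_true, y_pred))
```

On imbalanced data the two metrics diverge: here the single error costs little in WAR but halves the recall of the minority class, which UAR weights equally with the majority class.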

Keywords:
Computer science, Speech recognition, Focus (optics), Scale (ratio), Pronunciation, Emotional expression, Expression (computer science), Net (polyhedron), Emotion recognition, Artificial intelligence, Psychology, Cognitive psychology, Linguistics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 11
Citation Normalized Percentile: 0.17

Topics

Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

BOOK-CHAPTER

Multi-scale Aggregation Network for Speech Emotion Recognition

An Dang, Ha My Linh, Duc-Quang Vu

Lecture Notes in Computer Science Year: 2024 Pages: 63-73

JOURNAL ARTICLE

Speech Emotion Recognition with Global-Aware Fusion on Multi-Scale Feature Representation

Wenjing Zhu, Xiang Li

Journal:   ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Year: 2022 Pages: 6437-6441