DISSERTATION

Voice emotion recognition using natural language processing deep learning

Abstract

The goal of emotion recognition is to give intelligent machines the ability to perceive, understand, and express various emotional states. Emotion recognition is an important basis for harmonious human-computer interaction. Before the advent of artificial intelligence technology, machines could process only structured data, but most of the data on the network is unstructured: articles, pictures, audio, video, and so on. To analyze and use such text information, we need NLP technology that allows machines to understand it. This thesis focuses on text to analyze voice emotion. Classification algorithms for voice emotion recognition include the naive Bayes model, the support vector machine (SVM), and the hidden Markov model [1]. Traditional machine learning models can perform well on classification problems, but their results are far weaker than those of deep learning models. Therefore, this thesis uses a deep learning model, the RNN, and its variants, LSTM and GRU, for training and prediction. It compares the results of a traditional machine learning model, the naive Bayes model, with those of the deep learning models to demonstrate the superiority of the latter. In a multi-class model, a class without distinctive features may lead to poor training results. Therefore, this project uses the idea of a soft margin to treat the third class as the middle range between the other two, which improves the efficiency of the model. For the deep learning model, this thesis combines the sequence-to-sequence model with the traditional RNN model to form a new RNN variant, which will improve the computational efficiency and accuracy of the model. --Author's abstract
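
The "middle range" idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a binary classifier that outputs a positive-class probability, and the label names (`negative`, `neutral`, `positive`), the function names, and the band limits `lower` and `upper` are all hypothetical choices made only for this example.

```python
from typing import List


def label_from_score(p_positive: float, lower: float = 0.4, upper: float = 0.6) -> str:
    """Map a positive-class probability to one of three labels.

    Scores inside the band [lower, upper] are treated as the third class,
    so that class is defined as the middle range between the other two.
    The band limits are illustrative assumptions, not values from the thesis.
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability in [0, 1]")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if p_positive < lower:
        return "negative"
    if p_positive > upper:
        return "positive"
    return "neutral"


def label_batch(scores: List[float], lower: float = 0.4, upper: float = 0.6) -> List[str]:
    """Apply the same banded rule to a list of scores."""
    return [label_from_score(s, lower, upper) for s in scores]


if __name__ == "__main__":
    print(label_batch([0.1, 0.5, 0.9]))
```

Under this assumption the model only has to learn to separate the two classes with clear features; widening or narrowing the band trades off how many samples fall into the third class.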

Keywords:
Artificial intelligence; Computer science; Machine learning; Deep learning; Support vector machine; Hidden Markov model; Process (computing); Margin (machine learning); Artificial neural network; Natural language processing

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 4

Topics

Emotion and Mood Recognition
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing