Speech Emotion Recognition in Noisy Environments Based on a Denoising Convolutional Neural Network

Youngja Nam

doi:10.6109/jkiice.2023.27.6.772

JOURNAL ARTICLE

Year: 2023; Journal: The Journal of the Korean Institute of Information and Communication Engineering; Vol: 27 (6); Pages: 772-781

Abstract

Speech emotion recognition using deep learning has attracted considerable attention over the last decade. Most of this work, however, has addressed noise-free speech; deep-learning-based speech emotion recognition under noise has received comparatively little attention, and studies of Korean emotional speech in noise are rare. This study used a denoising convolutional neural network (DnCNN), which is based on the convolutional neural network (CNN), to examine Korean speech emotion recognition in noisy environments at two signal-to-noise ratios (SNRs). The DnCNN classified emotion categories more accurately than the CNN regardless of SNR. This is the first study to assess the effectiveness of DnCNN for speech emotion recognition in noise at different SNR levels. The results also support extending DnCNN, which has so far been used mainly to classify defect patterns on semiconductor wafers, to the speech domain.
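The abstract does not say how the noisy speech was constructed or which two SNR values were used. As a minimal sketch of one common approach, assuming additive noise scaled to a target SNR, the following standard-library Python example may help readers who are unfamiliar with the term. The function names `mix_at_snr` and `measured_snr_db`, the toy signals, and the SNR values 0 dB and 10 dB are illustrative assumptions, not details from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so speech power / noise power matches `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_speech / P_scaled_noise); solve for the noise gain.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to `clean`, treating the difference as noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    return 10 * math.log10(signal_power / noise_power)


# Toy data (assumption): a 440 Hz tone stands in for speech,
# Gaussian noise stands in for background noise, both at 16 kHz for one second.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Illustrative SNR values only; the abstract does not state the two SNRs used.
for snr_db in (0, 10):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

A lower SNR means the noise is louder relative to the speech, so a comparison across two SNR conditions tests whether a model's advantage holds under both milder and harsher noise.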

Keywords:

Convolutional neural network; Speech recognition; Computer science; Noise reduction; Noise (video); Artificial intelligence; Deep learning; Emotion recognition; Pattern recognition (psychology)

Metrics

FWCI (Field Weighted Citation Impact): 0.27

Citation Normalized Percentile: 0.43

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Convolutional Neural Network-Based Optimization of Automatic Speech Recognition for Noisy Environments

Hybrid Deep Convolutional Neural Network based Speaker Recognition for Noisy Speech Environments

Speech emotion recognition based on convolutional neural network

Convolutional Neural Network (CNN) Based Speech-Emotion Recognition

Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions