CNN-based Speech Emotion Recognition using Transfer Learning

Hee Won Choee; Seung Min Park; Kwee-Bo Sim

doi:10.5391/jkiis.2019.29.5.339

JOURNAL ARTICLE

Year: 2019; Journal: Journal of Korean Institute of Intelligent Systems; Vol: 29 (5); Pages: 339-344; Publisher: Korean Institute of Intelligent Systems

Abstract

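Robots exist for human convenience, so interaction between humans and robots is important. Recognizing a person's emotions is one form of this interaction. The performance of speech emotion recognition (SER), which recognizes emotion from a person's voice, has recently been improving through the adoption of deep learning. However, because data is scarce, high accuracy is hard to expect without using a deep neural network or applying additional training techniques. This paper examines the effect of applying transfer learning, one of the training techniques used when data is scarce, to SER. A convolutional neural network (CNN) architecture is used to apply deep learning. General sound data, rather than speech emotion data, is used for transfer learning, which removes the limit on the amount of available data. Results are reported separately for two transfer learning settings: using the pretrained network as a feature extractor, and fine-tuning it. Convergence time decreased by about 20% with fine-tuning and by about 20% to 70% when the network was used as a feature extractor. When the network was used as a feature extractor, accuracy decreased in some cases, and where it increased the gain was about 3%. With fine-tuning, accuracy improved by about 7% on average.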
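The two transfer learning settings compared in the abstract differ in which layers are updated while training on the emotion data. The following is a minimal, framework-free sketch of that distinction, not the authors' implementation. The layer names, the `Layer` class, and the helper functions are hypothetical, since the abstract does not describe the network architecture.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    name: str
    pretrained: bool        # True if weights were copied from the general-sound model
    trainable: bool = True  # whether training on emotion data may update this layer


def build_ser_layers() -> List[Layer]:
    # Hypothetical layer stack; the abstract does not specify the architecture.
    return [
        Layer("conv1", pretrained=True),
        Layer("conv2", pretrained=True),
        Layer("conv3", pretrained=True),
        Layer("emotion_head", pretrained=False),  # new classifier for emotion labels
    ]


def configure_transfer(layers: List[Layer], mode: str) -> List[Layer]:
    """Mark layers as trainable or frozen for the chosen transfer setting."""
    if mode not in ("feature_extractor", "fine_tuning"):
        raise ValueError(f"unknown transfer mode: {mode}")
    for layer in layers:
        if mode == "feature_extractor":
            # Freeze every pretrained layer; only the new head is trained.
            layer.trainable = not layer.pretrained
        else:
            # Fine-tuning: pretrained weights are only an initialization,
            # so every layer keeps training on the emotion data.
            layer.trainable = True
    return layers


def trainable_names(layers: List[Layer]) -> List[str]:
    return [layer.name for layer in layers if layer.trainable]


if __name__ == "__main__":
    for mode in ("feature_extractor", "fine_tuning"):
        layers = configure_transfer(build_ser_layers(), mode)
        print(mode, trainable_names(layers))
    # feature_extractor ['emotion_head']
    # fine_tuning ['conv1', 'conv2', 'conv3', 'emotion_head']
```

In a deep learning framework the same choice is usually made by disabling gradient updates on the pretrained layers. Training fewer parameters is one plausible reason the feature extractor setting converges faster, while being unable to adapt the pretrained features could explain its smaller or negative accuracy change; the abstract itself does not give the cause.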

Keywords:

Transfer of learning; Convolutional neural network; Extractor; Computer science; Speech recognition; Feature (linguistics); Emotion recognition; Artificial intelligence; Deep learning; Feature learning; Pattern recognition (psychology); Engineering; Linguistics
