Multi-modal emotion recognition using semi-supervised learning and multiple neural networks in the wild

Dae Ha Kim; Min Kyu Lee; Dong Yoon Choi; Byung Cheol Song

doi:10.1145/3136755.3143005

JOURNAL ARTICLE

Year: 2017 Pages: 529-535

Abstract

Human emotion recognition is a research topic that receives continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions in a wild environment through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning using the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to those emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
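The abstract does not spell out how emotion adaptive fusion is computed. One plausible reading is a late fusion in which each modality's class probabilities are scaled by a per-emotion weight, so that a network trusted for a particular emotion (for example, audio) contributes more to that class. The following is a minimal sketch under that assumption only; the function names, the three-label set, the scores, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical reduced label set, chosen only to keep the example short.
LABELS: List[str] = ["angry", "happy", "neutral"]


def fuse(scores: Dict[str, List[float]],
         weights: Dict[str, List[float]]) -> List[float]:
    """Weighted late fusion (an assumed stand-in for emotion adaptive fusion).

    scores:  modality name -> per-class probabilities
    weights: modality name -> per-class weights for that modality
    Returns fused per-class scores normalised to sum to 1.
    """
    fused = [0.0] * len(LABELS)
    for modality, probs in scores.items():
        modality_weights = weights[modality]
        for c, p in enumerate(probs):
            fused[c] += modality_weights[c] * p
    total = sum(fused)
    return [v / total for v in fused]


def predict(scores: Dict[str, List[float]],
            weights: Dict[str, List[float]]) -> str:
    """Return the label with the highest fused score."""
    fused = fuse(scores, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LABELS[best]


# Made-up per-modality outputs for one video clip.
scores = {
    "image":    [0.20, 0.60, 0.20],
    "landmark": [0.25, 0.45, 0.30],
    "audio":    [0.60, 0.10, 0.30],
}

# Uniform weights: every modality counts equally for every emotion.
uniform = {m: [1.0, 1.0, 1.0] for m in scores}

# Adaptive weights: audio is trusted more for "angry" (an assumption).
adaptive = dict(uniform, audio=[2.0, 1.0, 1.0])

print(predict(scores, uniform))   # happy
print(predict(scores, adaptive))  # angry
```

With uniform weights the image and landmark networks dominate and the clip is labelled "happy"; raising the audio weight for "angry" flips the decision, which illustrates the kind of per-emotion synergy the abstract alludes to.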

Keywords:

Computer science; Artificial intelligence; Landmark; Artificial neural network; Pattern recognition (psychology); Set (abstract data type); Deep learning; Face (sociological concept); Modal; Task (project management); Emotion recognition; Emotion classification; Image (mathematics); Speech recognition; Machine learning

Metrics

FWCI (Field Weighted Citation Impact): 6.97

Citation Normalized Percentile: 0.96

Topics

Emotion and Mood Recognition

Social Sciences → Psychology → Experimental and Cognitive Psychology

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild

Cyclic Data Distillation Semi-Supervised Learning for Multi-Modal Emotion Recognition

Multi-modal Dimensional Emotion Recognition using Recurrent Neural Networks

Semi-supervised Multi-modal Emotion Recognition with Cross-Modal Distribution Matching

Speech Emotion Recognition Using Semi-supervised Learning with Ladder Networks