Real-time Emotion Pre-Recognition in Conversations with Contrastive Multi-modal Dialogue Pre-training

Xincheng Ju; Dong Zhang; Suyang Zhu; Junhui Li; Shoushan Li; Guodong Zhou

doi:10.1145/3583780.3615024

JOURNAL ARTICLE

Year: 2023 Pages: 1045-1055

Abstract

This paper presents our pioneering effort in addressing a new and realistic scenario in multi-modal dialogue systems called Multi-modal Real-time Emotion Pre-recognition in Conversations (MREPC). The objective is to predict the emotion of a forthcoming target utterance that is highly likely to occur. We believe that this task can enhance the dialogue system's understanding of the interlocutor's state of mind, enabling it to prepare an appropriate response in advance. However, addressing MREPC poses the following challenges: 1) Previous studies on emotion elicitation typically focus on textual modality and perform sentiment forecasting within a fixed contextual scenario. 2) Previous studies on multi-modal emotion recognition aim to predict the emotion of existing utterances, making it difficult to extend these approaches to MREPC due to the absence of the target utterance. To tackle these challenges, we construct two benchmark multi-modal datasets for MREPC and propose a task-specific multi-modal contrastive pre-training approach. This approach leverages large-scale unlabeled multi-modal dialogues to facilitate emotion pre-recognition for potential utterances of specific target speakers. Through detailed experiments and extensive analysis, we demonstrate that our proposed multi-modal contrastive pre-training architecture effectively enhances the performance of multi-modal real-time emotion pre-recognition in conversations.
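The abstract names contrastive pre-training but does not state the training objective. As a rough illustration only, the sketch below shows a generic InfoNCE-style loss, one common choice for contrastive learning, applied to pairs of dialogue-context and following-utterance embeddings. It is not the authors' architecture: the function name `info_nce_loss`, the `temperature` value, and the toy three-dimensional vectors are all assumptions made for this example.

```python
import math


def _normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(context_embs, target_embs, temperature=0.1):
    """Mean InfoNCE loss; context_embs[i] is paired with target_embs[i].

    Hypothetical helper for illustration, not taken from the paper.
    """
    contexts = [_normalize(v) for v in context_embs]
    targets = [_normalize(v) for v in target_embs]
    total = 0.0
    for i, c in enumerate(contexts):
        # Cosine similarity of context i to every target in the batch.
        logits = [sum(a * b for a, b in zip(c, t)) / temperature for t in targets]
        m = max(logits)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # The paired target is the positive; all other targets act as negatives.
        total += log_denominator - logits[i]
    return total / len(contexts)


# Toy batch (assumed data): three context embeddings and the embeddings of
# the utterances that followed them.
contexts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
misaligned = aligned[1:] + aligned[:1]  # rotate so every pairing is wrong

loss_aligned = info_nce_loss(contexts, aligned)
loss_misaligned = info_nce_loss(contexts, misaligned)
```

Under these assumptions, correctly paired contexts and utterances yield a lower loss than mismatched ones, which is the signal a contrastive objective can extract from unlabeled dialogues without emotion annotations.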

Keywords:

Computer science; Utterance; Modal; Focus (optics); Benchmark (surveying); Task (project management); Natural language processing; Modality (human–computer interaction); Artificial intelligence; Speech recognition; Emotion recognition; Engineering

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.18

Is in top 1%

Is in top 10%

Topics

Emotion and Mood Recognition

Social Sciences → Psychology → Experimental and Cognitive Psychology

Sentiment Analysis and Opinion Mining

Physical Sciences → Computer Science → Artificial Intelligence

Humor Studies and Applications

Social Sciences → Psychology → Social Psychology

Related Documents

Multi-Modal Contrastive Pre-training for Recommendation

LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations

Multi-Modal Pre-Training for Automated Speech Recognition

Cross-Modal Contrastive Pre-Training for Few-Shot Skeleton Action Recognition

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition