Several works have developed end-to-end pipelines for generating lip-synced talking faces, with real-world applications such as teaching and language translation in videos. However, these prior works fail to produce realistic-looking videos because they pay little attention to people's expressions and emotions. Moreover, their effectiveness depends heavily on the faces in the training dataset, so they may not perform well on unseen faces. To mitigate this, we build a talking face generation framework conditioned on a categorical emotion label to generate videos with appropriate expressions, making them more realistic and convincing. Covering six emotion categories, i.e., happiness, sadness, fear, anger, disgust, and neutral, we show that our model can adapt to arbitrary identities, emotions, and languages. Our proposed framework includes a user-friendly web interface that offers a real-time experience for talking face generation with emotions. We also conduct a user study for subjective evaluation of our interface's usability, design, and functionality. Project page: \href{https://midas.iiitd.edu.in/emo/}{https://midas.iiitd.edu.in/emo/}.
Zikang Zhao, Yujia Zhang, Hao Guo, Yao Li, Tianjun Wu, Yajing Su
Yifan Xu, Sirui Zhao, Shifeng Liu, Tong Xu, Enhong Chen
Bingyuan Zhang, Xulong Zhang, Ning Cheng, Jun Yu, Jing Xiao, Jianzong Wang
Zikang Zhao, Yujia Zhang, Tianjun Wu, Hao Guo, Yao Li