BOOK-CHAPTER

Face-MakeUp: Multimodal Facial Prompts for Text-to-Image Generation

Abstract

Facial images have extensive practical applications. Although current large-scale text-to-image diffusion models exhibit strong generation capabilities, generating a desired facial image from a text prompt alone remains challenging. Image prompts are a natural complement; however, existing methods of this type generally target the general domain rather than faces. In this paper, we aim to optimize image-prompt techniques to generate the desired facial images. Specifically, (1) we built a dataset of 4 million high-quality face image-text pairs based on FaceCaption-15M and LAION-Face to train our Face-MakeUp model; (2) to maintain consistency with the reference facial image, we extract and learn multi-scale content features and pose features from the facial image and integrate them into the diffusion model, enhancing its preservation of facial identity. Validation on two face-related test datasets demonstrates that Face-MakeUp achieves the best overall performance. All code, data, and model checkpoints are available at: https://github.com/ddw2AIGROUP2CQUPT/Face-MakeUp.
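The abstract describes conditioning a diffusion model on multi-scale content features and pose features extracted from a reference face. The sketch below is a minimal, hypothetical illustration of one common way to do this: project each feature into a few "prompt tokens" that can be concatenated with the text-encoder tokens before cross-attention. All class names, dimensions, and token counts here are illustrative assumptions, not the authors' actual Face-MakeUp implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (PyTorch): not the authors' implementation.
class FacePromptProjector(nn.Module):
    """Project multi-scale face content features and a pose feature into a
    sequence of tokens a diffusion U-Net's cross-attention can consume
    alongside the text tokens."""

    def __init__(self, content_dims=(256, 512, 1024), pose_dim=128,
                 token_dim=768, tokens_per_scale=4):
        super().__init__()
        # One linear projection per content scale; each yields a few tokens.
        self.content_proj = nn.ModuleList(
            nn.Linear(d, token_dim * tokens_per_scale) for d in content_dims
        )
        # The pose embedding (e.g. pooled from facial landmarks) maps to one token.
        self.pose_proj = nn.Linear(pose_dim, token_dim)
        self.norm = nn.LayerNorm(token_dim)
        self.tokens_per_scale = tokens_per_scale
        self.token_dim = token_dim

    def forward(self, content_feats, pose_feat):
        # content_feats: list of (B, d_i) pooled features, one per scale.
        # pose_feat:     (B, pose_dim) pose embedding.
        tokens = []
        for proj, feat in zip(self.content_proj, content_feats):
            t = proj(feat).view(feat.shape[0], self.tokens_per_scale,
                                self.token_dim)
            tokens.append(t)
        tokens.append(self.pose_proj(pose_feat).unsqueeze(1))
        # (B, num_scales * tokens_per_scale + 1, token_dim)
        return self.norm(torch.cat(tokens, dim=1))


# Usage: the resulting face tokens would be concatenated with the text
# tokens so the U-Net attends to both the text and the reference face.
proj = FacePromptProjector()
content = [torch.randn(2, d) for d in (256, 512, 1024)]
pose = torch.randn(2, 128)
face_tokens = proj(content, pose)
print(tuple(face_tokens.shape))  # (2, 13, 768): 3 scales x 4 tokens + 1 pose token
```

This mirrors the general image-prompt pattern (as in adapter-style methods) where identity preservation is improved by feeding reference-image features into cross-attention rather than relying on text alone.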

Topics

Face recognition and analysis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Subtitles and Audiovisual Media
Social Sciences →  Arts and Humanities →  Language and Linguistics
Digital Media and Visual Art
Physical Sciences →  Computer Science →  Computer Graphics and Computer-Aided Design
© 2026 ScienceGate Book Chapters — All rights reserved.