JOURNAL ARTICLE

Sketch-Guided Latent Diffusion Model for High-Fidelity Face Image Synthesis

Yichen PengChunqi ZhaoHaoran XieTsukasa FukusatoKazunori Miyata

Year: 2023 Journal:   IEEE Access Vol: 12 Pages: 5770-5780   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Synthesizing facial images from monochromatic sketches is one of the most fundamental tasks in the field of image-to-image translation. However, it is still challenging to teach model high-dimensional face features, such as geometry and color, and to the characteristics of input sketches, which should be considered simultaneously. Existing methods often use sketches as indirect inputs (or as auxiliary inputs) to guide models, resulting in the loss of sketch features or in alterations to geometry information. In this paper, we introduce a Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network architecture trained on the paired sketch-face dataset. We apply a Multi-Auto-Encoder (AE) to encode the different input sketches from the various regions of a face from the pixel space into a feature map in the latent space, enabling us to reduce the dimensions of the sketch input while preserving the geometry-related information of the local face details. We build a sketch-face paired dataset based on an existing method XDoG and Sketch Simplification that extracts the edge map from an image. We then introduce a Stochastic Region Abstraction (SRA), an approach to augmenting our dataset to improve the robustness of the SGLDM to handle arbitrarily abstract sketch inputs. The evaluation study shows that the SGLDM can synthesize high-quality face images with different expressions, facial accessories, and hairstyles from various sketches having different abstraction levels, and the code and model have been released on the project page. https://puckikk1202.github.io/difffacesketch2023/

Keywords:
Sketch Computer science Artificial intelligence Face (sociological concept) Abstraction Robustness (evolution) Computer vision Rendering (computer graphics) Encoder Feature (linguistics) Sketch recognition Image translation Pattern recognition (psychology) Image (mathematics) Algorithm Gesture

Metrics

11
Cited By
2.00
FWCI (Field Weighted Citation Impact)
40
Refs
0.85
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Generative Adversarial Networks and Image Synthesis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Face recognition and analysis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

© 2026 ScienceGate Book Chapters — All rights reserved.