Current research on face sketch-photo synthesis generally adopts an image-to-image (I2I) translation pipeline. However, these methods ignore the one-to-many mapping problem in the sketch-to-photo synthesis task (i.e., multiple plausible photos can correspond to a single input sketch), which leads to significant performance degradation on diverse datasets. Generating high-quality images from limited data is a further challenge for this task. To address these challenges, we propose a dual-path framework that introduces generative priors to better perform cross-domain reconstruction on limited data. The coarse path uses a layer-swapped pre-trained generator to achieve coarse cross-domain reconstruction, and the refinement path further improves structure and texture details. To align the feature maps between the two paths, we introduce a spatial feature calibration module. Even so, this basic framework still struggles to handle diverse datasets. Thanks to the flexibility of generative priors, we extend the framework to exemplar-guided I2I translation by incorporating an exemplar through style mixing and a proposed semantic-aware style refinement strategy, which addresses the one-to-many mapping problem in the sketch-to-photo synthesis task. Furthermore, our framework can perform cross-domain editing by employing off-the-shelf latent-space editing methods, achieving fine-grained control. Extensive experiments on diverse datasets demonstrate the superiority of our framework over other state-of-the-art methods.
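As a rough illustration of the style-mixing idea mentioned above, the sketch below mixes two layer-wise latent codes: the early (coarse) layers come from the code of the input sketch and the later (fine) layers come from the code of the exemplar. This is only a minimal sketch under stated assumptions, not the framework's implementation: the function name `style_mix`, the crossover index, and the 18 x 512 code shape (typical of a StyleGAN2-style generator at 1024 x 1024) are all hypothetical, and the proposed semantic-aware style refinement strategy is not reproduced here.

```python
import numpy as np


def style_mix(w_content: np.ndarray, w_exemplar: np.ndarray, crossover: int) -> np.ndarray:
    """Mix two layer-wise latent codes (hypothetical helper, not from the paper).

    Layers [0, crossover) are kept from `w_content` (structure, assumed to come
    from the inverted sketch); layers [crossover, L) are taken from `w_exemplar`
    (appearance, assumed to come from the inverted exemplar photo).
    """
    if w_content.shape != w_exemplar.shape:
        raise ValueError("latent codes must have the same shape")
    num_layers = w_content.shape[0]
    if not 0 <= crossover <= num_layers:
        raise ValueError("crossover must lie in [0, num_layers]")
    mixed = w_content.copy()                    # leave the inputs untouched
    mixed[crossover:] = w_exemplar[crossover:]  # swap in the exemplar's fine layers
    return mixed


# Assumed shapes: 18 style layers, 512 channels each.
rng = np.random.default_rng(0)
w_sketch = rng.standard_normal((18, 512))
w_photo = rng.standard_normal((18, 512))
w_mixed = style_mix(w_sketch, w_photo, crossover=8)
```

Moving the crossover toward 0 lets the exemplar control more of the result, while moving it toward the last layer preserves more of the content code; a different exemplar yields a different plausible photo for the same sketch, which is how an exemplar can resolve the one-to-many ambiguity.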
Kun Cheng, Mingrui Zhu, Nannan Wang, Xinbo Gao