Image harmonization is a crucial technique in image composition that adjusts the foreground of a composite image so that it blends seamlessly with the background. Current methods adopt either global-level or pixel-level feature matching. Global-level feature matching ignores the proximity prior and treats foreground and background as separate entities. Pixel-level feature matching, on the other hand, loses contextual information. Semantic maps that describe the different objects in an image are therefore needed to guide harmonization. In this paper, we propose Semantic-guided Region-aware Instance Normalization (SRIN), which uses the semantic segmentation maps produced by a pre-trained Segment Anything Model (SAM) to guide the visual consistency learning of foreground and background features. Extensive experiments demonstrate the superiority of our method for image harmonization over state-of-the-art methods.
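To make the idea of region-aware normalization guided by semantic maps concrete, the following is a minimal sketch and not the SRIN layer itself, whose exact formulation is not given here. It assumes a single feature channel stored as a flat list, a binary foreground mask, and an integer region id per pixel (for example, derived from SAM masks). The function name `semantic_region_norm`, the fallback to whole-background statistics, and the constant `EPS` are all hypothetical choices made for illustration.

```python
from statistics import fmean, pstdev

EPS = 1e-5  # assumed small constant; avoids division by zero in flat regions


def region_stats(values):
    """Return the mean and population standard deviation of a list of floats."""
    return fmean(values), pstdev(values)


def semantic_region_norm(features, fg_mask, labels):
    """Hypothetical per-region foreground re-normalization for one channel.

    features: flat list of floats, one value per pixel
    fg_mask:  flat list of bools, True where the pixel is composite foreground
    labels:   flat list of ints, semantic region id per pixel (assumed input)
    """
    out = list(features)
    # Statistics of the whole background, used only as a fallback.
    bg_all = [v for v, fg in zip(features, fg_mask) if not fg]

    for region in set(labels):
        fg_idx = [i for i, (fg, lab) in enumerate(zip(fg_mask, labels))
                  if fg and lab == region]
        if not fg_idx:
            continue  # no foreground pixels in this region, nothing to adjust

        # Prefer background pixels that share the semantic region.
        bg_vals = [v for v, fg, lab in zip(features, fg_mask, labels)
                   if not fg and lab == region]
        if not bg_vals:
            bg_vals = bg_all
        if not bg_vals:
            continue  # image has no background at all

        fg_mean, fg_std = region_stats([features[i] for i in fg_idx])
        bg_mean, bg_std = region_stats(bg_vals)

        for i in fg_idx:
            # Instance-normalize the foreground, then apply background statistics.
            normalized = (features[i] - fg_mean) / (fg_std + EPS)
            out[i] = normalized * bg_std + bg_mean
    return out


if __name__ == "__main__":
    feats = [0.0, 2.0, 10.0, 14.0, 5.0, 5.0]
    fg = [True, True, False, False, False, False]
    regions = [0, 0, 0, 0, 1, 1]
    print(semantic_region_norm(feats, fg, regions))
```

In this sketch only foreground values change; background pixels pass through untouched. A learned layer would typically operate on multi-channel tensors and add trainable scale and shift parameters, which are omitted here.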
Sangjun Noh, Jongwon Kim, Dongwoo Nam, Seunghyeok Back, Raeyoung Kang, Kyoobin Lee
Hanyu Jiang, Xing Lan, Jiayi Lyu, Kun Dong, Jian Xue
Frano Rajič, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu
Erdal Akin, Héctor Caltenco, Kayode S. Adewole, Reza Malekian, Jan Persson
Chenyu Zhang, Songshan Huang, Ruwen Qin