The rapid advancement of face manipulation technology has created an urgent need for forgery detection. Existing deepfake detection approaches achieve impressive performance in the intra-dataset scenario, where the training and testing face data are generated by the same algorithm. However, their performance is far from satisfactory when they are applied to unseen forgery datasets. To tackle this problem, we propose a new perspective on face forgery detection that considers feature inconsistency in the spatial and frequency domains of manipulated images. Specifically, we design a two-stream network equipped with a Multi-scale Mutual Local Consistency Learning (MMLCL) module, which consists of a Global Enhancement Module (GEM) combined with Mutual Local Consistency Learning (MLCL) to learn local consistency in multi-scale enhanced feature maps. We further exploit the mutual representation to obtain an attention map that serves as guidance on forged regions for the output features used in final classification. Extensive experiments demonstrate that the proposed method is effective and generalizes well to unseen face forgeries.
Youqi Song, Zhentao Chen, Junlin Hu
Xiaopeng Wang, Feng Zhu, Lei Li, Xiaoyang Tan
Chen Shen, Taiping Yao, Yang Chen, Shouhong Ding, Jilin Li, Rongrong Ji
Daichi Zhang, Zihao Xiao, Shikun Li, Fanzhao Lin, Jianmin Li, Shiming Ge
Haoyu Wu, Lingyun Leng, Peipeng Yu