With the rapid development of face forgery techniques, synthesized face videos now spread widely on the Internet, threatening the security and trustworthiness of online digital content, so effective face forgery detection methods are needed. Many existing methods apply only 2D CNNs to individual video frames, and few 3D networks have been designed for face forgery detection. In this work, we propose a 3D CNN for video-level face forgery detection and add a lightweight attention module to build a 3D attention network that extracts both spatial and temporal features. The attention maps generated by the module focus on several forged regions of the fake face. To keep discrepancies between these regions from affecting the detection result, we design a global attention pooling layer to replace global average pooling. Experiments on FaceForensics++ show that our model achieves high accuracy and outperforms most existing methods. Cross-dataset evaluation on Celeb-DF verifies that our model has strong transferability and generalization ability.
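The difference between global average pooling and an attention-weighted pooling can be illustrated with a minimal NumPy sketch. The abstract does not specify how the global attention pool is computed, so everything here is an assumption for illustration only: the function names, the (C, T, H, W) feature layout, and the use of a softmax over all spatio-temporal positions are hypothetical and are not the authors' design.

```python
import numpy as np


def global_average_pool(features):
    """Average a (C, T, H, W) feature volume over time and space -> (C,)."""
    return features.mean(axis=(1, 2, 3))


def global_attention_pool(features, attention):
    """Attention-weighted pooling (illustrative assumption, not the paper's layer).

    features:  (C, T, H, W) feature volume from a 3D CNN.
    attention: (T, H, W) raw attention scores, one per spatio-temporal position.
    Returns a (C,) descriptor in which high-attention positions dominate.
    """
    scores = np.asarray(attention, dtype=float).reshape(-1)
    scores = scores - scores.max()          # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum()                # weights sum to 1 over all positions
    flat = features.reshape(features.shape[0], -1)  # (C, T*H*W)
    return flat @ weights                   # weighted sum per channel


# Hypothetical toy input: 4 channels, 2 frames, 3x3 spatial grid.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 2, 3, 3))
uniform_scores = np.zeros((2, 3, 3))
pooled = global_attention_pool(feats, uniform_scores)
```

With uniform scores the weighted pool reduces to the plain average, which makes the design choice visible: the attention scores only change the result when some positions (for example, forged regions) are weighted above others.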
Xiang Han, Yongfeng Qi, Liqiang Zhuang, Liang Hu, Shengcong Wen
Zhenwu Hu, Qianyue Duan, PeiYu Zhang, Huanjie Tao
LI Ke, LI Shaomei, JI Lixin, LIU Shuo
Chengsheng Yuan, Peipeng Yu, Jianwei Fei, Yaju Liu, Haopeng Liang