MENG Sijiang, WANG Hongxia, ZENG Qiang, ZHOU Yang
With the improvement and widespread application of digital platforms, document images now circulate widely on the Internet. At the same time, advances in image processing technology have increased the risk of document image tampering, making it crucial to ensure the integrity and authenticity of document images. In this paper, we propose a multi-view and multi-scale fusion attention network (MM-Net), which aims to improve the accuracy of document image forgery localization in real-world scenarios. We adopt a multi-view encoder that combines RGB information, noise information, and character information to fully extract tampering features. A multi-scale fusion attention module is designed to facilitate the interaction of multi-scale features, thus enhancing important content information in document images. Extensive experimental results on the large-scale dataset DocTamper demonstrate that the proposed MM-Net achieves more precise localization of tampered regions in document images, with F-scores of 0.809, 0.807, and 0.774 on the test set and the cross-domain datasets FCD and SCD, respectively. Moreover, MM-Net exhibits good generalizability and robustness.
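To make the reported metric concrete, the following is a minimal sketch of how an F-score for forgery localization could be computed. It assumes the score is the pixel-level harmonic mean of precision and recall between a predicted binary tamper mask and the ground-truth mask; the abstract does not state the exact evaluation protocol, and the function name `pixel_f_score` and the toy masks are hypothetical.

```python
def pixel_f_score(pred, truth):
    """Pixel-level F-score between two binary masks (nested lists of 0/1).

    Assumption: a pixel value of 1 marks a tampered pixel. This is an
    illustrative sketch, not the evaluation code used for MM-Net.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # tampered pixel correctly flagged
            elif p and not t:
                fp += 1      # authentic pixel wrongly flagged
            elif t and not p:
                fn += 1      # tampered pixel missed
    if tp == 0:
        return 0.0           # no overlap: define the score as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2x3 masks: two correct pixels, one false alarm, one miss.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
score = pixel_f_score(pred, truth)
```

In this toy case precision and recall are both 2/3, so the F-score is also 2/3. A dataset-level figure would additionally depend on how per-image scores are aggregated, which is not specified here.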
Enji Liang, Kuiyuan Zhang, Zhongyun Hua, Xiaohua Jia
Yushu Zhang, Qing Mei Tan, Shuren Qi, Mingfu Xue
Wenhui Gong, Yan Chen, Mohammad S. Alam, Jun Sang
Yanqing Guo, Caijuan Ji, Xin Zheng, Qianyu Wang, Xiangyang Luo