JOURNAL ARTICLE

DFDT: An End-to-End DeepFake Detection Framework Using Vision Transformer

Aminollah Khormali, J.S. Yuan

Year: 2022   Journal: Applied Sciences   Vol: 12 (6)   Pages: 2953-2953   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

The ever-growing threat of deepfakes and their large-scale societal implications have propelled the development of deepfake forensics to ascertain the trustworthiness of digital media. A common theme of existing detection methods is the use of Convolutional Neural Networks (CNNs) as a backbone. While CNNs have demonstrated decent performance in learning local discriminative information, they fail to learn relative spatial features and lose important information due to constrained receptive fields. Motivated by these challenges, this work presents DFDT, an end-to-end deepfake detection framework that leverages the unique characteristics of transformer models to learn hidden traces of perturbations from both local image features and global relationships among pixels at different forgery scales. DFDT is designed specifically for deepfake detection and consists of four main components: patch extraction and embedding, a multi-stream transformer block, attention-based patch selection, and a multi-scale classifier. DFDT’s transformer layer uses a re-attention mechanism instead of a traditional multi-head self-attention layer. To evaluate the performance of DFDT, a comprehensive set of experiments is conducted on several deepfake forensics benchmarks. The results demonstrate DFDT’s superior detection rate, which reaches 99.41%, 99.31%, and 81.35% on FaceForensics++, Celeb-DF (V2), and WildDeepfake, respectively. Moreover, DFDT’s excellent cross-dataset and cross-manipulation generalization provides additional strong evidence of its effectiveness.
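The first and third components named in the abstract (patch extraction and embedding, attention-based patch selection) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which this record does not include. The function names, the use of non-overlapping patches, the plain linear projection, and the top-k selection rule are all assumptions made for illustration; in a real model the projection would be learned and the scores would come from the transformer's attention maps.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches.

    Assumption: patches do not overlap and H, W are multiples of patch_size.
    Returns an array of shape (num_patches, patch_size * patch_size * C).
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Reshape into a grid of blocks, then bring the two grid axes to the front.
    blocks = image.reshape(rows, patch_size, cols, patch_size, c)
    blocks = blocks.transpose(0, 2, 1, 3, 4)  # (rows, cols, p, p, C)
    return blocks.reshape(rows * cols, patch_size * patch_size * c)


def embed_patches(patches, projection):
    """Project flattened patches to token embeddings with a linear map.

    Assumption: `projection` stands in for a learned embedding matrix.
    """
    return patches @ projection


def select_top_patches(tokens, attention_scores, k):
    """Keep the k tokens with the highest attention scores.

    Assumption: one scalar score per token (hypothetical here). The kept
    indices are returned in their original spatial order.
    """
    top = np.argsort(attention_scores)[::-1][:k]
    top = np.sort(top)
    return tokens[top], top
```

Calling `extract_patches` on the same image with two different patch sizes (for example 8 and 16) gives token sequences at two scales, which is one plausible reading of how streams at "different forgery scales" could be fed; the abstract does not state the patch sizes actually used.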

Keywords:
Computer science, Artificial intelligence, Transformer, Convolutional neural network, Machine learning, Discriminative model, End-to-end principle, Pattern recognition (psychology)

Metrics

Cited By: 65
FWCI (Field Weighted Citation Impact): 7.92
Refs: 69
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Digital Media Forensic Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Generative Adversarial Networks and Image Synthesis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Anomaly Detection Techniques and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

SFormer: An end-to-end spatio-temporal transformer architecture for deepfake detection

Staffy Kingra, Naveen Aggarwal, Nirmal Kaur

Journal: Forensic Science International Digital Investigation   Year: 2024   Vol: 51   Pages: 301817-301817
JOURNAL ARTICLE

DeepFake Video Detection using Vision Transformer

Shereen Hussien, Seif Mohamed

Journal: International Journal of Intelligent Computing and Information Sciences   Year: 2024   Vol: 0 (0)   Pages: 0-0