Generalizable Deepfake Detection via Effective Local-Global Feature Extraction

📅 2025-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Images forged by generative models such as GANs and diffusion models are increasingly realistic, while existing detectors model local and global features separately and generalize poorly across generators. This paper proposes a deepfake detection framework that jointly models local spatial-frequency features and global frequency-domain phase features. It combines a sliding-window discrete wavelet transform (DWT) for local feature extraction with a global frequency-domain attention mechanism guided by FFT phase, rather than relying on simple feature concatenation or on modeling each view in isolation. The authors also design an end-to-end trainable multi-scale feature fusion architecture. Evaluated on an open-world dataset covering 34 generative models, the method reports 2.9% higher detection accuracy than state-of-the-art approaches, indicating improved open-set generalization and robustness to unseen generators and perturbations.

📝 Abstract
The rapid advancement of GANs and diffusion models has led to increasingly realistic fake images, posing significant threats to society. Consequently, deepfake detection has become a pressing issue. While some existing methods focus on forgery features from either a local or a global perspective, they often overlook the complementary nature of these features. Other approaches attempt to incorporate both local and global features but rely on simplistic strategies, such as cropping, which fail to capture the intricate relationships between local features. To address these limitations, we propose a novel method that effectively combines local spatial-frequency domain features with global frequency domain information, capturing both detailed and holistic forgery traces. Specifically, our method uses the Discrete Wavelet Transform (DWT) and sliding windows to tile forgery features and leverages attention mechanisms to extract local spatial-frequency domain information. Simultaneously, the phase component of the Fast Fourier Transform (FFT) is integrated with attention mechanisms to extract global frequency domain information, complementing the local features and ensuring the integrity of forgery detection. Comprehensive evaluations on open-world datasets generated by 34 distinct generative models demonstrate a significant improvement of 2.9% over existing state-of-the-art methods.
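
The two feature streams described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the choice of a single-level Haar wavelet, the window size, the stride, and the function names `haar_dwt2`, `local_wavelet_tokens`, and `global_phase` are all assumptions made for illustration, and the attention modules and fusion network are omitted entirely.

```python
import numpy as np


def haar_dwt2(block):
    """Single-level orthonormal 2D Haar DWT of an even-sized 2D array.

    Returns four sub-bands, each half the size of the input:
    low-pass, row-difference, column-difference, and diagonal detail.
    (Haar is an assumption; the paper may use a different wavelet.)
    """
    a = block[0::2, 0::2]  # top-left sample of each 2x2 cell
    b = block[0::2, 1::2]  # top-right
    c = block[1::2, 0::2]  # bottom-left
    d = block[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0
    row_diff = (a + b - c - d) / 2.0
    col_diff = (a - b + c - d) / 2.0
    diag = (a - b - c + d) / 2.0
    return low, row_diff, col_diff, diag


def local_wavelet_tokens(img, window=8, stride=4):
    """Slide a window over a grayscale image and apply the DWT to each patch.

    Each patch becomes one flattened token, so the result could be fed
    to an attention layer. Window and stride values are hypothetical.
    """
    h, w = img.shape
    tokens = []
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            patch = img[top:top + window, left:left + window]
            tokens.append(np.stack(haar_dwt2(patch)).ravel())
    return np.stack(tokens)


def global_phase(img):
    """Phase spectrum of the whole image, with zero frequency centered."""
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    return np.angle(spectrum)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))           # stand-in for a grayscale image
    tokens = local_wavelet_tokens(img)   # local spatial-frequency tokens
    phase = global_phase(img)            # global frequency-domain phase
    print(tokens.shape, phase.shape)
```

In this sketch, overlapping windows (stride smaller than the window) let neighboring tokens share pixels, which is one plausible way to preserve relationships between local regions that plain cropping would lose.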

Problem

Research questions and friction points this paper is trying to address.

DeepFake Detection
Generative Adversarial Networks (GANs)
Diffusion Models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Wavelet Transform
Attention Mechanism
GANs and Diffusion Model Detection