Suppressing Forgery-Specific Shortcuts for Generalizable Deepfake Detection

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current deepfake detectors generalize poorly because they rely on spurious shortcut features introduced by specific forgery methods. This work proposes the Shortcut Subspace Suppression (S³) framework, which explicitly models and suppresses forgery-specific shortcut subspaces. A lightweight linear probe trained for forgery-method classification, combined with singular value decomposition, identifies the dominant shortcut directions. S³ softly suppresses these directions during training and also offers a training-free inference-time counterpart that attenuates neurons aligned with them, enabling plug-and-play generalization enhancement without architectural modifications. The authors report significantly improved cross-forgery detection across multiple benchmarks while maintaining strong in-domain accuracy and improving interpretability.

📝 Abstract

Deepfake detection suffers from poor generalization across forgery methods, as existing models tend to rely on spurious method-specific shortcuts that fail to transfer to unseen manipulations. While recent approaches attempt to improve generalization, they lack an explicit mechanism to identify and suppress such shortcuts in learned representations. In this work, we propose the Shortcut Subspace Suppression (S³) framework, which explicitly characterizes and suppresses method-specific shortcuts via subspace modeling. Our key insight is that variations distinguishing different forgery methods capture method-specific artifacts and thus serve as an effective proxy for method-specific shortcuts. To this end, we train a lightweight linear probe for forgery method classification and perform Singular Value Decomposition (SVD) to extract the dominant shortcut subspace. Building on this formulation, we develop two complementary strategies to reduce shortcut reliance. During training, we softly suppress the shortcut subspace in feature representations, encouraging the model to rely on more generalizable cues for real/fake discrimination. At inference time, we introduce a training-free counterpart that attenuates neurons aligned with the identified shortcut directions, enabling plug-and-play generalization enhancement with improved interpretability. Extensive experiments on multiple benchmarks demonstrate that our method significantly improves cross-method generalization while maintaining strong in-domain performance. The code will be released upon acceptance of the submission.
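
The probe-plus-SVD idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' released code, and the abstract does not give the exact suppression rule. The sketch assumes the linear probe's weight matrix has shape (num_methods, feature_dim), that the shortcut subspace is spanned by its top-k right singular vectors, and that "soft" suppression means scaling down the component of each feature inside that subspace by a factor alpha. The names `shortcut_basis`, `soft_suppress`, `k`, and `alpha` are hypothetical.

```python
import numpy as np


def shortcut_basis(probe_weights: np.ndarray, k: int) -> np.ndarray:
    """Return a (feature_dim, k) orthonormal basis for the assumed shortcut subspace.

    probe_weights: (num_methods, feature_dim) weights of a linear probe trained
    to classify the forgery method (assumed to be available already).
    """
    # Rows of vt are right singular vectors, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(probe_weights, full_matrices=False)
    if k > vt.shape[0]:
        raise ValueError("k exceeds the rank available from the probe weights")
    return vt[:k].T


def soft_suppress(features: np.ndarray, basis: np.ndarray, alpha: float = 0.8) -> np.ndarray:
    """Shrink the part of each feature that lies in the shortcut subspace.

    features: (batch, feature_dim); basis: (feature_dim, k) with orthonormal columns.
    alpha = 0 leaves features unchanged; alpha = 1 projects the subspace out entirely.
    """
    coords = features @ basis                    # coordinates inside the subspace
    return features - alpha * (coords @ basis.T)  # subtract a fraction of that component


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))   # hypothetical probe: 4 forgery methods, 16-dim features
F = rng.normal(size=(5, 16))   # hypothetical batch of 5 feature vectors
B = shortcut_basis(W, k=2)
F_suppressed = soft_suppress(F, B, alpha=0.8)
```

With `alpha=0.8`, each feature keeps 20% of its original component along the two assumed shortcut directions, and everything orthogonal to them is left untouched. In a training loop, the suppressed features would feed the real/fake classifier in place of the raw ones.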

Problem

Research questions and friction points this paper is trying to address.

deepfake detection

generalization

shortcuts

forgery methods

spurious correlations

Innovation

Methods, ideas, or system contributions that make the work stand out.

deepfake detection

shortcut suppression

subspace modeling