Fine-Grained DINO Tuning with Dual Supervision for Face Forgery Detection

📅 2025-11-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Different deepfake generation techniques leave subtle, method-specific artifacts, but existing detectors usually reduce the task to binary classification and therefore lack the discriminative power to separate these artifacts. Method: The paper proposes a dual-supervised, fine-grained tuning framework. It uses DINOv2 as the backbone and embeds a lightweight multi-head LoRA adapter in each Transformer block. A shared branch propagates fine-grained manipulation cues to the authenticity head, so that authenticity assessment and forgery-type identification are optimized jointly. Contribution/Results: The framework fine-tunes only 3.5 million parameters. It matches or exceeds state-of-the-art detection accuracy on several mainstream benchmarks, including FaceForensics++, Celeb-DF, and DFDC, while improving parameter efficiency and cross-dataset generalization relative to more complex models.

📝 Abstract

The proliferation of sophisticated deepfakes poses significant threats to information integrity. While DINOv2 shows promise for detection, existing fine-tuning approaches treat it as generic binary classification, overlooking distinct artifacts inherent to different deepfake methods. To address this, we propose a DeepFake Fine-Grained Adapter (DFF-Adapter) for DINOv2. Our method incorporates lightweight multi-head LoRA modules into every transformer block, enabling efficient backbone adaptation. DFF-Adapter simultaneously addresses authenticity detection and fine-grained manipulation type classification, where classifying forgery methods enhances artifact sensitivity. We introduce a shared branch propagating fine-grained manipulation cues to the authenticity head. This enables multi-task cooperative optimization, explicitly enhancing authenticity discrimination with manipulation-specific knowledge. Utilizing only 3.5M trainable parameters, our parameter-efficient approach achieves detection accuracy comparable to or even surpassing that of current complex state-of-the-art methods.
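The dual supervision described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how the shared branch is wired, so the code assumes the simplest reading, in which the manipulation-type probabilities are projected into the authenticity logit. All function names, the weight `lam`, and the toy numbers are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dual_head_forward(feature, type_weights, auth_weights, cue_weights):
    """Return (p_fake, type_probs) for one backbone feature vector.

    Assumption: the "shared branch" is modeled as a linear projection
    (cue_weights) of the manipulation-type probabilities that is added
    to the authenticity logit.
    """
    type_probs = softmax([dot(w, feature) for w in type_weights])
    auth_logit = dot(auth_weights, feature) + dot(cue_weights, type_probs)
    p_fake = 1.0 / (1.0 + math.exp(-auth_logit))
    return p_fake, type_probs

def dual_loss(p_fake, type_probs, is_fake, type_label, lam=1.0):
    """Binary cross-entropy on authenticity plus lam * cross-entropy on type."""
    eps = 1e-12
    bce = -(is_fake * math.log(p_fake + eps)
            + (1 - is_fake) * math.log(1.0 - p_fake + eps))
    ce = -math.log(type_probs[type_label] + eps)
    return bce + lam * ce

# Toy example: a 3-dimensional feature and 3 manipulation classes
# (class 0 standing for "real"); every value here is made up.
feature = [0.5, -1.0, 2.0]
type_weights = [[0.1, 0.2, -0.3], [0.4, -0.1, 0.2], [-0.2, 0.3, 0.5]]
auth_weights = [0.3, -0.2, 0.1]
cue_weights = [-1.0, 0.5, 0.5]  # pushes the logit toward "fake" for forged types

p_fake, type_probs = dual_head_forward(feature, type_weights, auth_weights, cue_weights)
loss = dual_loss(p_fake, type_probs, is_fake=1, type_label=2)
print(p_fake, type_probs, loss)
```

Because the authenticity logit depends on the type probabilities, gradients of the authenticity loss would also reach the type head in a trainable version, which is one way to obtain the cooperative optimization the abstract mentions.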

Problem

Research questions and friction points this paper is trying to address.

Detecting face forgeries by identifying the distinct artifacts left by different manipulation methods

Improving authenticity discrimination through fine-grained manipulation-type classification

Reaching high detection accuracy with parameter-efficient adaptation of the DINOv2 backbone

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight LoRA modules adapt DINOv2 backbone

Dual supervision combines authenticity and manipulation classification

Shared branch propagates fine-grained cues for cooperative optimization
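To make the "lightweight" claim concrete, the following sketch counts the parameters that LoRA adds to a frozen linear layer. A rank-r LoRA pair on a d_in x d_out weight adds r * (d_in + d_out) trainable parameters. Everything else is an assumption for illustration: "multi-head" is read here as several parallel low-rank pairs per layer, and the block count, width, rank, and number of adapted layers are hypothetical values, not the paper's configuration, so the result is not expected to reproduce the reported 3.5M figure.

```python
def lora_param_count(d_in, d_out, rank, num_heads=1):
    """Trainable parameters added by `num_heads` parallel rank-`rank`
    LoRA pairs (A: d_in x rank, B: rank x d_out) on one linear layer."""
    return num_heads * rank * (d_in + d_out)

def adapter_budget(num_blocks, layers_per_block, d_model, rank, num_heads):
    """Total LoRA parameters when `layers_per_block` square d_model x d_model
    projections are adapted in every Transformer block."""
    per_layer = lora_param_count(d_model, d_model, rank, num_heads)
    return num_blocks * layers_per_block * per_layer

# Hypothetical configuration: 24 blocks of width 1024, two adapted
# projections per block, rank 8, two LoRA heads per projection.
lora_total = adapter_budget(num_blocks=24, layers_per_block=2,
                            d_model=1024, rank=8, num_heads=2)
full_total = 24 * 2 * 1024 * 1024  # the same layers fine-tuned in full

print(lora_total)               # 1572864
print(full_total)               # 50331648
print(full_total // lora_total) # 32
```

Under these assumed settings the adapters train 32 times fewer parameters than full fine-tuning of the same projections, which shows why a low-rank adapter can stay within a budget of a few million parameters.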

🔎 Similar Papers

Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture