Fine-Grained DINO Tuning with Dual Supervision for Face Forgery Detection

📅 2025-11-15
🤖 AI Summary
Deepfake detection must distinguish subtle, method-specific artifacts introduced by diverse generation techniques, yet existing approaches, often reduced to binary classification, lack sufficient discriminative power. Method: This paper proposes a dual-supervised fine-grained tuning framework. It adopts DINOv2 as the backbone and embeds a lightweight multi-head LoRA adapter in each Transformer block. A shared branch propagates fine-grained manipulation cues to the authenticity head, enabling joint optimization of authenticity assessment and forgery-type identification. Contribution/Results: The framework fine-tunes only 3.5 million trainable parameters. It matches or exceeds state-of-the-art (SOTA) detection accuracy on multiple mainstream benchmarks, including FaceForensics++, Celeb-DF, and DFDC, while improving parameter efficiency and cross-dataset generalization compared to existing complex models.

📝 Abstract
The proliferation of sophisticated deepfakes poses significant threats to information integrity. While DINOv2 shows promise for detection, existing fine-tuning approaches treat it as generic binary classification, overlooking distinct artifacts inherent to different deepfake methods. To address this, we propose a DeepFake Fine-Grained Adapter (DFF-Adapter) for DINOv2. Our method incorporates lightweight multi-head LoRA modules into every transformer block, enabling efficient backbone adaptation. DFF-Adapter simultaneously addresses authenticity detection and fine-grained manipulation type classification, where classifying forgery methods enhances artifact sensitivity. We introduce a shared branch propagating fine-grained manipulation cues to the authenticity head. This enables multi-task cooperative optimization, explicitly enhancing authenticity discrimination with manipulation-specific knowledge. Utilizing only 3.5M trainable parameters, our parameter-efficient approach achieves detection accuracy comparable to or even surpassing that of current complex state-of-the-art methods.
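The adapter idea in the abstract can be illustrated with a minimal NumPy sketch of a frozen linear layer augmented by several low-rank update heads. This is not the authors' implementation: the class name `MultiHeadLoRALinear`, the rank, the scaling factor `alpha`, and especially the choice to average the head outputs are assumptions, since the abstract does not say how the heads are combined.

```python
import numpy as np


class MultiHeadLoRALinear:
    """Frozen linear layer plus several low-rank (LoRA) update heads.

    Assumption: head outputs are averaged. The source does not specify
    how the heads are combined.
    """

    def __init__(self, d_in, d_out, rank=4, num_heads=2, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen "pretrained" weight (a random stand-in for a backbone weight).
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        # Trainable low-rank factors. B starts at zero, so the adapted
        # layer initially reproduces the frozen layer exactly.
        self.A = rng.standard_normal((num_heads, rank, d_in)) * 0.01
        self.B = np.zeros((num_heads, d_out, rank))
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, d_in).
        base = x @ self.W.T
        # Project down with A_h, then up with B_h, for every head h.
        low = np.einsum("bi,hri->hbr", x, self.A)      # (heads, batch, rank)
        delta = np.einsum("hbr,hor->hbo", low, self.B)  # (heads, batch, d_out)
        return base + self.scale * delta.mean(axis=0)

    def num_trainable(self):
        # Only the low-rank factors are counted; W stays frozen.
        return self.A.size + self.B.size


layer = MultiHeadLoRALinear(d_in=16, d_out=8, rank=2, num_heads=3)
x = np.ones((4, 16))
y = layer(x)
```

Because `B` is initialized to zero, `y` equals the frozen layer's output until training updates the low-rank factors, which is the usual LoRA starting point.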
Problem

Research questions and friction points this paper is trying to address.

Detects face forgeries by identifying distinct artifacts from different manipulation methods
Enhances authenticity discrimination through fine-grained manipulation type classification
Achieves high detection accuracy with parameter-efficient adaptation of DINOv2 backbone
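To give a feel for the scale behind the parameter-efficiency point, the arithmetic for counting LoRA parameters can be sketched as below. The configuration is hypothetical and chosen only for illustration: the source reports 3.5M trainable parameters but gives no breakdown, so the rank, number of heads, and number of adapted projections per block here are assumptions, as is the helper name `lora_param_count`.

```python
def lora_param_count(d_model, num_blocks, rank, num_heads, adapted_per_block):
    """Trainable parameters added by LoRA pairs on square projections.

    Each pair has A (rank x d_model) and B (d_model x rank).
    """
    per_pair = rank * d_model + d_model * rank
    return num_blocks * adapted_per_block * num_heads * per_pair


# Hypothetical configuration (not taken from the paper): 24 blocks of
# width 1024, rank 8, 2 LoRA heads, 4 adapted projections per block.
total = lora_param_count(
    d_model=1024, num_blocks=24, rank=8, num_heads=2, adapted_per_block=4
)
print(f"{total:,} trainable LoRA parameters")  # prints 3,145,728 trainable LoRA parameters
```

Under these assumed settings the adapters alone land in the low millions of parameters, the same order of magnitude as the figure quoted above; classification heads would add a small amount on top.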
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight LoRA modules adapt DINOv2 backbone
Dual supervision combines authenticity and manipulation classification
Shared branch propagates fine-grained cues for cooperative optimization
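The dual supervision and shared branch listed above can be sketched as a forward pass with two heads and a joint loss. This is a sketch under stated assumptions, not the paper's architecture: the parameter names (`W_shared`, `W_type`, `W_auth`), the `tanh` nonlinearity, the concatenation of shared cues with the backbone feature, the loss weight `lam`, and the count of five forgery-type classes are all hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(logits, labels):
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()


def dual_head_forward(feat, params):
    """feat: (batch, d) backbone features, e.g. from an adapted DINOv2."""
    # Shared branch (assumed design): carries fine-grained manipulation cues.
    shared = np.tanh(feat @ params["W_shared"])
    # Forgery-type head classifies the manipulation method.
    type_logits = shared @ params["W_type"]
    # Authenticity head sees the backbone feature plus the shared cues.
    auth_in = np.concatenate([feat, shared], axis=1)
    auth_logits = auth_in @ params["W_auth"]
    return auth_logits, type_logits


def dual_loss(auth_logits, type_logits, y_auth, y_type, lam=1.0):
    # Joint objective: real/fake loss plus weighted forgery-type loss.
    return cross_entropy(auth_logits, y_auth) + lam * cross_entropy(type_logits, y_type)


rng = np.random.default_rng(0)
d, d_shared, n_types = 32, 16, 5  # hypothetical sizes
params = {
    "W_shared": rng.standard_normal((d, d_shared)) * 0.1,
    "W_type": rng.standard_normal((d_shared, n_types)) * 0.1,
    "W_auth": rng.standard_normal((d + d_shared, 2)) * 0.1,
}
feat = rng.standard_normal((4, d))
auth_logits, type_logits = dual_head_forward(feat, params)
loss = dual_loss(auth_logits, type_logits, np.array([0, 1, 1, 1]), np.array([0, 1, 2, 3]))
```

Because the authenticity head consumes the output of the shared branch, gradients from both losses would flow through `W_shared` during training, which is one simple way to realize the cooperative optimization described above.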
Tianxiang Zhang
College of Cyber Security, Jinan University
Peipeng Yu
College of Cyber Security, Jinan University
Zhihua Xia
Jinan University
Digital Forensics
Longchen Dai
College of Cyber Security, Jinan University
Xiaoyu Zhou
Peking University
Computer Vision, Autonomous Driving, AI Security
Hui Gao
College of Cyber Security, Jinan University