🤖 AI Summary
This work addresses the large performance disparities that existing deepfake detection models show across demographic groups. Prevailing fairness approaches often rely on demographic labels, require model retraining, or compromise overall accuracy. The paper proposes Face-Fairness (FF), a plug-and-play calibration framework whose core method, Face-Feature Tuning (FFT), needs no demographic annotations. FFT keeps pre-trained face embeddings frozen and adds a lightweight logit remapping module conditioned on them. Two variants optimize worst-group accuracy: FF-Max when demographic labels are available, and FF-Discover when groups are discovered from the embeddings. The method adds negligible computational overhead and is compatible with any off-the-shelf detector. Experiments show that FF consistently reduces FPR/TPR gaps both in-domain and across datasets, improves worst-group performance, and maintains or even improves overall detection accuracy.
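To make the fairness quantities in this summary concrete, here is a minimal sketch of how per-group FPR/TPR gaps and worst-group accuracy could be computed. The function names (`group_rates`, `fairness_summary`), the label convention (1 = fake, 0 = real), and the max-minus-min definition of a gap are assumptions for illustration; the paper may define its metrics differently.

```python
import math
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group TPR, FPR, and accuracy for binary labels (1 = fake, 0 = real)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f" records whether the prediction was correct, "p"/"n" the predicted class.
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # number of fake samples in the group
        neg = c["fp"] + c["tn"]  # number of real samples in the group
        rates[g] = {
            "tpr": c["tp"] / pos if pos else float("nan"),
            "fpr": c["fp"] / neg if neg else float("nan"),
            "acc": (c["tp"] + c["tn"]) / (pos + neg),
        }
    return rates


def fairness_summary(rates):
    """Gap = max minus min across groups (an assumed definition); NaN rates are skipped."""
    def spread(name):
        vals = [r[name] for r in rates.values() if not math.isnan(r[name])]
        return max(vals) - min(vals)

    return {
        "fpr_gap": spread("fpr"),
        "tpr_gap": spread("tpr"),
        "worst_group_acc": min(r["acc"] for r in rates.values()),
    }


# Toy example with two hypothetical groups, "A" and "B".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["A"] * 4 + ["B"] * 4
print(fairness_summary(group_rates(y_true, y_pred, groups)))
```

In this toy data the detector is perfect on group "A" and gets half of group "B" wrong, so both gaps and the worst-group accuracy come out at 0.5.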
📝 Abstract
Deepfake detectors show large performance gaps across demographic groups. Existing fairness approaches require demographic labels, require retraining, or sacrifice accuracy. We introduce Face-Fairness (FF), a plug-and-play framework for bias mitigation. Our primary contribution, Face-Feature Tuning (FFT), is the first fairness method demonstrated for deepfake detection that needs no demographic labels: a lightweight calibrator that remaps the detector's logits conditioned on frozen face embeddings. We complement FFT with two variants: FF-Max, which maximizes worst-group accuracy when demographic labels are available, and FF-Discover, which does the same with embedding-discovered groups. Across in-domain and cross-dataset test settings, FF consistently reduces FPR/TPR gaps and improves minimum group accuracy while maintaining (and often improving) overall accuracy. The approach is detector-agnostic, adds negligible runtime overhead, and requires no access to identity attributes.
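The abstract does not give the calibrator's functional form, so the following is only a sketch of what a logit remapping conditioned on frozen embeddings might look like. It assumes an affine map z' = (1 + w_scale · e) z + (w_shift · e + b), trained with plain gradient descent on binary cross-entropy. The class name `AffineLogitCalibrator`, the affine form, the loss, and the toy data are all hypothetical and are not taken from the paper; only the calibrator's parameters are updated, while detector logits and embeddings are treated as fixed inputs.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


class AffineLogitCalibrator:
    """Hypothetical remapping: z' = (1 + w_scale . e) * z + (w_shift . e + b)."""

    def __init__(self, dim):
        # Zero initialisation makes the calibrator start as the identity map.
        self.w_scale = [0.0] * dim
        self.w_shift = [0.0] * dim
        self.b = 0.0

    def remap(self, logit, emb):
        scale = 1.0 + sum(w * x for w, x in zip(self.w_scale, emb))
        shift = self.b + sum(w * x for w, x in zip(self.w_shift, emb))
        return scale * logit + shift

    def loss(self, logits, embs, labels):
        """Mean binary cross-entropy of the remapped logits (1 = fake)."""
        total = 0.0
        for z, e, y in zip(logits, embs, labels):
            zc = self.remap(z, e)
            total += softplus(zc) - y * zc
        return total / len(labels)

    def fit(self, logits, embs, labels, lr=0.1, epochs=200):
        """Full-batch gradient descent on the calibrator parameters only."""
        n, dim = len(labels), len(self.w_scale)
        for _ in range(epochs):
            g_scale, g_shift, g_b = [0.0] * dim, [0.0] * dim, 0.0
            for z, e, y in zip(logits, embs, labels):
                err = sigmoid(self.remap(z, e)) - y  # dBCE / dz'
                for j, x in enumerate(e):
                    g_scale[j] += err * z * x
                    g_shift[j] += err * x
                g_b += err
            for j in range(dim):
                self.w_scale[j] -= lr * g_scale[j] / n
                self.w_shift[j] -= lr * g_shift[j] / n
            self.b -= lr * g_b / n
        return self


# Toy data: the second embedding cluster receives inflated "fake" logits.
logits = [2.0, 1.5, -1.5, -2.0, 4.0, 3.5, 0.8, 0.4]
embs = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
labels = [1, 1, 0, 0, 1, 1, 0, 0]

cal = AffineLogitCalibrator(dim=2)
before = cal.loss(logits, embs, labels)
after = cal.fit(logits, embs, labels).loss(logits, embs, labels)
print(f"BCE before: {before:.3f}, after: {after:.3f}")
```

Starting from the identity map means the calibrator leaves the detector's outputs unchanged until training moves it, which is one plausible way to keep overall accuracy from degrading. Note that no group labels appear anywhere in the fit: the only side information is the embedding, which matches the label-free setting the abstract describes for FFT.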