Fair-FLIP: Fair Deepfake Detection with Fairness-Oriented Final Layer Input Prioritising

📅 2025-07-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deepfake detection models frequently exhibit significant demographic bias, particularly across race and gender subgroups, undermining fairness. To address this, we propose Fair-FLIP, a plug-and-play post-processing fairness method. It analyses the variability of each final-layer input across demographic subgroups and reweights those inputs, prioritising low-variability dimensions and demoting highly variable ones, thereby suppressing bias amplification. Fair-FLIP requires no architectural modification or model retraining and is compatible with mainstream deepfake detectors. Experiments across multiple benchmark datasets show that Fair-FLIP incurs only a marginal 0.25% drop in overall detection accuracy while improving key fairness metrics, including equal opportunity difference and mean absolute error difference, by up to 30%. To our knowledge, this is the first approach to systematically mitigate group-level unfairness in deepfake detection without compromising strong detection performance.

📝 Abstract
Artificial Intelligence-generated content has become increasingly popular, yet its malicious use, particularly deepfakes, poses a serious threat to public trust and discourse. While deepfake detection methods achieve high predictive performance, they often exhibit biases across demographic attributes such as ethnicity and gender. In this work, we tackle the challenge of fair deepfake detection, aiming to mitigate these biases while maintaining robust detection capabilities. To this end, we propose a novel post-processing approach, referred to as Fairness-Oriented Final Layer Input Prioritising (Fair-FLIP), that reweights a trained model's final-layer inputs to reduce subgroup disparities, prioritising those with low variability while demoting highly variable ones. Experimental results comparing Fair-FLIP to both the baseline (without fairness-oriented de-biasing) and state-of-the-art approaches show that Fair-FLIP can enhance fairness metrics by up to 30% while maintaining baseline accuracy, with only a negligible reduction of 0.25%. Code is available on GitHub: https://github.com/szandala/fair-deepfake-detection-toolbox
Problem

Research questions and friction points this paper is trying to address.

Mitigate biases in deepfake detection across demographics
Maintain robust detection while enhancing fairness metrics
Reduce subgroup disparities with Fair-FLIP post-processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fair-FLIP reweights final-layer inputs for fairness
Reduces subgroup disparities with low variability prioritization
Maintains accuracy while improving fairness metrics by up to 30%
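The core idea, reweighting final-layer inputs by their variability across demographic subgroups, can be sketched as follows. This is a hypothetical reconstruction based only on the abstract, not the authors' released code: the function name `flip_weights`, the inverse-variability weighting rule, and the mean-1 normalisation are all assumptions.

```python
import numpy as np

def flip_weights(features, groups, eps=1e-8):
    """Fair-FLIP-style sketch: for each final-layer input dimension,
    measure how much its per-subgroup mean varies across subgroups,
    then down-weight highly variable dimensions and prioritise stable
    ones. (Hypothetical reconstruction, not the authors' algorithm.)"""
    features = np.asarray(features, dtype=float)
    groups = np.asarray(groups)
    # Mean activation of each feature dimension within each subgroup.
    subgroup_means = np.stack(
        [features[groups == g].mean(axis=0) for g in np.unique(groups)]
    )
    # Cross-subgroup variability of each dimension.
    variability = subgroup_means.var(axis=0)
    # Inverse-variability weights, normalised to mean 1 so the overall
    # activation scale fed to the frozen final layer is roughly preserved.
    w = 1.0 / (variability + eps)
    return w / w.mean()

# Usage: rescale penultimate-layer activations before the final classifier.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 8))   # 100 samples, 8 final-layer inputs
grp = rng.integers(0, 2, size=100)  # binary demographic attribute
w = flip_weights(feats, grp)
reweighted = feats * w              # pass these into the unchanged final layer
```

Because the weights are applied post hoc to a trained model's activations, this kind of scheme needs no retraining, matching the paper's plug-and-play claim.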