Measuring Database Unfairness via Dependency Quantification Under Differential Privacy

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the challenge of evaluating dataset fairness and reliability under differential privacy (DP) constraints. It introduces a formal framework for quantifying data unfairness under DP, grounded in three desiderata: positivity, monotonicity, and DP computability. The framework is instantiated through three complementary unfairness measures: a mutual information-based measure with a total variation distance proxy suitable for DP, a data repair-based measure approximated via a reduction to weighted MaxSAT, and a top-k tuple contribution measure that isolates the records most influential in fairness violations. Privacy-preserving algorithms are developed for each measure and analyzed for sensitivity, accuracy, and efficiency. Experiments on multiple real-world datasets indicate that the proposed measures faithfully approximate their non-private counterparts, quantify bias under privacy constraints, and provide insights for data management.
📝 Abstract
Differential privacy (DP) has become the de facto standard for protecting sensitive data, providing strong guarantees that published statistics or models reveal limited information about any individual. However, privacy noise and restricted data access make it increasingly difficult to assess the fairness and reliability of private datasets. In this paper, we propose a formal framework for quantifying data unfairness under DP. We identify three core desiderata for unfairness measures based on previous work: positivity, monotonicity, and DP computability. We further instantiate them through three complementary measures: (1) a mutual information-based measure with a total variation distance proxy suitable for DP, (2) a data repair-based measure approximated via a reduction to weighted MaxSAT, and (3) a top-$k$ tuple contribution measure that isolates the most influential records in fairness violations. We design privacy-preserving algorithms and analyze their sensitivity, accuracy, and efficiency. Extensive experiments on multiple real-world datasets demonstrate that our proposed measures faithfully approximate their non-private counterparts, effectively quantify bias under privacy constraints, and provide insights for data management.
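To make the first measure more concrete, here is a minimal sketch of a total variation (TV) dependence score between a sensitive attribute and an outcome, with a simple private variant. It is not the paper's algorithm: the function names, the contingency-table input, and the choice of the Laplace mechanism on cell counts (add/remove-one-record neighbors, L1 sensitivity 1) are all assumptions made for illustration.

```python
import numpy as np


def tv_dependence(counts):
    """TV distance between the empirical joint distribution of two
    attributes and the product of its marginals (0 means independent).

    `counts` is a 2-D contingency table: rows index the sensitive
    attribute, columns index the outcome (hypothetical layout).
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total <= 0:
        return 0.0  # empty table: report no measurable dependence
    joint = counts / total
    p_rows = joint.sum(axis=1, keepdims=True)  # marginal of the row attribute
    p_cols = joint.sum(axis=0, keepdims=True)  # marginal of the column attribute
    return float(0.5 * np.abs(joint - p_rows * p_cols).sum())


def dp_tv_dependence(counts, epsilon, rng=None):
    """Illustrative epsilon-DP estimate of `tv_dependence`.

    Assumption: neighboring datasets differ by adding or removing one
    record, so the vector of cell counts has L1 sensitivity 1 and
    Laplace noise with scale 1/epsilon per cell suffices. Everything
    after the noise is added is post-processing.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0.0, None)  # negative counts are not meaningful
    return tv_dependence(noisy)


if __name__ == "__main__":
    table = [[400, 100], [150, 350]]  # made-up counts, for illustration only
    print("non-private TV:", tv_dependence(table))
    print("private TV (epsilon=1):", dp_tv_dependence(table, epsilon=1.0))
```

Perturbing the counts once and deriving the score from the noisy table keeps the privacy argument simple, at the cost of some bias on small tables because clipping pushes empty cells upward. As general background rather than a claim about the paper's analysis, Pinsker's inequality gives I(S;Y) >= 2 * TV^2 (in nats) for this TV quantity, so a large TV score implies non-trivial mutual information.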
Problem

Research questions and friction points this paper is trying to address.

differential privacy
data unfairness
fairness measurement
database bias
privacy-preserving
Innovation

Methods, ideas, or system contributions that make the work stand out.

differential privacy
fairness quantification
mutual information
data repair
top-k contribution