Information-Theoretic Fairness with a Bounded Statistical Parity Constraint

📅 2025-08-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates information-theoretic modeling of fair representation learning: maximizing task relevance $I(Y;T)$ subject to a bounded statistical parity constraint ($I(Y;S)\leq\epsilon$) and a compression constraint ($I(Y;X)\leq r$). The authors propose an analytical framework based on extended versions of the Functional Representation Lemma and the Strong Functional Representation Lemma, enabling tight upper and lower bounds on the achievable utility. Crucially, they establish that permitting controlled information leakage ($\epsilon>0$) can strictly improve representation utility, thereby relaxing the restrictive perfect-fairness assumption. By incorporating randomized representation mechanisms and refined mutual information bounds, the trade-off between utility, leakage, and rate is characterized explicitly: the analysis yields both upper bounds and constructive lower bounds for fair representations, which are studied and compared in a numerical example.

📝 Abstract
In this paper, we study an information-theoretic problem of designing a fair representation that attains bounded statistical (demographic) parity. More specifically, an agent uses some useful data $X$ to solve a task $T$. Since both $X$ and $T$ are correlated with some sensitive attribute or secret $S$, the agent designs a representation $Y$ that satisfies a bounded statistical parity and/or privacy leakage constraint, that is, such that $I(Y;S) \leq \epsilon$. Here, we relax the perfect demographic (statistical) parity and consider a bounded-parity constraint. In this work, we design the representation $Y$ that maximizes the mutual information $I(Y;T)$ about the task while satisfying a bounded compression (or encoding rate) constraint, that is, ensuring that $I(Y;X) \leq r$. Simultaneously, $Y$ satisfies the bounded statistical parity constraint $I(Y;S) \leq \epsilon$. To design $Y$, we use extended versions of the Functional Representation Lemma and the Strong Functional Representation Lemma which are based on randomization techniques and study the tightness of the obtained bounds in special cases. The main idea to derive the lower bounds is to use randomization over useful data $X$ or sensitive data $S$. Considering perfect demographic parity, i.e., $\epsilon=0$, we improve the existing results (lower bounds) by using a tighter version of the Strong Functional Representation Lemma and propose new upper bounds. We then propose upper and lower bounds for the main problem and show that allowing non-zero leakage can improve the attained utility. Finally, we study the bounds and compare them in a numerical example. The problem studied in this paper can also be interpreted as one of code design with bounded leakage and bounded rate privacy considering the sensitive attribute as a secret.
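The quantities in the abstract are all plain mutual informations over a joint distribution, so the trade-off is easy to illustrate numerically. The sketch below is not the paper's construction (which relies on the Functional Representation Lemmas); it only evaluates $I(Y;T)$, $I(Y;S)$, and $I(Y;X)$ for a toy binary joint distribution $p(s,x,t)$ (hypothetical numbers, chosen for illustration) and a representation $Y$ produced by a channel $p(y\mid x)$, showing how randomizing over $X$ lowers both the leakage $I(Y;S)$ and the rate $I(Y;X)$ at the cost of some task information $I(Y;T)$.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in bits from a joint distribution matrix p_ab[a, b]."""
    pa = p_ab.sum(axis=1, keepdims=True)          # marginal p(a)
    pb = p_ab.sum(axis=0, keepdims=True)          # marginal p(b)
    mask = p_ab > 0                               # skip zero-probability cells
    return float((p_ab[mask] * np.log2(p_ab[mask] / (pa @ pb)[mask])).sum())

# Toy joint distribution p(s, x, t): S, X, T binary and mutually correlated.
# (Hypothetical values; any valid joint distribution would do.)
p_sxt = np.array([
    [[0.20, 0.05], [0.05, 0.10]],   # s = 0
    [[0.10, 0.05], [0.05, 0.40]],   # s = 1
])
assert np.isclose(p_sxt.sum(), 1.0)

def evaluate(W):
    """Given a channel W[x, y] = p(y | x), return (I(Y;T), I(Y;S), I(Y;X)).

    Y depends on (S, X, T) only through X, matching the representation model.
    """
    p_sxty = p_sxt[..., None] * W[None, :, None, :]   # p(s, x, t, y)
    p_ty = p_sxty.sum(axis=(0, 1))                    # joint of (T, Y)
    p_sy = p_sxty.sum(axis=(1, 2))                    # joint of (S, Y)
    p_xy = p_sxty.sum(axis=(0, 2))                    # joint of (X, Y)
    return (mutual_information(p_ty),
            mutual_information(p_sy),
            mutual_information(p_xy))

# Deterministic representation Y = X versus a randomized one (binary
# symmetric channel with crossover 0.2): randomization reduces the
# leakage I(Y;S) and the rate I(Y;X), but also the utility I(Y;T).
for name, W in [("Y = X", np.eye(2)),
                ("randomized", np.array([[0.8, 0.2], [0.2, 0.8]]))]:
    iyt, iys, iyx = evaluate(W)
    print(f"{name:11s}  I(Y;T)={iyt:.3f}  I(Y;S)={iys:.3f}  I(Y;X)={iyx:.3f}")
```

Sweeping the crossover probability traces out a utility-leakage curve; the paper's bounds characterize the best achievable $I(Y;T)$ over all such randomized mechanisms for given $\epsilon$ and $r$.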
Problem

Research questions and friction points this paper is trying to address.

Designing fair data representations with bounded statistical parity constraints
Maximizing task information under compression and privacy leakage limits
Balancing utility and fairness using information-theoretic randomization techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bounded statistical parity constraint via information theory
Randomization techniques using Functional Representation Lemmas
Maximizing task mutual information under compression constraints
Amirreza Zamani
Division of Information Science and Engineering, KTH Royal Institute of Technology

Abolfazl Changizi
Division of Information Science and Engineering, KTH Royal Institute of Technology

Ragnar Thobaben
Professor, KTH Royal Institute of Technology
coding theory, information theory, communication theory, signal processing, machine learning

Mikael Skoglund
KTH Royal Institute of Technology
Information Theory, Communications, Signal Processing