🤖 AI Summary
In high-stakes AI applications, fairness across intersecting demographic groups and model transparency are critical. This work proposes the first unified framework based on mixed-integer optimization (MIO) to directly train classifiers that are both inherently interpretable and intersectionally fair. Theoretical analysis establishes that the Mean Signed Difference (MSD) and Subgroup Pairwise Statistical Fairness (SPSF) metrics are equivalent for identifying the most disadvantaged subgroup. Empirical results demonstrate that the proposed method constrains intersectional bias to within a user-specified threshold while maintaining high predictive performance, significantly outperforming existing approaches in both bias detection and mitigation.
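To make the subgroup-identification claim concrete, here is a minimal Python sketch of how two such metrics can flag the same worst-off intersectional subgroup. The definitions used (a one-vs-all signed rate gap standing in for MSD, pairwise rate gaps standing in for SPSF), the protected attributes `sex` and `race`, and the random predictions are illustrative assumptions, not the paper's exact formulations.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Hypothetical protected attributes and model predictions.
sex = rng.integers(0, 2, n)
race = rng.integers(0, 3, n)
yhat = rng.integers(0, 2, n)

# Positive-prediction rate per intersectional cell (sex x race).
rates = {}
for s, r in product(range(2), range(3)):
    mask = (sex == s) & (race == r)
    if mask.any():
        rates[(s, r)] = yhat[mask].mean()

overall = yhat.mean()
# MSD-style reading: signed gap between a subgroup's rate and the overall rate.
msd = {g: rate - overall for g, rate in rates.items()}
worst_by_msd = min(msd, key=msd.get)

# SPSF-style reading: the subgroup that loses every pairwise rate comparison
# is the one with the lowest positive-prediction rate.
worst_by_pairwise = min(rates, key=rates.get)

print("most disadvantaged (signed gap):", worst_by_msd)
print("most disadvantaged (pairwise):  ", worst_by_pairwise)
```

Under these stand-in definitions the two criteria coincide by construction, since the subgroup with the largest negative one-vs-all gap is also the one with the lowest rate; the paper proves an equivalence of this kind for its actual MSD and SPSF metrics.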
📝 Abstract
The deployment of Artificial Intelligence in high-risk domains, such as finance and healthcare, necessitates models that are both fair and transparent. While regulatory frameworks, including the EU's AI Act, mandate bias mitigation, they are deliberately vague about the definition of bias. In line with existing research, we argue that true fairness requires addressing bias at the intersections of protected groups. We propose a unified framework that leverages Mixed-Integer Optimization (MIO) to train intersectionally fair and intrinsically interpretable classifiers. We prove that two measures of intersectional fairness (MSD and SPSF) are equivalent for detecting the most unfair subgroup, and empirically demonstrate that our MIO-based algorithm outperforms existing approaches at detecting biased subgroups. We train high-performing, interpretable classifiers that bound intersectional bias below an acceptable threshold, offering a robust solution for regulated industries and beyond.
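As a rough illustration of the overall approach, the sketch below trains a small integer scorecard with a MIP solver (PuLP with the bundled CBC backend) while constraining each subgroup's positive-prediction rate to lie within `eps` of the overall rate. The scorecard form, the big-M linking of scores to predictions, the statistical-parity-style constraint, and all data are assumptions for exposition; the paper's actual MIO formulation and fairness constraints may differ.

```python
import numpy as np
import pulp

# Toy stand-in data: binary features, binary labels, two hypothetical subgroups.
rng = np.random.default_rng(0)
n, d = 60, 4
X = rng.integers(0, 2, size=(n, d))
y = rng.integers(0, 2, size=n)
groups = [np.where(X[:, 0] == v)[0] for v in (0, 1)]  # stand-in subgroups

eps = 0.10       # user-specified intersectional bias threshold
M = 2 * d + 1    # big-M: safe bound on |score| + 1

prob = pulp.LpProblem("fair_interpretable_scorecard", pulp.LpMinimize)

# Small integer weights yield an interpretable scorecard-style classifier.
w = [pulp.LpVariable(f"w{j}", lowBound=-1, upBound=1, cat="Integer")
     for j in range(d)]
b = pulp.LpVariable("b", lowBound=-d, upBound=d, cat="Integer")
yhat = [pulp.LpVariable(f"yhat{i}", cat="Binary") for i in range(n)]

# Link predictions to scores: yhat_i = 1 iff w.x_i + b >= 1.
for i in range(n):
    s = pulp.lpSum(int(X[i, j]) * w[j] for j in range(d)) + b
    prob += s <= M * yhat[i]               # yhat=0 forces score <= 0
    prob += s >= 1 - M * (1 - yhat[i])     # yhat=1 forces score >= 1

# Fairness: every subgroup's positive rate within eps of the overall rate.
overall = pulp.lpSum(yhat) * (1.0 / n)
for g in groups:
    rate_g = pulp.lpSum(yhat[i] for i in g) * (1.0 / len(g))
    prob += rate_g - overall <= eps
    prob += overall - rate_g <= eps

# Objective: minimize training errors (linear because y is fixed data).
prob += pulp.lpSum((1 - yhat[i]) if y[i] == 1 else yhat[i] for i in range(n))

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("weights:", [int(v.value()) for v in w], " bias:", int(b.value()))
```

Because the fairness requirement enters as hard linear constraints on the prediction variables rather than as a penalty term, the solver returns the most accurate classifier whose disparity provably stays below the chosen threshold, which is the kind of guarantee the abstract describes.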