Machine Learning with Multitype Protected Attributes: Intersectional Fairness through Regularisation

📅 2025-09-09
🤖 AI Summary
This work addresses intersectional fairness in regression and classification under multiple protected attributes (e.g., gender, race, age). We propose a regularisation framework grounded in distance covariance. To handle several protected attributes at once, the framework uses two multivariate dependence measures: the previously proposed joint distance covariance (JdCov) and a novel concatenated distance covariance (CCdCov). Both accommodate combinations of continuous and discrete attributes and capture nonlinear dependence between predictions and protected attributes. To calibrate regularisation strength, we use Jensen-Shannon divergence to quantify disparities in prediction distributions across groups, including intersectional subgroups (e.g., African-American women, Hispanic men). Experiments on the COMPAS recidivism dataset and a large motor insurance claims dataset show an average 37.2% reduction in intersectional bias while preserving predictive accuracy. The framework is scalable and theoretically grounded in distance-based dependence measures.

📝 Abstract
Ensuring equitable treatment (fairness) across protected attributes (such as gender or ethnicity) is a critical issue in machine learning. Most existing literature focuses on binary classification, but achieving fairness in regression tasks, such as insurance pricing or hiring score assessments, is equally important. Moreover, anti-discrimination laws also apply to continuous attributes, such as age, for which many existing methods are not applicable. In practice, multiple protected attributes can exist simultaneously; however, methods targeting fairness across several attributes often overlook so-called "fairness gerrymandering", thereby ignoring disparities among intersectional subgroups (e.g., African-American women or Hispanic men). In this paper, we propose a distance covariance regularisation framework that mitigates the association between model predictions and protected attributes, in line with the fairness definition of demographic parity, and that captures both linear and nonlinear dependencies. To enhance applicability in the presence of multiple protected attributes, we extend our framework by incorporating two multivariate dependence measures based on distance covariance: the previously proposed joint distance covariance (JdCov) and our novel concatenated distance covariance (CCdCov), which effectively address fairness gerrymandering in both regression and classification tasks involving protected attributes of various types. We discuss and illustrate how to calibrate regularisation strength, including a method based on Jensen-Shannon divergence, which quantifies dissimilarities in prediction distributions across groups. We apply our framework to the COMPAS recidivism dataset and a large motor insurance claims dataset.
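
To make the idea of a distance covariance penalty concrete, here is a minimal sketch using the standard biased (V-statistic) sample estimator of squared distance covariance. The function names, the mean-squared-error base loss, and the weight `lam` are illustrative assumptions, not the paper's implementation; the JdCov and CCdCov variants are not reproduced here.

```python
import numpy as np

def _centered_distances(x):
    """Pairwise Euclidean distance matrix of x, double-centred."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n observations of one variable
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_covariance_sq(x, y):
    """Biased (V-statistic) sample estimate of squared distance covariance."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    return float((a * b).mean())

def penalised_loss(y_true, y_pred, protected, lam):
    """Hypothetical objective: MSE plus lam * dCov^2(y_pred, protected)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = float(np.mean((y_true - y_pred) ** 2))
    return mse + lam * distance_covariance_sq(y_pred, protected)
```

Because the estimator works on pairwise distances, `protected` can be a continuous variable such as age or a numerically encoded matrix of several attributes. A prediction that depends on the protected attribute only nonlinearly (for example, through its square) still yields a positive penalty.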
Problem

Research questions and friction points this paper is trying to address.

Addressing fairness in regression tasks with continuous protected attributes
Mitigating intersectional subgroup disparities and fairness gerrymandering
Handling multiple protected attributes with linear and nonlinear dependencies
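
A toy illustration of fairness gerrymandering, under assumed synthetic data: two binary protected attributes `a` and `b` and a made-up score that equals 1 only when exactly one of them is 1. The helper `max_gap` is hypothetical and simply compares mean predictions across groups.

```python
import numpy as np

# Hypothetical balanced population over two binary protected attributes.
a = np.array([0, 0, 1, 1] * 25)
b = np.array([0, 1, 0, 1] * 25)
pred = (a != b).astype(float)  # toy score: 1 when exactly one attribute is 1

def max_gap(pred, groups):
    """Largest difference in mean prediction between any two groups."""
    means = [pred[groups == g].mean() for g in np.unique(groups)]
    return max(means) - min(means)

gap_a = max_gap(pred, a)              # marginal gap across attribute a
gap_b = max_gap(pred, b)              # marginal gap across attribute b
gap_joint = max_gap(pred, 2 * a + b)  # gap across the four intersectional subgroups
print(gap_a, gap_b, gap_joint)
```

Both marginal gaps are zero, so the score looks fair when each attribute is checked separately, yet the gap across intersectional subgroups is 1. This is the kind of disparity a per-attribute criterion misses.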
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distance covariance regularization for fairness
Multivariate dependence measures prevent gerrymandering
Jensen-Shannon divergence calibrates regularization strength
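
As a rough sketch of the quantity involved in the last point, the following computes the Jensen-Shannon divergence between histograms of predictions for two groups. The histogram binning, the group labels 0 and 1, and the function names are assumptions made for illustration; how the paper turns this divergence into a calibration rule for the regularisation strength is not specified here.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) of two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(u, v):
        mask = u > 0  # terms with u == 0 contribute nothing
        return float(np.sum(u[mask] * np.log2(u[mask] / v[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def group_js(pred, groups, bins=20):
    """JS divergence between prediction histograms of groups labelled 0 and 1."""
    pred = np.asarray(pred, dtype=float)
    groups = np.asarray(groups)
    edges = np.histogram_bin_edges(pred, bins=bins)  # shared bins for both groups
    h0, _ = np.histogram(pred[groups == 0], bins=edges)
    h1, _ = np.histogram(pred[groups == 1], bins=edges)
    return js_divergence(h0, h1)
```

A value of 0 means the two groups have identical binned prediction distributions, and 1 means the distributions do not overlap. One possible use, stated as an assumption, is to refit the model over a grid of regularisation weights and track how this value falls as the weight increases.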
Ho Ming Lee
Centre for Actuarial Studies, Department of Economics, University of Melbourne, Australia; Faculty of Economics and Business, KU Leuven, Belgium
Katrien Antonio
Faculty of Economics and Business, KU Leuven, Belgium; Faculty of Economics and Business, University of Amsterdam, The Netherlands; LRisk, Leuven Research Center on Insurance and Financial Risk Analysis, KU Leuven, Belgium; LStat, Leuven Statistics Research Centre, KU Leuven, Belgium
Benjamin Avanzi
Professor of Actuarial Studies, University of Melbourne
Actuarial Science, Risk Theory, Dependence modelling, Pensions, Risk Modelling in Operations Management
Lorenzo Marchi
Faculty of Economics and Business, KU Leuven, Belgium; LRisk, Leuven Research Center on Insurance and Financial Risk Analysis, KU Leuven, Belgium
Rui Zhou
Centre for Actuarial Studies, Department of Economics, University of Melbourne, Australia