🤖 AI Summary
Record matching models produce score distributions that differ systematically across demographic groups (e.g., gender, race), and these disparities compromise downstream fairness at every decision threshold. To address this, we propose a post-hoc score calibration framework grounded in regression fairness principles, extending such fairness metrics to full-threshold analysis in record matching for the first time. Our approach is model-agnostic: it uses optimal transport and the Wasserstein barycenter to recalibrate matching scores without relying on the model's internal structure or training data. Because this calibration targets Demographic Parity (DP) and is limited in how far it reduces Equal Opportunity (EO) and Equalized Odds (EOD) differences, we also introduce a conditional calibration mechanism aimed at those two metrics. Evaluated on multiple benchmark datasets and state-of-the-art matching models, our method significantly reduces the DP difference, and the conditional variant also reduces the EO and EOD differences. The framework delivers an interpretable, general-purpose, and computationally efficient solution to group fairness in record matching.
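To make "fairness across all decision thresholds" concrete, the sketch below sweeps a grid of thresholds and records the gap in positive-match rates between two groups at each one. It is an illustration, not the paper's metric definition: the function name, the two-group restriction, and the uniform threshold grid on [0, 1] are assumptions. With a fine uniform grid, the average gap approximates the integral of the absolute difference between the two groups' score CDFs, which is the one-dimensional Wasserstein-1 distance between their score distributions.

```python
import numpy as np

def positive_rate_gap_over_thresholds(scores, groups, thresholds=None):
    """Gap in P(score >= t) between two groups, for each threshold t.

    Hypothetical helper for illustration; assumes scores lie in [0, 1]
    and that `groups` holds exactly two distinct labels.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two groups")
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)  # assumed uniform grid
    thresholds = np.asarray(thresholds, dtype=float)

    # rates[g, k] = fraction of group g whose score is >= thresholds[k]
    rates = np.stack([
        (scores[groups == g][:, None] >= thresholds[None, :]).mean(axis=0)
        for g in labels
    ])
    gap_per_threshold = np.abs(rates[0] - rates[1])
    # The mean over the grid summarizes the gap across all thresholds.
    return thresholds, gap_per_threshold, gap_per_threshold.mean()
```

A single-threshold DP difference corresponds to reading one entry of `gap_per_threshold`; the averaged value is what can stay large even when the gap at one chosen threshold happens to be small.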
📝 Abstract
Record matching, the task of identifying records that correspond to the same real-world entities across databases, is critical for data integration in domains like healthcare, finance, and e-commerce. While traditional record matching models focus on optimizing accuracy, fairness issues, such as demographic disparities in model performance, have attracted increasing attention. Biased outcomes in record matching can result in unequal error rates across demographic groups, raising ethical and legal concerns. Existing research primarily addresses fairness at specific decision thresholds, using bias metrics like Demographic Parity (DP), Equal Opportunity (EO), and Equalized Odds (EOD) differences. However, threshold-specific metrics may overlook cumulative biases across varying thresholds. In this paper, we adapt fairness metrics traditionally applied in regression models to evaluate cumulative bias across all thresholds in record matching. We propose a novel post-processing calibration method, leveraging optimal transport theory and Wasserstein barycenters, to balance matching scores across demographic groups. This approach treats any matching model as a black box, making it applicable to a wide range of models without access to their training data. Our experiments demonstrate the effectiveness of the calibration method in reducing the demographic parity difference in matching scores. To address its limitations in reducing the EOD and EO differences, we introduce a conditional calibration method, which empirically achieves fairness across widely used benchmarks and state-of-the-art matching methods. This work provides a comprehensive framework for fairness-aware record matching, setting the foundation for more equitable data integration processes.
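The barycenter-based calibration can be illustrated with a short sketch. It relies on a standard fact about one-dimensional distributions: the quantile function of the Wasserstein-2 barycenter is the weighted average of the groups' quantile functions. The abstract does not give the paper's exact procedure, so everything specific here is an assumption: the function name `barycenter_calibrate`, one categorical group label per record pair, weights proportional to group size, and a mid-rank empirical CDF. Only the black-box scores and group labels are used, in keeping with the post-processing setting.

```python
import numpy as np

def barycenter_calibrate(scores, groups):
    """Map each group's scores onto a common barycenter distribution.

    Hypothetical sketch: a score at quantile level u of its own group is
    replaced by the size-weighted average of all groups' u-quantiles.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()  # assumed: weights = group proportions
    sorted_by_group = {g: np.sort(scores[groups == g]) for g in labels}

    calibrated = np.empty_like(scores)
    for g in labels:
        mask = groups == g
        own = sorted_by_group[g]
        # Mid-rank empirical CDF level of each score within its own group.
        left = np.searchsorted(own, scores[mask], side="left")
        right = np.searchsorted(own, scores[mask], side="right")
        u = (left + right) / (2.0 * own.size)
        # Barycenter quantile: weighted average of every group's quantile at u.
        calibrated[mask] = sum(
            w * np.quantile(sorted_by_group[h], u)
            for h, w in zip(labels, weights)
        )
    return calibrated
```

Within each group the mapping is monotone, so the ranking of candidate pairs inside a group is unchanged; only the scale on which groups are compared moves. Equalizing the score distributions in this way targets the demographic parity gap at every threshold, but it does not by itself equalize error rates among true matches or true non-matches.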