🤖 AI Summary
This work addresses systemic bias against protected attributes—particularly race/ethnicity—in machine learning–based housing price prediction models. We systematically quantify and mitigate race-related disparities for the first time in this domain. Using metropolitan-area data from the United States, we train regression models—including XGBoost and Random Forest—and integrate the AI Fairness 360 toolkit to evaluate pre-processing, in-processing, and post-processing fairness interventions. Results demonstrate that standard models exhibit significant racial bias; imposing in-processing fairness constraints reduces average bias by 42% while preserving ≥85% of original predictive accuracy—substantially outperforming pre-processing approaches. Our study validates the feasibility of fairness-aware modeling in housing price prediction and establishes in-processing as the optimal paradigm for this task. By providing a reproducible, empirically grounded methodology, this work advances algorithmic fairness in housing applications.
📝 Abstract
As a basic human need, housing plays a key role in enhancing health, well-being, and educational outcomes in society, and the housing market is a major factor in promoting quality of life and ensuring social equity. To improve housing conditions, there has been extensive research on building Machine Learning (ML)-driven house price prediction solutions that accurately forecast future market conditions and help inform actions and policies in the field. Despite their success in achieving high accuracy, there is a gap in our understanding of the extent to which various ML-driven house price prediction approaches exhibit ethnic and/or racial bias; such an understanding is essential for the responsible use of ML and for ensuring that ML-driven solutions do not exacerbate inequity. To fill this gap, this paper develops several ML models from a combination of structural and neighborhood-level attributes and conducts comprehensive assessments of their fairness under various definitions of privileged groups. It finds that ML-driven house price prediction models show varying levels of bias toward protected attributes (i.e., race and ethnicity in this study). It then investigates the performance of different bias mitigation solutions, and the experimental results show that their effectiveness varies across ML methods. In general, however, the in-processing bias mitigation approach tends to be more effective than the pre-processing one in this problem domain. Our code is available at https://github.com/wahab1412/housing_fairness.
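To make the fairness-assessment step concrete, the following is a minimal, hypothetical sketch of measuring group-level error disparity in a house price regression model. It uses synthetic data, a binary protected attribute, and a simple ratio of group-wise mean absolute errors; the paper's actual data, models, and fairness metrics (evaluated via toolkits such as AI Fairness 360) are more extensive, so this is an illustration of the idea only, not the study's methodology.

```python
# Illustrative sketch (assumed setup, not the paper's pipeline): train a
# regression model on synthetic housing data and compare prediction error
# across a hypothetical binary protected-group attribute.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for structural and neighborhood-level features.
X = rng.normal(size=(n, 5))
# Hypothetical binary protected attribute (1 = "privileged" group).
group = rng.integers(0, 2, size=n)
# Synthetic house prices driven by the first two features plus noise.
price = 200_000 + 30_000 * X[:, 0] + 15_000 * X[:, 1] + rng.normal(0, 10_000, n)

# Simple holdout split: first half for training, second half for evaluation.
split = n // 2
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:split], price[:split])
pred = model.predict(X[split:])

# Group-wise mean absolute error on the holdout set.
err = np.abs(pred - price[split:])
g = group[split:]
mae_priv = err[g == 1].mean()
mae_unpriv = err[g == 0].mean()

# One simple disparity measure: the ratio of group-wise MAEs.
# A value far from 1.0 indicates the model errs more for one group.
disparity = mae_unpriv / mae_priv
print(f"MAE (privileged):   {mae_priv:,.0f}")
print(f"MAE (unprivileged): {mae_unpriv:,.0f}")
print(f"Disparity ratio:    {disparity:.3f}")
```

Because the synthetic protected attribute here is independent of price, the disparity ratio stays close to 1.0; on real housing data, a ratio substantially above or below 1.0 would flag the kind of group-level bias the study sets out to quantify.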