seeBias: A Comprehensive Tool for Assessing and Visualizing AI Fairness

📅 2025-04-11
🤖 AI Summary
Existing AI fairness tools predominantly assess fairness through classification performance disparities, neglecting critical dimensions such as predictive calibration—limiting their suitability for high-stakes domains like healthcare and criminal justice. To address this gap, we introduce *seeBias*, an open-source R package that unifies classification fairness, predictive calibration, and multidimensional performance evaluation within a single framework. It supports comprehensive diagnostics—including confusion matrix analysis, reliability diagrams, inter-group calibration disparity metrics, and fairness heatmaps—with customizable, publication-ready visualizations. Empirical evaluation on criminal justice and clinical datasets demonstrates that *seeBias* effectively uncovers latent biases: models with comparable accuracy may exhibit severe miscalibration across demographic groups, a deficiency invisible to conventional fairness metrics. The tool is publicly available on GitHub; a Python implementation is under development.
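The summary's central point is that two models (or two demographic groups under one model) can have similar accuracy yet very different calibration. As a language-agnostic illustration of the inter-group calibration disparity idea described above, here is a minimal Python sketch; `group_ece` and the synthetic data are hypothetical and not part of the seeBias API:

```python
import numpy as np

def group_ece(y_true, y_prob, groups, n_bins=10):
    """Expected calibration error (ECE) computed separately per group.

    Returns a dict mapping group label -> ECE; the max-min gap across
    groups serves as a simple inter-group calibration disparity metric.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    result = {}
    for g in np.unique(groups):
        mask = groups == g
        p, y = y_prob[mask], y_true[mask]
        ece = 0.0
        for lo, hi in zip(bins[:-1], bins[1:]):
            # Last bin is closed on the right so p == 1.0 is included.
            in_bin = (p >= lo) & (p < hi) if hi < 1.0 else (p >= lo) & (p <= hi)
            if in_bin.sum() == 0:
                continue
            conf = p[in_bin].mean()  # mean predicted probability in bin
            acc = y[in_bin].mean()   # observed event rate in bin
            ece += in_bin.mean() * abs(acc - conf)
        result[g] = ece
    return result

# Synthetic example: both groups share the same outcome process, but
# group B's reported scores are systematically overconfident.
rng = np.random.default_rng(0)
groups = np.repeat(["A", "B"], 5000)
p_true = rng.uniform(0, 1, 10000)
y = (rng.uniform(0, 1, 10000) < p_true).astype(int)
y_prob = np.where(groups == "A", p_true, np.clip(p_true + 0.2, 0.0, 1.0))

ece = group_ece(y, y_prob, groups)
gap = max(ece.values()) - min(ece.values())  # inter-group calibration disparity
```

Overall accuracy for the two groups is comparable here, yet group B's ECE is far larger, which is exactly the kind of disparity a reliability diagram or calibration heatmap would surface while aggregate accuracy metrics stay silent.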


📝 Abstract
Fairness in artificial intelligence (AI) prediction models is increasingly emphasized to support responsible adoption in high-stakes domains such as healthcare and criminal justice. Guidelines and implementation frameworks highlight the importance of both predictive accuracy and equitable outcomes. However, current fairness toolkits often evaluate classification performance disparities in isolation, with limited attention to other critical aspects such as calibration. To address these gaps, we present seeBias, an R package for comprehensive evaluation of model fairness and predictive performance. seeBias offers an integrated evaluation across classification, calibration, and other performance domains, providing a more complete view of model behavior. It includes customizable visualizations to support transparent reporting and responsible AI implementation. Using public datasets from criminal justice and healthcare, we demonstrate how seeBias supports fairness evaluations and uncovers disparities that conventional fairness metrics may overlook. The R package is available on GitHub, and a Python version is under development.
Problem

Research questions and friction points this paper is trying to address.

Assessing AI fairness beyond classification disparities
Evaluating model fairness and predictive performance comprehensively
Identifying overlooked disparities in AI fairness metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

R package for comprehensive AI fairness evaluation
Integrated assessment across classification and calibration
Customizable visualizations for transparent reporting
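The first two innovation points pair classification fairness with calibration. The classification side is typically measured with group-conditional error rates, e.g. the equal-opportunity difference (gap in true-positive rates across groups). A minimal Python sketch of that metric, using hypothetical toy data rather than seeBias's own API:

```python
import numpy as np

def tpr_by_group(y_true, y_pred, groups):
    """True-positive rate per group; the max-min gap across groups is the
    equal-opportunity difference, a standard classification fairness metric."""
    out = {}
    for g in np.unique(groups):
        m = (groups == g) & (y_true == 1)  # positives in this group
        out[g] = y_pred[m].mean() if m.any() else float("nan")
    return out

# Toy predictions where group "B" catches fewer true positives:
y_true = np.array([1, 1, 1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 1, 0, 0, 0, 1])
groups = np.array(["A", "A", "A", "B", "B", "B", "A", "B"])

tpr = tpr_by_group(y_true, y_pred, groups)
eo_gap = max(tpr.values()) - min(tpr.values())  # equal-opportunity difference
```

An integrated toolkit reports metrics like this alongside per-group calibration, since a model can close the TPR gap while remaining badly miscalibrated for one group, or vice versa.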
Yilin Ning
Senior Research Fellow, Centre for Quantitative Medicine, Duke-NUS Medical School
Fair and Ethical AI, Biostatistics, Epidemiology, Statistical programming
Yian Ma
Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Duke-NUS AI + Medical Sciences Initiative, Duke-NUS Medical School, Singapore
Mingxuan Liu
Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Duke-NUS AI + Medical Sciences Initiative, Duke-NUS Medical School, Singapore
Xin Li
Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Duke-NUS AI + Medical Sciences Initiative, Duke-NUS Medical School, Singapore
Nan Liu
Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Duke-NUS AI + Medical Sciences Initiative, Duke-NUS Medical School, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore; NUS Artificial Intelligence Institute, National University of Singapore, Singapore