Auditing a Dutch Public Sector Risk Profiling Algorithm Using an Unsupervised Bias Detection Tool

📅 2025-02-03
🤖 AI Summary
This paper addresses the challenge of detecting algorithmic bias when data on legally protected demographic groups are missing or unavailable, proposing an unsupervised bias auditing framework to assess the fairness of the Dutch Executive Agency for Education's student risk-scoring algorithm (2012-2023). Methodologically, it uses a clustering tool to identify potentially disadvantaged subgroups without access to protected attributes, keeping the audit compatible with privacy legislation, and uses simulation studies to separate true bias from false-positive findings. Contributions include: (1) a bias assessment of a real-world, large-scale, high-stakes governmental decision-making process; (2) a reproducible simulation-based validation approach that highlights pitfalls of unsupervised bias detection and supports valid inferences; and (3) an open-source release of the bias detection tool. Empirical analysis covering more than 250,000 students reveals known disparities between students with a non-European migration background and students of Dutch origin, yielding interpretable evidence and actionable guidance for regulatory oversight and human-in-the-loop review.

📝 Abstract
Algorithms are increasingly used to automate or aid human decisions, yet recent research shows that these algorithms may exhibit bias across legally protected demographic groups. However, data on these groups may be unavailable to organizations or external auditors due to privacy legislation. This paper studies bias detection using an unsupervised clustering tool when data on demographic groups are unavailable. We collaborate with the Dutch Executive Agency for Education to audit an algorithm that was used to assign risk scores to college students at the national level in the Netherlands between 2012 and 2023. Our audit covers more than 250,000 students from the whole country. The unsupervised clustering tool highlights known disparities between students with a non-European migration background and Dutch origin. Our contributions are three-fold: (1) we assess bias in a real-world, large-scale and high-stakes decision-making process by a governmental organization; (2) we use simulation studies to highlight potential pitfalls of using the unsupervised clustering tool to detect true bias when demographic group data are unavailable and provide recommendations for valid inferences; (3) we provide the unsupervised clustering tool in an open-source library. Our work serves as a starting point for a deliberative assessment by human experts to evaluate potential discrimination in algorithmic-supported decision-making processes.
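The abstract describes clustering students without protected-attribute data and flagging clusters whose risk scores deviate from the population, as input for expert review. The sketch below illustrates that general idea, not the authors' actual open-source tool: it uses k-means with silhouette-based selection of the number of clusters on synthetic stand-in data (all feature names and parameters here are assumptions for illustration).

```python
# Hedged sketch of clustering-based bias screening on synthetic data.
# This is NOT the paper's tool; it only illustrates the general approach.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in for non-protected student features and model risk scores.
X = rng.normal(size=(n, 4))
risk = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * rng.normal(size=n))))

Xs = StandardScaler().fit_transform(X)

# Pick the number of clusters k by silhouette analysis over a small range.
best_k, best_s = 2, -1.0
for k in range(2, 7):
    cand = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Xs)
    s = silhouette_score(Xs, cand)
    if s > best_s:
        best_k, best_s = k, s

labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(Xs)

# Flag clusters whose mean risk score deviates from the overall mean.
# These are candidates for human expert review, not proof of discrimination.
overall = risk.mean()
for c in range(best_k):
    mask = labels == c
    dev = risk[mask].mean() - overall
    print(f"cluster {c}: n={mask.sum()}, mean-risk deviation {dev:+.3f}")
```

As the paper's simulation studies caution, clusters with deviating scores can arise by chance or from legitimate features, so any flagged cluster needs deliberative assessment before concluding bias.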
Problem

Research questions and friction points this paper is trying to address.

Unsupervised bias detection without demographic data
Auditing Dutch education risk profiling algorithm
Addressing biases in high-stakes governmental decisions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised clustering for bias detection
Simulation studies probe tool validity and pitfalls
Open-source release of detection tool