Binary Choice with Asymmetric Loss in a Data-Rich Environment: Theory and an Application to Racial Justice

📅 2020-10-01

📈 Citations: 9

✨ Influential: 2

🤖 AI Summary

This paper addresses binary decision problems with asymmetric misclassification costs in data-rich settings, motivated by pretrial detention, where errors can fall unevenly across racial groups. Its central theoretical result is an equivalence: the optimal binary predictor under a general asymmetric loss can be obtained by applying covariate-dependent sample reweighting to standard models (e.g., logistic regression, gradient boosting, deep neural networks), which reduces loss-specific optimization to a weight adjustment step. This connects econometric theoretical guarantees with scalable machine learning implementations. Applied to pretrial detention decisions, the method is reported to reduce high-cost misclassifications for minority defendants while preserving predictive accuracy and policy-relevant fairness metrics. The resulting framework is intended as an interpretable, deployable, fairness-aware decision tool for high-stakes domains.

📝 Abstract

The importance of asymmetries in prediction problems arising in economics has been recognized for a long time. In this paper, we focus on binary choice problems in a data-rich environment with general loss functions. In contrast to the asymmetric regression problems, the binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have studied binary choice problems for a long time, but the literature does not offer computationally attractive solutions in data-rich environments. In contrast, the machine learning literature has many computationally attractive algorithms that form the basis for much of the automated procedures that are implemented in practice, but it is focused on symmetric loss functions that are independent of individual characteristics. One of the main contributions of our paper is to show that the theoretically valid predictions of binary outcomes with arbitrary loss functions can be achieved via a very simple reweighting of the logistic regression, or other state-of-the-art machine learning techniques, such as boosting or (deep) neural networks. We apply our analysis to racial justice in pretrial detention.
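
The reweighting idea in the abstract can be illustrated with the textbook cost-sensitive argument below. This is a sketch under assumed notation, not the paper's own derivation: p(x), c_FP(x), and c_FN(x) are symbols introduced here, correct decisions are assumed to cost nothing, and both costs are assumed positive. The paper's "general loss functions" may be broader than this special case.

```latex
% Notation (assumed for illustration): p(x) = P(Y = 1 | X = x),
% c_FP(x) > 0 is the cost of predicting 1 when Y = 0,
% c_FN(x) > 0 is the cost of predicting 0 when Y = 1; correct decisions cost 0.
\begin{align*}
\text{predict } 1
  &\iff c_{FP}(x)\,\bigl(1 - p(x)\bigr) < c_{FN}(x)\,p(x) \\
  &\iff p(x) > \frac{c_{FP}(x)}{c_{FP}(x) + c_{FN}(x)}, \\[4pt]
\tilde p(x)
  &= \frac{c_{FN}(x)\,p(x)}{c_{FN}(x)\,p(x) + c_{FP}(x)\,\bigl(1 - p(x)\bigr)}, \\[4pt]
\tilde p(x) > \tfrac12
  &\iff p(x) > \frac{c_{FP}(x)}{c_{FP}(x) + c_{FN}(x)}.
\end{align*}
```

Here \tilde p(x) is the conditional probability of Y = 1 after weighting each positive observation by c_FN(x) and each negative observation by c_FP(x). Under these assumptions, thresholding the reweighted probability at 1/2 gives the same decision as applying the covariate-dependent threshold to the unweighted probability.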

Problem

Research questions and friction points this paper is trying to address.

Addressing binary choice with asymmetric loss in data-rich settings

Bridging econometric theory and machine learning computational methods

Developing fair algorithmic decisions for pretrial detention applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reweighting logistic regression for asymmetric loss

Applying machine learning with covariate-dependent loss

Ensuring theoretical validity in binary choice decisions
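
The first two points above can be sketched with a small simulation. This is a minimal, hypothetical illustration, not the paper's implementation: the data-generating process, the group indicator `g`, the cost values in `C_FP` and `c_fn`, and the helper `fit_weighted_logit` are all assumptions made for this example, and the logistic model is well specified by construction. Positives are weighted by their false-negative cost and negatives by the false-positive cost; the fitted model is then thresholded at 0.5 and compared with the cost-optimal rule computed from the true probabilities.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_weighted_logit(Z, y, w, lr=0.5, n_iter=1000):
    """Minimize the weighted average log loss by plain gradient descent."""
    beta = [0.0] * len(Z[0])
    total_w = sum(w)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for z_i, y_i, w_i in zip(Z, y, w):
            p_i = sigmoid(sum(b * z for b, z in zip(beta, z_i)))
            for j, z_ij in enumerate(z_i):
                grad[j] += w_i * (p_i - y_i) * z_ij
        beta = [b - lr * g_j / total_w for b, g_j in zip(beta, grad)]
    return beta


# Hypothetical costs: a false negative is costlier in group 1 than in group 0.
C_FP = 1.0


def c_fn(g):
    return 3.0 if g == 1 else 1.5


# Simulated data (an assumption for illustration): P(Y = 1 | x, g) = sigmoid(x).
random.seed(0)
Z, y, w = [], [], []
for _ in range(2000):
    x = random.uniform(-3.0, 3.0)
    g = random.randint(0, 1)
    label = 1 if random.random() < sigmoid(x) else 0
    Z.append([1.0, x, float(g)])  # intercept, covariate, group indicator
    y.append(label)
    # Covariate-dependent weight: c_fn(g) for positives, C_FP for negatives.
    w.append(c_fn(g) if label == 1 else C_FP)

beta = fit_weighted_logit(Z, y, w)


def weighted_rule(z):
    """Decision from the reweighted model, thresholded at 0.5."""
    score = sum(b * v for b, v in zip(beta, z))
    return 1 if sigmoid(score) > 0.5 else 0


def oracle_rule(z):
    """Cost-optimal decision using the true probability and group-specific threshold."""
    x, g = z[1], int(z[2])
    return 1 if sigmoid(x) > C_FP / (C_FP + c_fn(g)) else 0


agreement = sum(weighted_rule(z) == oracle_rule(z) for z in Z) / len(Z)
print("fitted coefficients:", [round(b, 3) for b in beta])
print("agreement with cost-optimal rule:", round(agreement, 3))
```

In this setup the two rules should disagree only for observations close to the group-specific decision boundary, where estimation error in the fitted coefficients matters. The same weights could in principle be passed to any learner that accepts per-observation weights, which is the sense in which the reweighting step is model-agnostic.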

🔎 Similar Papers

Long-Term Fairness Inquiries and Pursuits in Machine Learning: A Survey of Notions, Methods, and Challenges

2024-06-10, arXiv.org, Citations: 3
