Does Calibration Affect Human Actions?

📅 2025-08-23
🤖 AI Summary
This study investigates how classifier calibration affects the decision-making behavior of non-expert users, specifically examining its impact on subjective trust and on the alignment between user decisions and model predictions. To bridge the gap between traditional calibration metrics and human judgment, the authors propose a prospect-theory-informed adjustment to calibrated scores that explicitly incorporates empirically grounded models of human risk preferences. Through a controlled human-AI interaction experiment, they find that improving conventional calibration alone does not significantly increase users' self-reported trust; however, the adjusted calibration method significantly improves decision-prediction consistency (p < 0.01). This work is a systematic integration of prospect theory into the calibration evaluation framework. It shows that behavioral responses, rather than subjective trust ratings, are the more sensitive indicator of calibration quality, uncovering a dissociation in the "calibration-trust-behavior" chain. The findings provide both a human-centered design paradigm and empirical grounding for trustworthy AI systems.

📝 Abstract
Calibration has been proposed as a way to enhance the reliability and adoption of machine learning classifiers. We study a particular aspect of this proposal: how does calibrating a classification model affect the decisions made by non-expert humans consuming the model's predictions? We perform a Human-Computer Interaction (HCI) experiment to ascertain the effect of calibration on (i) trust in the model, and (ii) the correlation between decisions and predictions. We also propose further corrections to the reported calibrated scores based on Kahneman and Tversky's prospect theory from behavioral economics, and study the effect of these corrections on trust and decision-making. We find that calibration is not sufficient on its own; the prospect theory correction is crucial for increasing the correlation between human decisions and the model's predictions. While this increased correlation suggests higher trust in the model, responses to "Do you trust the model more?" are unaffected by the method used.
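The abstract does not spell out the paper's exact correction, but the idea of pre-distorting a calibrated score for human consumption can be sketched with the standard Tversky-Kahneman (1992) probability weighting function w(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ). The sketch below displays q = w⁻¹(p) so that a user who perceives probabilities through w recovers roughly the calibrated value p. The function names, the γ = 0.61 default (the original estimate for gains), and the inversion by bisection are illustrative assumptions, not the paper's method.

```python
def tk_weight(p: float, gamma: float = 0.61) -> float:
    """Tversky-Kahneman probability weighting function.

    Overweights small probabilities and underweights large ones;
    gamma = 0.61 is the 1992 estimate for gains (an assumption here).
    """
    num = p ** gamma
    return num / (num + (1 - p) ** gamma) ** (1 / gamma)


def prospect_adjust(calibrated_score: float, gamma: float = 0.61) -> float:
    """Illustrative correction: display q = w^{-1}(p).

    A user perceiving the displayed score q through w(.) then recovers
    approximately the calibrated probability p. Since w is strictly
    increasing on [0, 1], we invert it by bisection.
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):  # 60 halvings: far below float precision
        mid = (lo + hi) / 2
        if tk_weight(mid, gamma) < calibrated_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, a calibrated score of 0.9 is perceived as less likely than 0.9 under w, so `prospect_adjust(0.9)` returns a value above 0.9 to compensate; with γ = 1 the weighting is the identity and no adjustment occurs.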
Problem

Research questions and friction points this paper is trying to address.

Examining calibration's impact on human trust in ML models
Studying how calibrated predictions influence non-expert decision-making
Investigating prospect theory corrections for improved human-model alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Calibration combined with prospect theory correction
HCI experiment testing trust and decision correlation
Behavioral economics enhancing human-model prediction alignment