Testing Decision Makers without Counterfactuals

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study asks which of two agents, a decision-maker or an adviser, is better informed when only observed actions, recommendations, and outcomes are available. Working in a bandit environment, it studies scoring tests that aim to identify the more-informed agent without counterfactual information, that is, without observing the realizations of arms the decision-maker did not choose. Treating both agents as strategic players who want to be selected by the test, the paper characterizes when identification is possible under simultaneous and sequential arm choices. With simultaneous choices, a scoring test exists that identifies the more-informed agent; with sequential choices, no such scoring test exists. The paper also exposes a tension between identification and welfare: in a binary-arm environment, no scoring test can both identify the more-informed agent and achieve more than half of the welfare attained by welfare-maximizing decisions.

📝 Abstract

A decision-maker (DM) repeatedly makes choices under uncertainty in a bandit environment, where only the realization of the chosen arm is observed. Another competing agent, the adviser (AD), repeatedly provides recommendations, but the realizations of these recommendations are unobserved unless they coincide with the DM's choice. Both agents possess partial information about the arms' realizations. The central question we focus on is whether, in the long run, an outside observer can identify which agent is more informed based solely on the observed decisions, recommendations, and arm realizations. A test selects one of the agents based on the observed data. We focus primarily on the class of scoring tests, which assign a numerical score to each observation and select the agent according to the average score. We study strategic agents whose objective is to be selected by the test. For simultaneous arm choices, we show that there exists a scoring test that successfully identifies the more-informed agent. For sequential arm choices, however, no such scoring test exists. Finally, we explore the tension between identifying the more-informed agent and maximizing welfare. A DM whose objective is to pass the test may not necessarily make welfare-maximizing decisions. In a binary-arm environment, we show that no scoring test can simultaneously identify the more informed agent and achieve more than half of the welfare attained by welfare-maximizing decisions.
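
To make the notion of a scoring test concrete, here is a minimal Python sketch of the general shape described in the abstract: each observation receives a numerical score, and the test selects an agent according to the average score. The `Observation` fields, the `naive_score` rule, and the zero threshold are illustrative assumptions, not the paper's construction; in particular, `naive_score` is a placeholder and is not claimed to identify the more-informed agent.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Observation:
    """One round as seen by the outside observer (hypothetical layout)."""
    dm_arm: int    # arm chosen by the decision-maker
    ad_arm: int    # arm recommended by the adviser
    reward: float  # realization of the DM's arm only; the adviser's arm
                   # is unobserved unless it coincides with dm_arm


def naive_score(obs: Observation) -> float:
    """Placeholder scoring rule (an assumption, not the paper's rule).

    Rounds where the agents agree carry no information about who is
    better informed, so they score 0. When they disagree, a success of
    the DM's arm counts for the DM (+1) and a failure counts against (-1).
    """
    if obs.dm_arm == obs.ad_arm:
        return 0.0
    return 1.0 if obs.reward > 0.5 else -1.0


def scoring_test(
    observations: Sequence[Observation],
    score: Callable[[Observation], float],
    threshold: float = 0.0,
) -> str:
    """Select "DM" if the average score reaches the threshold, else "AD"."""
    if not observations:
        raise ValueError("a scoring test needs at least one observation")
    average = sum(score(o) for o in observations) / len(observations)
    return "DM" if average >= threshold else "AD"


history = [
    Observation(dm_arm=0, ad_arm=1, reward=1.0),  # disagree, DM succeeds: +1
    Observation(dm_arm=1, ad_arm=1, reward=0.0),  # agree: 0
    Observation(dm_arm=0, ad_arm=1, reward=1.0),  # disagree, DM succeeds: +1
    Observation(dm_arm=1, ad_arm=0, reward=0.0),  # disagree, DM fails: -1
]
selected = scoring_test(history, naive_score)
```

The sketch keeps the scoring rule as a parameter because the paper's results concern which rules, if any, identify the more-informed agent when both agents act strategically to be selected.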

Problem

Research questions and friction points this paper is trying to address.

information identification

bandit environment

decision maker

adviser

counterfactual-free testing

Innovation

Methods, ideas, or system contributions that make the work stand out.

scoring test

bandit environment

information identification