🤖 AI Summary
This work addresses the tendency of long-horizon agents to fail irreversibly after an early wrong assumption, a problem made harder by the lack of any quantitative account of when clarification is most valuable. The authors propose a forced-injection framework that introduces ground-truth clarifications about goals, inputs, constraints, and context at controlled points along the task execution trajectory. Across three agent benchmarks, four frontier models, 84 task variants, and more than 6,000 runs, complemented by 300 unscripted human-agent interactions, they show that the value of clarification changes with task progress and with the type of missing information: goal clarification loses nearly all value beyond 10% of execution, input clarification remains beneficial through roughly 50%, and any clarification deferred past mid-trajectory performs worse than never asking at all. None of the examined models asks clarifying questions within the empirically optimal window. The results contradict the common intuition that “earlier is always better” and yield empirical clarification demand curves that are substantially task-intrinsic.
📝 Abstract
Long-horizon AI agents execute complex workflows spanning hundreds of sequential actions, yet a single wrong assumption early on can cascade into irreversible errors. When instructions are incomplete, the agent must decide not only whether to ask for clarification but when, and no prior work measures how clarification value changes over the course of execution. We introduce a forced-injection framework that provides ground-truth clarifications at controlled points in the agent's trajectory across four information dimensions (goal, input, constraint, context), three agent benchmarks, and four frontier models (three evaluated on every benchmark and one on a single benchmark only), covering 84 task variants and 6,000+ runs. Counter to the common intuition that "earlier is always better," we find that the value of clarification depends sharply on what information is missing: goal clarification loses nearly all value after 10% of execution (pass@3 drops from 0.78 to baseline), while input clarification retains value through roughly 50%. Deferring any clarification type past mid-trajectory degrades performance below that of never asking at all. Cross-model Kendall's tau correlations (0.78 to 0.87 among models sharing identical task coverage; 0.34 to 0.67 across the full 4-model panel) confirm that these timing profiles are substantially task-intrinsic. A complementary study of 300 unscripted sessions reveals that no current frontier model asks within the empirically optimal window, with strategies ranging from over-asking (52% of sessions) to never asking at all. These empirical demand curves provide the quantitative foundation that existing theoretical frameworks require but have lacked, and establish concrete design targets for timing-aware clarification policies. Code and data will be publicly released.
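As a rough illustration of the cross-model agreement statistic mentioned in the abstract, the sketch below computes a Kendall tau-b rank correlation between two timing profiles. It is a minimal sketch, not the authors' code: the function name `kendall_tau_b`, the injection points, and the pass@3 values for `model_a` and `model_b` are all hypothetical and chosen only to show the calculation, and the paper may use a different tau variant or compare different quantities.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    # Compare every pair of observations and classify the pair.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both sequences: excluded from both factors
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when a sequence is constant")
    return (concordant - discordant) / denom


# Hypothetical data: fraction of the trajectory completed when the
# clarification is injected, and an invented pass@3 score per model.
injection_points = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9]
model_a = [0.80, 0.76, 0.60, 0.52, 0.35, 0.30]
model_b = [0.70, 0.72, 0.50, 0.55, 0.33, 0.25]

tau = kendall_tau_b(model_a, model_b)
print(f"Kendall tau-b between the two profiles: {tau:.3f}")
```

A value near 1 means the two models rank the injection points in nearly the same order, which is the sense in which a timing profile could be called task-intrinsic rather than model-specific.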