🤖 AI Summary
In task-oriented dialogue, user utterances are often semantically complete yet fail to trigger system-executable structured intents, a challenge arising from the asymmetry between users' ambiguous needs and systems' requirement for precise intent definitions. To address this, we propose STORM, the first framework to formally model the dynamic evolution of intent triggerability under such asymmetric interaction. STORM introduces an intent formation trajectory model that characterizes the cognitive progression of collaborative understanding, and it designs a novel evaluation metric that jointly optimizes cognitive grounding and task performance. Methodologically, it employs a dual-LLM architecture (UserLLM/AgentLLM), structured trajectory annotation, and uncertainty-sensitive evaluation. Evaluated on four mainstream LLMs, STORM shows that strategies maintaining moderate uncertainty (40–60%) can outperform fully transparent ones in certain scenarios, revealing model-specific uncertainty calibration patterns. These results provide both theoretical foundations and empirical evidence for enhancing cooperative dialogue systems through calibrated uncertainty modeling.
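The summary describes a metric that jointly weighs cognitive grounding and task performance. The paper does not give its formula here, so the following is only an illustrative sketch, assuming a simple convex combination; the function name, the `alpha` weight, and the [0, 1] scaling of both inputs are all hypothetical.

```python
def joint_score(grounding_gain: float, task_success: float, alpha: float = 0.5) -> float:
    """Hypothetical joint metric: a convex combination of cognitive-grounding
    improvement and task performance, both assumed normalized to [0, 1].
    `alpha` (illustrative) trades off grounding against task success."""
    if not (0.0 <= grounding_gain <= 1.0 and 0.0 <= task_success <= 1.0):
        raise ValueError("inputs must be normalized to [0, 1]")
    return alpha * grounding_gain + (1.0 - alpha) * task_success
```

A sweep of `alpha` over such a score is one simple way a framework could surface model-specific trade-offs between understanding the user and completing the task.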
📝 Abstract
Task-oriented dialogue systems often face difficulties when user utterances seem semantically complete but lack the structural information necessary for appropriate system action. This challenge arises because users frequently do not fully understand their own needs, while systems require precise intent definitions. Current LLM-based agents cannot effectively distinguish between linguistically complete and contextually triggerable expressions, and they lack frameworks for collaborative intent formation. We present STORM, a framework that models asymmetric information dynamics through conversations between a UserLLM (with full internal access) and an AgentLLM (observing behavior only). STORM produces annotated corpora capturing expression trajectories and latent cognitive transitions, enabling systematic analysis of how collaborative understanding develops. Our contributions include: (1) formalizing asymmetric information processing in dialogue systems; (2) a model of intent formation that tracks the evolution of collaborative understanding; and (3) evaluation metrics measuring internal cognitive improvements alongside task performance. Experiments across four language models reveal that moderate uncertainty (40–60%) can outperform complete transparency in certain scenarios, with model-specific patterns suggesting a reconsideration of optimal information completeness in human-AI collaboration. These findings contribute to understanding asymmetric reasoning dynamics and inform uncertainty-calibrated dialogue system design.
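The asymmetric UserLLM/AgentLLM interaction described above can be sketched as a minimal dialogue loop. This is not the paper's implementation: the rule-based "LLMs", the slot names, and the triggerability condition (all required slots filled) are illustrative stand-ins chosen only to make the information asymmetry concrete — the user holds a private goal, while the agent sees utterances alone and must decide when the intent becomes triggerable.

```python
from dataclasses import dataclass, field

@dataclass
class UserLLM:
    """Holds a private goal; only its utterances are observable to the agent."""
    hidden_goal: dict
    revealed: set = field(default_factory=set)

    def utter(self, question=None):
        # Answer the agent's question if possible; otherwise volunteer one
        # unrevealed slot per turn, mimicking incremental, ambiguous expression.
        if question in self.hidden_goal and question not in self.revealed:
            self.revealed.add(question)
            return {question: self.hidden_goal[question]}
        for slot, value in self.hidden_goal.items():
            if slot not in self.revealed:
                self.revealed.add(slot)
                return {slot: value}
        return {}

@dataclass
class AgentLLM:
    """Sees only utterances; tracks a belief state over required slots."""
    required_slots: tuple
    belief: dict = field(default_factory=dict)

    def observe(self, utterance):
        self.belief.update(utterance)

    def next_question(self):
        # Ask about the first missing slot; None means the intent is triggerable.
        for slot in self.required_slots:
            if slot not in self.belief:
                return slot
        return None

def run_dialogue(user, agent, max_turns=10):
    """Run turns until the agent judges the intent triggerable; return the
    (utterance, next_question) trajectory for annotation-style analysis."""
    trajectory, question = [], None
    for _ in range(max_turns):
        utterance = user.utter(question)
        agent.observe(utterance)
        question = agent.next_question()
        trajectory.append((dict(utterance), question))
        if question is None:
            break
    return trajectory
```

In this toy loop the recorded trajectory plays the role of the annotated expression trajectories the abstract mentions: each entry pairs what the user revealed with what the agent still considered missing at that turn.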