A Benchmark to Assess Common Ground in Human-AI Collaboration

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Research on human-AI collaboration lacks effective ways to evaluate common ground, the mutual understanding essential for seamless coordination, which hinders AI's evolution from assistant into genuine collaborative partner. Drawing on theories of human-human collaboration, this work introduces a puzzle-solving benchmark that operationalizes common ground through iterative interaction, joint action, referential coordination, and repair of misunderstandings, and systematically assesses it under varying levels of situation awareness. This study is the first to integrate common-ground theory into human-AI collaboration evaluation, offering a standardized benchmark that balances ecological validity with reproducibility. Through cognitively informed experimental design and quantitative behavioral analysis, the user study not only replicates classic phenomena observed in human-human collaboration but also uncovers distinctive patterns and critical challenges unique to human-AI teamwork.
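
The summary does not spell out how the situation-awareness manipulation is structured. As a purely illustrative sketch (the names `SAVisibility`, `TrialSpec`, and `build_trials` are invented here, not the paper's API), a crossed design of visibility conditions and repeated rounds might look like this:

```python
# Minimal sketch, assuming the benchmark crosses situation-awareness
# conditions with repeated rounds of the same puzzle so that grounding
# effects (e.g., faster coordination in later rounds) can be measured.
# All names below are hypothetical, not the paper's actual benchmark code.

from dataclasses import dataclass
from enum import Enum
from itertools import product

class SAVisibility(Enum):
    FULL = "full"        # both partners see the entire puzzle state
    PARTIAL = "partial"  # each partner sees only their own portion
    MINIMAL = "minimal"  # shared state must be built up through dialogue

@dataclass(frozen=True)
class TrialSpec:
    visibility: SAVisibility  # situation-awareness condition
    round_idx: int            # repeated rounds enable iterative grounding
    puzzle_id: str            # which puzzle instance to load

def build_trials(puzzle_ids: list[str], rounds: int = 4) -> list[TrialSpec]:
    """Fully cross visibility conditions, rounds, and puzzle instances."""
    return [
        TrialSpec(vis, r, pid)
        for vis, r, pid in product(SAVisibility, range(rounds), puzzle_ids)
    ]
```

Repeating rounds of the same puzzle is what would let iterative grounding effects, such as faster coordination over time, show up in the interaction logs.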

📝 Abstract
AI is becoming increasingly integrated into everyday life, both in professional work environments and in leisure and entertainment contexts. This integration requires AI to move beyond acting as an assistant for informational or transactional tasks and become a genuine collaborative partner. Effective collaboration, whether between humans or between humans and AI, depends on establishing and maintaining common ground: shared beliefs, assumptions, goals, and situational awareness that enable coordinated action and efficient repair of misunderstandings. While common ground is a central concept in human collaboration, it has received limited attention in studies of human-AI collaboration. In this paper, we introduce a new benchmark grounded in theories and empirical studies of human-human collaboration. The benchmark is based on a collaborative puzzle task that requires iterative interaction, joint action, referential coordination, and repair under varying conditions of situation awareness. We validate the benchmark through a confirmatory user study in which human participants collaborate with an AI to solve the task. The results show that the benchmark reproduces established theoretical and empirical findings from human-human collaboration, while also revealing clear divergences in human-AI interaction.
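
The abstract leaves the scoring unspecified. As a hypothetical illustration of the quantitative behavioral analysis it mentions (the transcript format and all names below are assumptions, not the authors' implementation), two classic grounding measures from human-human studies can be computed from turn-level logs: the shortening of referring expressions across rounds and the rate of repair turns.

```python
# Hypothetical sketch of the kind of behavioral analysis the paper describes.
# The Turn record, its fields, and both functions are illustrative
# assumptions, not the authors' actual benchmark code.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Turn:
    round_idx: int            # which puzzle round the turn belongs to
    speaker: str              # "human" or "ai"
    text: str                 # utterance content
    refers_to: Optional[str]  # id of the referenced puzzle piece, if any
    is_repair: bool           # flagged as a clarification/repair move

def mean_reference_length_per_round(turns: list[Turn]) -> dict[int, float]:
    """Average word count of referring expressions in each round.

    In human-human collaboration this typically decreases over rounds
    as partners build common ground (Clark & Wilkes-Gibbs, 1986).
    """
    lengths: dict[int, list[int]] = {}
    for t in turns:
        if t.refers_to is not None:
            lengths.setdefault(t.round_idx, []).append(len(t.text.split()))
    return {r: sum(v) / len(v) for r, v in lengths.items()}

def repair_rate(turns: list[Turn]) -> float:
    """Fraction of all turns spent repairing misunderstandings."""
    return sum(t.is_repair for t in turns) / max(len(turns), 1)
```

Comparing these measures across the situation-awareness conditions is one straightforward way to test whether human-AI pairs show the same grounding trajectory as human-human pairs.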
Problem

Research questions and friction points this paper is trying to address.

common ground
human-AI collaboration
benchmark
collaborative interaction
situation awareness
Innovation

Methods, ideas, or system contributions that make the work stand out.

common ground
human-AI collaboration
collaborative benchmark
referential coordination
situation awareness