AI Summary
This study addresses the challenge of evaluating tacit understanding in human–AI collaboration, which conventional explicit task metrics fail to capture, particularly in the absence of instructions or feedback. To this end, the authors propose the Tacit Understanding Index (TUX), a novel measure derived from a spectrum-based conceptual alignment task inspired by the social game *Wavelength*. TUX quantifies the degree of implicit alignment between humans and large language models (LLMs) when independently positioning subjective concepts along continuous dimensions. Through experiments involving 241 human participants and 200 trait-conditioned LLM agents, the research demonstrates that proximity in individual trait space significantly predicts TUX scores. Moreover, a model integrating traits, decision styles, and confidence outperforms baseline approaches relying solely on trait distance, underscoring the critical role of individual differences in shaping tacit human–AI understanding.
Abstract
As large language models (LLMs) increasingly act as collaborative partners, human–AI alignment is often evaluated through explicit task success, accuracy, or reward optimization. Yet many collaborative settings depend on tacit understanding: whether an agent can align with a human's evaluative stance or representational priors without clear objectives, communication, or feedback. To study this capacity, we develop a spectrum-placement task inspired by the social party game Wavelength, in which humans and agents independently place concepts along subjective spectra. We operationalize the Tacit Understanding Index (TUX) as a pairwise measure of similarity between human and agent judgments, and evaluate it with 241 human participants and 200 profile-conditioned LLM agents across four models. We find that nearest human–agent pairs in trait space achieve significantly higher TUX, suggesting that tacit alignment is structured by person-level characteristics rather than random similarity. Regression analyses show that TUX becomes more explainable as predictor sets become richer, with individual traits, decision-making styles, and confidence improving over aggregate trait-distance baselines. These findings suggest that tacit understanding between humans and LLMs is measurable, while revealing the limits of profile-based conditioning for capturing deeper representational alignment.
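The abstract does not give the exact formula for TUX, so the following is only a minimal sketch of what a pairwise similarity score and a nearest-pair lookup in trait space might look like. It assumes placements are normalized to [0, 1] on each spectrum, that similarity is one minus the mean absolute difference, and that trait distance is Euclidean. The function names `tux_sketch` and `nearest_agent` are hypothetical and are not taken from the paper.

```python
import math


def tux_sketch(human, agent):
    """Illustrative pairwise similarity (NOT the paper's definition).

    Assumes both lists hold placements in [0, 1], one per concept,
    in the same order. Returns 1.0 for identical placements and
    0.0 for maximally distant ones.
    """
    if len(human) != len(agent) or not human:
        raise ValueError("placements must be non-empty and equal in length")
    mean_abs_diff = sum(abs(h - a) for h, a in zip(human, agent)) / len(human)
    return 1.0 - mean_abs_diff


def nearest_agent(human_traits, agent_traits):
    """Index of the agent closest to the human in trait space.

    Assumes Euclidean distance over equally scaled trait vectors,
    which is an illustrative choice rather than the paper's metric.
    """
    return min(
        range(len(agent_traits)),
        key=lambda i: math.dist(human_traits, agent_traits[i]),
    )


if __name__ == "__main__":
    human_placements = [0.25, 0.75, 0.5]
    agent_placements = [0.25, 0.5, 0.5]
    print(tux_sketch(human_placements, agent_placements))

    agents = [[0.0, 0.0], [0.6, 0.5], [1.0, 1.0]]
    print(nearest_agent([0.5, 0.5], agents))
```

Under these assumptions, the finding about nearest pairs amounts to comparing the score of each human with their `nearest_agent` match against the scores obtained with other agents.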