When Researchers Say Mental Model/Theory of Mind of AI, What Are They Really Talking About?

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper challenges the prevailing assumption in AI theory of mind (ToM) research that behavioral mimicry constitutes evidence of genuine mental modeling. It argues that large language models’ “human-level” performance on standard ToM tasks reflects statistical pattern matching rather than attribution of internal mental states; more fundamentally, evaluating AI using paradigms derived from individual human cognition entails a methodological flaw. To address this, the paper introduces *Reciprocal Theory of Mind*, a novel framework that shifts evaluation from isolated, unidirectional AI output to bidirectional understanding generation within real-time human–AI interaction. Methodologically, it integrates behavioral experiments with LLMs, critical analysis of cognitive science theories, and philosophical conceptual clarification. Its core contributions are: (1) deconstructing the implicit cognitive assumptions underlying current ToM assessments; (2) exposing the irreducible epistemic gap between observable behavior and subjective experience; and (3) advancing AI cognitive science toward a dynamic, interaction-centered paradigm.

📝 Abstract
When researchers claim AI systems possess ToM or mental models, they are fundamentally discussing behavioral predictions and bias corrections rather than genuine mental states. This position paper argues that the current discourse conflates sophisticated pattern matching with authentic cognition, missing a crucial distinction between simulation and experience. While recent studies show LLMs achieving human-level performance on ToM laboratory tasks, these results are based only on behavioral mimicry. More importantly, the entire testing paradigm may be flawed: it applies individual human cognitive tests to AI systems instead of assessing human cognition directly in the moment of human-AI interaction. I suggest shifting focus toward mutual ToM frameworks that acknowledge the simultaneous contributions of human cognition and AI algorithms, emphasizing the interaction dynamics, instead of testing AI in isolation.
Problem

Research questions and friction points this paper is trying to address.

Clarifying what researchers mean by AI mental models versus genuine cognition
Identifying flaws in applying human cognitive tests to AI systems
Proposing mutual theory of mind frameworks for human-AI interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reframing claimed AI mental states as behavioral predictions and bias corrections
Shifting from isolated AI testing to human-AI interaction dynamics
Mutual Theory of Mind frameworks acknowledge human-AI contributions
Xiaoyun Yin
Arizona State University
Elmira Zahmat Doost
Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University, Gilbert
Shiwen Zhou
Arizona State University
Human System Engineering · Human Factors · Engineering Psychology · Human-AI Teaming
Garima Arya Yadav
Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University, Gilbert
Jamie C. Gorman
Center for Human, Artificial Intelligence, and Robot Teaming, Arizona State University, Gilbert