MindClaw: Closed-Loop Embodied Mental-State Reasoning for Precision Intervention

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing Theory of Mind (ToM) approaches: they are mostly evaluated on offline question answering or final action prediction, so they do not test whether an agent can choose when to intervene in a dynamic environment or reason in a closed loop about belief updates and whether intervention is needed. Building on MindPower, the authors propose MindClaw, a framework that extends robot-centric ToM reasoning to a real-time, closed-loop embodied setting. MindClaw connects multi-source inputs, belief memory, an embodied cognitive trigger skill, mental-state reasoning, and action generation, so the agent intervenes on demand and stays silent otherwise rather than producing output continuously. In the reported experiments, direct vision-language model (VLM) baselines struggle with task awareness and intervention calibration, while MindClaw achieves the best overall performance, which the authors attribute to optimizing the trigger skill.

📝 Abstract

Theory of Mind (ToM) enables an agent to reason about another actor's beliefs, goals, and intentions, which is essential for human-centered embodied assistance. Existing ToM benchmarks have advanced text and multimodal mental-state recognition, but they mostly evaluate offline question answering or final action prediction. They do not fully test whether an embodied agent can stay connected to a changing environment, update actor-specific beliefs, decide when reasoning is needed, and intervene only when help is useful. Building on MindPower, we extend robot-centric ToM reasoning to a real-time closed-loop setting and introduce MindClaw, a framework for embodied mental-state reasoning with precision intervention. MindClaw connects multi-source inputs, belief memory, an embodied cognitive trigger skill, mental reasoning, and action generation, allowing the agent to output helpful actions at the right time while remaining silent when intervention is unnecessary. Experiments show that direct VLM baselines struggle with task awareness and intervention calibration, while MindClaw achieves the best overall performance, demonstrating the importance of trigger-skill optimization for closed-loop embodied ToM assistance.
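
The closed loop described in the abstract (observe, update actor-specific beliefs, decide whether reasoning is needed, then act or stay silent) can be illustrated with a toy Python sketch. This is not the authors' implementation: every name here (`Observation`, `BeliefMemory`, `should_trigger`, `step`, `run_episode`) is hypothetical, and the trigger is a hand-written rule (fire when the actor's tracked belief about their goal object disagrees with the world), standing in for the optimized trigger skill the paper describes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Observation:
    """One time step of fused perception (hypothetical structure)."""
    world: Dict[str, str]              # object -> true location
    actor_saw: Set[str]                # objects the actor could observe this step
    actor_goal: Optional[str] = None   # object the actor appears to be seeking


@dataclass
class BeliefMemory:
    """Tracks where the actor believes each object is."""
    beliefs: Dict[str, str] = field(default_factory=dict)

    def update(self, obs: Observation) -> None:
        # The actor's belief changes only for objects they actually observed.
        for obj in obs.actor_saw:
            if obj in obs.world:
                self.beliefs[obj] = obs.world[obj]


def should_trigger(memory: BeliefMemory, obs: Observation) -> bool:
    """Assumed rule-based trigger: fire only on a goal-relevant false belief."""
    goal = obs.actor_goal
    if goal is None or goal not in obs.world:
        return False
    return memory.beliefs.get(goal) != obs.world[goal]


def step(memory: BeliefMemory, obs: Observation) -> Optional[str]:
    """One loop iteration: update beliefs, gate on the trigger, then act."""
    memory.update(obs)
    if not should_trigger(memory, obs):
        return None  # stay silent: intervention is unnecessary
    goal = obs.actor_goal
    believed = memory.beliefs.get(goal, "unknown")
    return f"Tell actor: {goal} is in {obs.world[goal]}, not {believed}"


def run_episode(observations: List[Observation]) -> List[Optional[str]]:
    """Run the loop over a stream of observations with a fresh memory."""
    memory = BeliefMemory()
    return [step(memory, obs) for obs in observations]
```

For example, if the actor sees the keys in the drawer, the keys are then moved to the shelf while the actor is not looking, and the actor later goes looking for them, `run_episode` returns `None` for the first two steps and an intervention message only at the third. Once the actor sees the new location, the loop goes silent again. Gating on the trigger before any reasoning or action step is the design choice that allows silence by default.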

Problem

Research questions and friction points this paper is trying to address.

Theory of Mind

embodied reasoning

closed-loop intervention

mental-state recognition

precision assistance

Innovation

Methods, ideas, or system contributions that make the work stand out.

closed-loop embodied reasoning

Theory of Mind

precision intervention