MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Collaborative Learning

📅 2024-11-20
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Embodied agents built on open-source large language models (LLMs) show limited capacity for collaborative lifelong learning in open-world environments. This paper proposes MindForge to address that gap. MindForge builds a structured theory-of-mind representation that explicitly models other agents' beliefs, desires, and intentions (BDI); coordinates multiple agents through natural-language dialogue; and uses a multicomponent memory system (episodic, semantic, and procedural) to support long-term social learning. It relies exclusively on open-weight LLMs, requiring no proprietary models or additional training data. In Minecraft experiments, MindForge agents collect 2.3× more unique items and reach 3× more tech-tree milestones than their Voyager counterparts. The agents also demonstrate expert-to-novice knowledge transfer, collaborative problem solving, and adaptation to out-of-distribution (OOD) tasks.

📝 Abstract
Contemporary embodied agents powered by large language models (LLMs), such as Voyager, have shown promising capabilities in individual learning within open-ended environments like Minecraft. However, when powered by open LLMs, they struggle with basic tasks even after domain-specific fine-tuning. We present MindForge, a generative-agent framework for collaborative lifelong learning through explicit perspective taking. We introduce three key innovations: (1) a structured theory of mind representation linking percepts, beliefs, desires, and actions; (2) natural interagent communication; and (3) a multicomponent memory system. In Minecraft experiments, MindForge agents powered by open-weight LLMs significantly outperform their Voyager counterparts in basic tasks where traditional Voyager fails without GPT-4, collecting $2.3\times$ more unique items and achieving $3\times$ more tech-tree milestones, advancing from basic wood tools to advanced iron equipment. MindForge agents demonstrate sophisticated behaviors, including expert-novice knowledge transfer, collaborative problem solving, and adaptation to out-of-distribution tasks through accumulated collaborative experiences. MindForge advances the democratization of embodied AI development through open-ended social learning, enabling peer-to-peer knowledge sharing.
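
The three components named in the abstract can be pictured as plain data structures. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the class names (`MentalState`, `Memory`, `Agent`), the `hear` method, and the "I need ..." keyword rule are all hypothetical, and the keyword rule merely stands in for the LLM-based perspective taking the paper describes. The split of memory into episodic, semantic, and procedural parts follows the summary above.

```python
from dataclasses import dataclass, field


@dataclass
class MentalState:
    """Hypothetical container linking percepts, beliefs, desires, and actions."""
    percepts: list[str] = field(default_factory=list)
    beliefs: dict[str, str] = field(default_factory=dict)
    desires: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)


@dataclass
class Memory:
    """Hypothetical three-part memory: episodic, semantic, procedural."""
    episodic: list[str] = field(default_factory=list)
    semantic: dict[str, str] = field(default_factory=dict)
    procedural: dict[str, str] = field(default_factory=dict)


@dataclass
class Agent:
    """Toy agent holding its own state plus one model per partner."""
    name: str
    own_state: MentalState = field(default_factory=MentalState)
    partner_models: dict[str, MentalState] = field(default_factory=dict)
    memory: Memory = field(default_factory=Memory)

    def hear(self, speaker: str, message: str) -> None:
        """Store a message and update the model of the speaker."""
        # Every utterance is logged as an episodic memory entry.
        self.memory.episodic.append(f"{speaker}: {message}")
        model = self.partner_models.setdefault(speaker, MentalState())
        model.percepts.append(message)
        # Assumption: a keyword rule replaces LLM-based inference here.
        if message.lower().startswith("i need "):
            model.desires.append(message[7:].rstrip("."))


novice = Agent("novice")
novice.hear("expert", "I need an iron pickaxe.")
print(novice.partner_models["expert"].desires)
```

Keeping a separate `MentalState` per partner is one simple way to express perspective taking: the agent's beliefs about itself and its beliefs about others never share storage.
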
Problem

Research questions and friction points this paper is trying to address.

Empowering agents with theory of mind
Enhancing collaborative lifelong learning
Improving performance in basic tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured theory of mind
Natural interagent communication
Multicomponent memory system
Mircea Lica
Delft University of Technology
O. Shirekar
Delft University of Technology
Baptiste Colle
Delft University of Technology
Chirag Raman
Delft University of Technology
Multimodal Machine Learning, Computer Vision, Human-Computer Interaction