AI Summary
Large language models (LLMs) hallucinate in robotic task planning when they lack sufficient environmental grounding, hindering real-world deployment. To address this, we propose BrainBody-LLM, a brain-body dual-LLM hierarchical architecture inspired by biological cognition: a high-level LLM performs semantic task decomposition, while a low-level LLM generates executable action sequences conditioned on real-time state feedback. A closed-loop error-learning mechanism, driven by physics-based simulation, corrects execution deviations as they occur. The framework integrates the VirtualHome and PyBullet simulators and is deployed on a Franka Research 3 robot arm. Experiments show a 29% improvement in task-oriented success rate over competitive baselines with a GPT-4 backend on VirtualHome; moreover, the approach outperforms state-of-the-art LLM-based methods across seven complex physical manipulation tasks, demonstrating robustness and autonomous error correction in dynamic environments.
Abstract
Planning algorithms decompose complex problems into intermediate steps that a robot can execute sequentially to complete a task. Recent works have employed Large Language Models (LLMs) for task planning, using natural language to generate robot policies in both simulated and real-world environments. LLMs like GPT-4 have shown promising results in generalizing to unseen tasks, but their applicability is limited by hallucinations caused by insufficient grounding in the robot's environment. The robustness of LLM-based task planning can be enhanced with environmental state information and feedback. In this paper, we introduce a novel approach to task planning that utilizes two separate LLMs, one for high-level planning and one for low-level control, improving task-related success rates and goal condition recall. Our algorithm, *BrainBody-LLM*, draws inspiration from the human nervous system, emulating its brain-body architecture by dividing planning across the two LLMs in a structured, hierarchical manner. BrainBody-LLM implements a closed-loop feedback mechanism that learns from simulator errors to resolve execution failures in complex settings. We demonstrate the successful application of BrainBody-LLM in the VirtualHome simulation environment, achieving a 29% improvement in task-oriented success rates over competitive baselines with a GPT-4 backend. Additionally, we evaluate our algorithm on seven complex tasks using a realistic physics simulator and the Franka Research 3 robotic arm, comparing it with various state-of-the-art LLMs. Our results show that advancements in the reasoning capabilities of recent LLMs enable them to learn from raw simulator/controller errors to correct their plans, making them highly effective in robotic task planning.
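To make the described control flow concrete, the following is a minimal sketch of the brain-body closed loop: a high-level "brain" LLM decomposes a task into subtasks, a low-level "body" LLM turns each subtask into an executable action given the current state, and simulator errors are fed back to the body LLM for correction. All function bodies here are hypothetical stubs for illustration (the paper's actual prompts, LLM backends, and simulator interfaces are not reproduced), but the loop structure mirrors the feedback mechanism described above.

```python
# Sketch of the BrainBody-LLM closed loop. The LLM calls and the
# executor are stubs; in the paper both roles are played by real LLMs
# (e.g. GPT-4) and execution happens in VirtualHome or a physics simulator.

def brain_llm(task):
    """High-level 'brain' LLM: decompose a task into semantic subtasks (stub)."""
    return [f"locate {task}", f"grasp {task}", f"place {task}"]

def body_llm(subtask, state, error=None):
    """Low-level 'body' LLM: emit an executable action conditioned on the
    current state and, on retry, the raw simulator error message (stub)."""
    suffix = " (corrected)" if error else ""
    return f"EXEC[{subtask}]{suffix}"

def execute(action, state):
    """Simulator step: returns (new_state, error_or_None) (stub).
    Here we pretend the first attempt at any grasp fails once."""
    if "grasp" in action and "(corrected)" not in action:
        return state, "GraspError: object not reachable"
    return state + [action], None

def run_task(task, max_retries=2):
    """Closed loop: plan with the brain, act with the body,
    and replan each action from simulator error feedback."""
    state, log = [], []
    for subtask in brain_llm(task):
        error = None
        for _ in range(max_retries + 1):
            action = body_llm(subtask, state, error)
            state, error = execute(action, state)
            if error is None:
                break
        log.append((subtask, action, error))
    return state, log

state, log = run_task("mug")
```

In this toy run, the failed grasp is retried with the error message in the body LLM's context, which is the error-learning behavior the paper attributes to recent LLMs.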