Growing with Your Embodied Agent: A Human-in-the-Loop Lifelong Code Generation Framework for Long-Horizon Manipulation Skills

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM-driven code generation methods for robotic manipulation face three key challenges: high noise in instruction translation, limited primitives and contextual awareness, and difficulty reusing closed-loop feedback knowledge across long-horizon tasks, often leading to catastrophic forgetting. This paper proposes a human-in-the-loop lifelong code generation framework. Its core contributions are: (1) automatically encoding human corrections into structured, reusable skills; (2) integrating external memory with retrieval-augmented generation (RAG) and a hint mechanism to enable dynamic retrieval and generalized invocation of correction knowledge; and (3) supporting continual learning and stable execution for ultra-long-horizon tasks (>20 steps). Evaluated across multiple simulation and real-world settings, the approach achieves a task success rate of 0.93 (up to 27% above baselines) and reduces the number of correction rounds required by 42%, significantly outperforming state-of-the-art methods.

📝 Abstract
Large language model (LLM)-based code generation for robotic manipulation has recently shown promise by directly translating human instructions into executable code, but existing methods remain noisy, are constrained by fixed primitives and limited context windows, and struggle with long-horizon tasks. While closed-loop feedback has been explored, corrected knowledge is often stored in formats that restrict generalization and cause catastrophic forgetting, which highlights the need for learning reusable skills. Moreover, approaches that rely solely on LLM guidance frequently fail in extremely long-horizon scenarios because of LLMs' limited reasoning capability in the robotic domain, where such issues are often straightforward for humans to identify. To address these challenges, we propose a human-in-the-loop framework that encodes corrections into reusable skills, supported by external memory and Retrieval-Augmented Generation with a hint mechanism for dynamic reuse. Experiments on Ravens, Franka Kitchen, and MetaWorld, as well as in real-world settings, show that our framework achieves a 0.93 success rate (up to 27% higher than baselines) and a 42% reduction in correction rounds. It robustly solves extremely long-horizon tasks such as "build a house", which requires planning over more than 20 primitives.
Problem

Research questions and friction points this paper is trying to address.

LLM-based code generation struggles with noisy outputs and fixed primitives for robotic manipulation
Existing methods suffer from catastrophic forgetting and cannot generalize corrected knowledge effectively
Current approaches fail in extremely long-horizon tasks due to limited reasoning capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-in-the-loop lifelong code generation framework
Encodes corrections into reusable skills with external memory
Uses Retrieval-Augmented Generation with hint mechanism
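The retrieval idea behind these contributions can be sketched as follows. This is a minimal, illustrative mock-up, not the paper's implementation: the skill entries, the `SkillMemory`/`retrieve` names, the `hint` parameter, and the bag-of-words similarity are all assumptions standing in for what would in practice be a learned embedding model querying an external skill library.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a real system would use a learned encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[t] for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SkillMemory:
    """External memory of correction-derived skills, queried RAG-style."""

    def __init__(self):
        self.skills = []  # (description, code) pairs

    def add_skill(self, description, code):
        """Store a skill distilled from a human correction."""
        self.skills.append((description, code))

    def retrieve(self, instruction, hint="", k=2):
        """Return the top-k skills most similar to the instruction plus hint."""
        query = embed(instruction + " " + hint)
        scored = sorted(self.skills,
                        key=lambda s: cosine(embed(s[0]), query),
                        reverse=True)
        return scored[:k]

memory = SkillMemory()
memory.add_skill("align gripper before grasping thin objects",
                 "def align_grasp(obj): ...")
memory.add_skill("stack blocks with a stability check",
                 "def stack_stable(a, b): ...")

# The hint steers retrieval toward the relevant correction knowledge.
top = memory.retrieve("stack the red block on the blue block",
                      hint="previous stacking attempts toppled")
```

The hint string mirrors the paper's hint mechanism in spirit: extra context appended to the query so that previously stored corrections are surfaced even when the new instruction alone would not match them.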
Yuan Meng
School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
Zhenguo Sun
Beijing Academy of Artificial Intelligence (BAAI), Beijing, China
Max Fest
School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
Xukun Li
Kansas State University
computer vision, machine learning, deep learning, statistical modeling
Zhenshan Bing
Nanjing University / Technical University of Munich
Robotics
Alois Knoll
Technische Universität München
Robotics, AI, Sensor Data Fusion, Autonomous Driving, Cyber Physical Systems