🤖 AI Summary
To address the unreliability of long-horizon task execution by embodied agents in dynamic environments, this paper proposes Closed-Loop Embodied Agent (CLEA), a closed-loop architecture for task management. The method uses environmental memory to drive interactive task planning and a multimodal execution critic that assesses action feasibility probabilistically, enabling robust subtask sequencing and online replanning when environmental perturbations exceed preset thresholds. Its four-model framework functionally decouples environment memory modeling, hierarchical planning, execution critique, and feedback regulation across four specialized open-source LLMs. Evaluated in a real environment with two heterogeneous robots on object search, manipulation, and combined search-manipulation tasks, CLEA outperforms the baseline across 12 task trials, with a 67.3% improvement in success rate and a 52.8% increase in task completion rate, indicating more robust task planning and execution in dynamic environments.
📝 Abstract
Large Language Models (LLMs) exhibit remarkable capabilities in the hierarchical decomposition of complex tasks through semantic reasoning. However, their application in embodied systems faces challenges in ensuring reliable execution of subtask sequences and achieving one-shot success in long-term task completion. To address these limitations in dynamic environments, we propose Closed-Loop Embodied Agent (CLEA), a novel architecture incorporating four specialized open-source LLMs with functional decoupling for closed-loop task management. The framework features two core innovations: (1) an interactive task planner that dynamically generates executable subtasks based on the environmental memory, and (2) a multimodal execution critic that employs an evaluation framework to conduct a probabilistic assessment of action feasibility, triggering hierarchical re-planning mechanisms when environmental perturbations exceed preset thresholds. To validate CLEA's effectiveness, we conduct experiments in a real environment with manipulable objects, using two heterogeneous robots for object search, manipulation, and search-manipulation integration tasks. Across 12 task trials, CLEA outperforms the baseline model, achieving a 67.3% improvement in success rate and a 52.8% increase in task completion rate. These results demonstrate that CLEA significantly enhances the robustness of task planning and execution in dynamic environments.
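The closed loop described in the abstract (plan from memory, critique each action, re-plan on failure) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Memory`, `run_closed_loop`, `toy_plan`, `toy_feasibility`, and `toy_execute` are hypothetical, plain functions stand in for the LLM-based planner and critic, and a feasibility score below `threshold` is assumed here as a simple proxy for "perturbations exceed preset thresholds".

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Running log of events (hypothetical stand-in for environmental memory)."""
    events: List[str] = field(default_factory=list)

    def add(self, event: str) -> None:
        self.events.append(event)


def run_closed_loop(
    plan: Callable[[str, Memory], List[str]],
    feasibility: Callable[[str, Memory], float],
    execute: Callable[[str], str],
    goal: str,
    threshold: float = 0.5,   # assumed value, not taken from the paper
    max_replans: int = 3,     # assumed budget, not taken from the paper
) -> List[str]:
    """Execute subtasks one by one, re-planning when the critic rejects one."""
    memory = Memory()
    executed: List[str] = []
    replans = 0
    queue = plan(goal, memory)
    while queue:
        subtask = queue.pop(0)
        p = feasibility(subtask, memory)  # critic: estimated chance of success
        if p < threshold:
            # Record the rejection so the planner can route around it.
            memory.add(f"rejected {subtask} (p={p:.2f})")
            replans += 1
            if replans > max_replans:
                raise RuntimeError("replanning budget exhausted")
            queue = plan(goal, memory)
            continue
        memory.add(execute(subtask))
        executed.append(subtask)
    return executed


# Toy stand-ins for the planner, critic, and robot executor.
def toy_plan(goal: str, memory: Memory) -> List[str]:
    if any("rejected open drawer" in e for e in memory.events):
        steps = ["search shelf", "grasp cup"]
    else:
        steps = ["open drawer", "grasp cup"]
    # Skip steps that memory already records as completed.
    return [s for s in steps if f"done {s}" not in memory.events]


def toy_feasibility(subtask: str, memory: Memory) -> float:
    return 0.2 if subtask == "open drawer" else 0.9


def toy_execute(subtask: str) -> str:
    return f"done {subtask}"


result = run_closed_loop(toy_plan, toy_feasibility, toy_execute, "fetch the cup")
print(result)
```

In this toy run the critic rejects "open drawer", the rejection is written to memory, and the planner produces an alternative sequence. A real system would replace the three toy functions with model calls and sensor feedback.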