🤖 AI Summary
This work addresses the lack of autonomous error correction and strategic adaptation in long-horizon computational tasks executed by AI coding agents. We propose PARC, a hierarchical multi-agent coding framework built around introspective collaboration: it integrates high-level task planning, distributed execution monitoring, automated error detection and recovery, and cross-context self-assessment with feedback to identify and rectify high-level strategic failures. PARC executes complex scientific computing workflows autonomously, without human intervention: it reproduces lithium-ion conduction and alloy segregation studies in materials science, sustaining stable management of dozens of concurrent simulations lasting up to 43 hours each; and in Kaggle competitions, it generates competitive end-to-end solutions directly from natural-language instructions. Experiments demonstrate significant improvements in robustness, autonomy, and cross-domain generalization over prior approaches.
📝 Abstract
We introduce PARC, a coding agent for the autonomous and robust execution of long-horizon computational tasks. PARC is built on a hierarchical multi-agent architecture that combines task planning and execution with a self-assessment and self-feedback mechanism: a component that evaluates the agent's actions and their outcomes from an independent context and feeds corrections back. This design enables PARC to detect and correct high-level strategic errors and sustain progress without human intervention. We evaluate PARC on computational science and data science tasks. In materials science, it autonomously reproduces key results from studies on lithium-ion conduction and alloy segregation. In particular, it coordinates dozens of parallel simulation tasks, each requiring roughly 43 hours of computation, managing orchestration, monitoring, and error correction end-to-end. In Kaggle-based experiments, starting from minimal natural-language instructions, PARC conducts data analysis and implements search strategies, producing solutions competitive with human-engineered baselines. These results highlight the potential of integrating a hierarchical multi-agent system with self-assessment and self-feedback to enable AI systems capable of independent, large-scale scientific and analytical work.