🤖 AI Summary
To address the limited long-term learning and poor cross-task generalization of LLM-based agents in dynamic environments, this paper proposes OmniReflect, a hierarchical, reflection-driven framework with a neuro-symbolic design. It introduces transferable "constitutions" (compact sets of guiding principles distilled from task experience) for LLM agents, enabling cross-task and cross-model knowledge accumulation and reuse. OmniReflect supports two reflection modes: Self-sustaining, in which a single agent periodically curates its own reflections to drive continual improvement, and Co-operative, in which a Meta-advisor derives a constitution that guides other agents. The method integrates ReAct-style reasoning, symbolic logic, and a Meta-advisor mechanism for adaptive guidance. Experiments on the ALFWorld, BabyAI, and PDDL benchmarks show that the Self-sustaining mode improves success rates by +10.3%, +23.8%, and +8.3%, respectively; in the Co-operative mode, a lightweight Qwen3-4B agent outperforms all Reflexion baselines on BabyAI.
📝 Abstract
Efforts to improve Large Language Model (LLM) agent performance on complex tasks have largely focused on fine-tuning and iterative self-correction. However, these approaches often lack generalizable mechanisms for long-term learning and remain inefficient in dynamic environments. We introduce OmniReflect, a hierarchical, reflection-driven framework that constructs a constitution, a compact set of guiding principles distilled from task experiences, to enhance the effectiveness and efficiency of an LLM agent. OmniReflect operates in two modes: Self-sustaining, where a single agent periodically curates its own reflections during task execution, and Co-operative, where a Meta-advisor derives a constitution from a small calibration set to guide another agent. To construct these constitutional principles, we employ Neural, Symbolic, and NeuroSymbolic techniques, offering a balance between contextual adaptability and computational efficiency. Empirical results averaged across models show major improvements in task success, with absolute gains of +10.3% on ALFWorld, +23.8% on BabyAI, and +8.3% on PDDL in the Self-sustaining mode. Similar gains are seen in the Co-operative mode, where a lightweight Qwen3-4B ReAct agent outperforms all Reflexion baselines on BabyAI. These findings highlight the robustness and effectiveness of OmniReflect across environments and backbones.
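The Self-sustaining mode described in the abstract can be pictured as a simple loop: the agent acts with its current constitution in context, records a reflection after each episode, and periodically distills accumulated reflections into an updated constitution. The sketch below is a minimal illustration under that reading; every name in it (`run_episode`, `reflect`, `distill`, `self_sustaining_loop`) is a hypothetical placeholder, not the paper's actual API, and the distillation step is a trivial stand-in for the Neural/Symbolic/NeuroSymbolic techniques the paper employs.

```python
# Minimal sketch of a Self-sustaining reflection loop, assuming a generic
# LLM-agent interface. All names here are hypothetical illustrations.

def reflect(trajectory, success):
    """Turn one episode's trajectory into a short natural-language reflection."""
    outcome = "succeeded" if success else "failed"
    return f"Episode {outcome}; early steps: {trajectory[:2]}"

def distill(reflections, max_principles=5):
    """Curate accumulated reflections into a compact 'constitution'.
    (The paper distills guiding principles neuro-symbolically; keeping
    the most recent reflections is only a stand-in for that step.)"""
    return reflections[-max_principles:]

def self_sustaining_loop(tasks, run_episode, reflect_every=3):
    constitution = []   # compact set of guiding principles
    reflections = []    # raw per-episode reflections
    results = []
    for i, task in enumerate(tasks, start=1):
        # The agent acts with the current constitution prepended to its context.
        trajectory, success = run_episode(task, constitution)
        reflections.append(reflect(trajectory, success))
        results.append(success)
        # Periodically curate reflections into an updated constitution.
        if i % reflect_every == 0:
            constitution = distill(reflections)
    return constitution, results
```

In the Co-operative mode, the same `distill` step would instead be run once by a Meta-advisor over a small calibration set, and the resulting constitution handed to a separate ReAct agent.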