🤖 AI Summary
This work addresses context overload and information loss in long-horizon large language model (LLM) reasoning, which the authors attribute to centralized memory architectures. Inspired by the complementary roles of the human prefrontal cortex and hippocampus, the authors propose ActiveMem, a heterogeneous framework that decouples memory from reasoning: a high-level Planner reasons over distilled semantic gists, while a lightweight, distributed memory system accumulates and consolidates these gists in parallel. The approach brings distributed active memory into LLM agent reasoning and separates memory management from reasoning. Its key components (semantic gist distillation, distributed storage, and parallel memory consolidation) yield state-of-the-art accuracy on the BrowseComp-Plus and GAIA benchmarks with significantly reduced overhead.
📝 Abstract
Memory is essential for enabling large language model (LLM) agents to handle long-horizon reasoning tasks. Existing memory mechanisms are largely centralized, typically organizing retrieved information and interaction history within a single model context. This design imposes a trade-off: scaling reasoning trajectories risks context overload, whereas aggressive content pruning may cause irreversible information loss. We draw inspiration from human cognitive systems, in particular the functional complementarity between the prefrontal cortex (executive control) and the hippocampus (memory management), which suggests that this trade-off is not inherent but may instead stem from centralized memory organization. To this end, we propose ActiveMem, a heterogeneous framework that decouples agent memory from the core reasoning process. Specifically, a high-level Planner reasons over distilled semantic gists, while a lightweight, distributed memory system operates in parallel to actively accumulate and consolidate these gists throughout the task. Experiments on BrowseComp-Plus and GAIA show that ActiveMem achieves state-of-the-art accuracy with significantly reduced overhead, demonstrating the effectiveness of distributed active memory for long-horizon reasoning.
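The division of labor described above can be illustrated with a toy sketch. It is not the ActiveMem implementation, which the abstract does not specify: `MemoryWorker`, `planner_step`, and `run` are hypothetical names, word truncation stands in for gist distillation, de-duplication stands in for consolidation, and a thread pool stands in for the parallel memory system. The only point it makes is structural: raw content stays with the memory workers, and the Planner sees gists only.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class MemoryWorker:
    """Hypothetical lightweight memory unit that owns one shard of raw content."""
    gists: list[str] = field(default_factory=list)

    def distill(self, raw: str, max_words: int = 3) -> str:
        # Stand-in for a small summarizer model: keep the first few words.
        gist = " ".join(raw.split()[:max_words])
        self.gists.append(gist)
        return gist

    def consolidate(self) -> str:
        # Stand-in for consolidation: drop duplicate gists, preserve order.
        return " | ".join(dict.fromkeys(self.gists))

    def ingest(self, shard: list[str]) -> str:
        # Each worker processes its own shard sequentially, then consolidates.
        for raw in shard:
            self.distill(raw)
        return self.consolidate()


def planner_step(gists: list[str]) -> str:
    # Stand-in for the high-level Planner: it receives gists, never raw text.
    return f"plan over {len(gists)} gists"


def run(documents: list[str], n_workers: int = 2) -> tuple[list[str], str]:
    workers = [MemoryWorker() for _ in range(n_workers)]
    # Round-robin sharding: worker i receives documents i, i + n_workers, ...
    shards = [documents[i::n_workers] for i in range(n_workers)]
    # Workers run in parallel; pool.map returns results in worker order.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        consolidated = list(pool.map(MemoryWorker.ingest, workers, shards))
    return consolidated, planner_step(consolidated)


if __name__ == "__main__":
    docs = ["alpha beta gamma delta", "one two", "alpha beta gamma delta", "three"]
    print(run(docs))
```

In this sketch the Planner's input grows with the number of memory workers rather than with the volume of raw text, which is the property the decoupling is meant to provide; a real system would replace each stand-in with a model call.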