🤖 AI Summary
This work addresses critical limitations of existing memory-augmented generative systems that stem from their reliance on external services: semantic drift, poor multi-agent coordination, and low fault tolerance. To overcome these issues, the authors propose an agent-native memory architecture in which a single large language model autonomously curates, structures, and retrieves its own memory without external databases or embedding services. The approach features a file-system-grounded hierarchical context tree (Domain–Topic–Subtopic–Entry), explicit relationship and provenance tracking, adaptive knowledge lifecycle management, and a five-tier progressive retrieval strategy, with all knowledge stored locally as Markdown. Evaluated on LoCoMo and LongMemEval, the method achieves state-of-the-art accuracy on the former and strong results on the latter, resolving most queries in under 100 ms without LLM calls.
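The file-system-grounded context tree described above can be sketched as plain Markdown files under a Domain/Topic/Subtopic hierarchy. This is a hypothetical illustration, not the paper's actual on-disk schema: the directory names, metadata layout, and helper function below are assumptions.

```python
import tempfile
from pathlib import Path

# Hypothetical sketch: one Context Tree Entry stored as a Markdown file under
# a Domain/Topic/Subtopic hierarchy. All names here are illustrative; the
# paper does not publish an exact on-disk format.
root = Path(tempfile.mkdtemp()) / "memory"

def write_entry(domain: str, topic: str, subtopic: str, name: str, body: str,
                relations: list[str], source: str) -> Path:
    """Create one Entry file, embedding relations and provenance as metadata."""
    entry = root / domain / topic / subtopic / f"{name}.md"
    entry.parent.mkdir(parents=True, exist_ok=True)
    header = "\n".join(
        [f"- relates-to: {r}" for r in relations] + [f"- source: {source}"]
    )
    entry.write_text(f"# {name}\n\n{header}\n\n{body}\n")
    return entry

path = write_entry(
    "backend", "auth", "tokens", "refresh-rotation",
    "Refresh tokens are rotated on every use.",
    relations=["backend/auth/sessions"],
    source="conversation-2024-06-12",
)
print(path.relative_to(root))
```

Because every entry is an ordinary Markdown file, the same LLM that writes a memory can later reopen, restructure, or prune it with no embedding or database layer in between.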
📝 Abstract
Memory-Augmented Generation (MAG) extends large language models with external memory to support long-context reasoning, but existing approaches universally treat memory as an external service that agents call into, delegating storage to separate pipelines of chunking, embedding, and graph extraction. This architectural separation means the system that stores knowledge does not understand it, leading to semantic drift between what the agent intended to remember and what the pipeline actually captured, loss of coordination context across agents, and fragile recovery after failures. In this paper, we propose ByteRover, an agent-native memory architecture that inverts the memory pipeline: the same LLM that reasons about a task also curates, structures, and retrieves knowledge. ByteRover represents knowledge in a hierarchical Context Tree, a file-based knowledge graph organized as Domain, Topic, Subtopic, and Entry, where each entry carries explicit relations, provenance, and an Adaptive Knowledge Lifecycle (AKL) with importance scoring, maturity tiers, and recency decay. Retrieval uses a five-tier progressive strategy that resolves most queries at sub-100 ms latency without LLM calls, escalating to agentic reasoning only for novel questions. Experiments on LoCoMo and LongMemEval demonstrate that ByteRover achieves state-of-the-art accuracy on LoCoMo and competitive results on LongMemEval while requiring zero external infrastructure: no vector database, no graph database, and no embedding service, with all knowledge stored as human-readable Markdown files on the local filesystem.
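The progressive retrieval cascade and the AKL ranking signals might be sketched as follows. The tier implementations and the decay formula are assumptions for illustration; the paper specifies five tiers, importance scoring, maturity tiers, and recency decay, but not these exact mechanics.

```python
import time

# Illustrative sketch of a progressive retrieval cascade plus an AKL-style
# ranking score. Tier behavior and the half-life constant are assumptions.

def akl_score(importance: float, maturity: float, last_access: float,
              half_life_days: float = 30.0) -> float:
    """Combine importance, a maturity-tier weight, and exponential recency decay."""
    age_days = (time.time() - last_access) / 86400
    recency = 0.5 ** (age_days / half_life_days)
    return importance * maturity * recency

def retrieve(query: str, tiers) -> tuple[int, list[str]]:
    """Try each cheap tier in order; escalate only when the current tier is empty.
    The most expensive tier (agentic LLM reasoning) is a stub here."""
    for level, search in enumerate(tiers, start=1):
        hits = search(query)
        if hits:
            return level, hits
    return len(tiers), []

# Hypothetical tiers: exact key lookup, substring scan, then stubs standing in
# for the costlier levels up to full agentic reasoning.
index = {"refresh-rotation": "Refresh tokens are rotated on every use."}
tiers = [
    lambda q: [index[q]] if q in index else [],       # tier 1: exact lookup
    lambda q: [v for v in index.values() if q in v],  # tier 2: substring scan
    lambda q: [],                                     # tiers 3+: escalate
]
level, hits = retrieve("refresh-rotation", tiers)
print(level, hits)  # 1 ['Refresh tokens are rotated on every use.']
```

The design point is that the common case never leaves cheap local lookups, which is how most queries can stay under the 100 ms budget: the LLM is invoked only when every cheaper tier comes up empty.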