Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the efficiency of cross-session memory management in large language model agents on long-horizon tasks, an area where the system-level behavior of existing memory systems has not been characterized. The study introduces a system-oriented taxonomy that classifies agent memory systems along four axes, together with a phase-aware profiling harness. Using two benchmark suites, the authors evaluate ten representative systems and attribute cost to the memory construction, retrieval, and generation phases. The analysis shows how design choices shift cost between the write and read paths, and distills ten system recommendations covering construction scheduling, capability floors, amortization via query volume, freshness–latency tradeoffs, and fleet-scale management.

📝 Abstract

LLM agents are increasingly deployed on long-horizon tasks requiring sustained reasoning over extended interaction histories. Realizing this at scale requires agents to persistently store, retrieve, and update their own memory across sessions. A rich ecosystem of agent memory systems has emerged spanning flat retrieval, LLM-mediated extraction, consolidating fact stores, and agentic control flows. Yet, their system-level behavior remains uncharacterized. We present the first systems characterization of agent memory. First, we introduce a system-oriented taxonomy classifying agent memory systems along four axes. Second, we build a phase-aware profiling harness attributing cost to construction, retrieval, and generation. Third, we characterize ten representative systems across two benchmark suites, uncovering how design choices shift cost across the write and read paths. Finally, we derive 10 system recommendations covering construction scheduling, capability floors, amortization via query volume, freshness-latency tradeoffs, and fleet-scale management.
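The phase-aware attribution described in the abstract can be illustrated with a minimal sketch. This is not the authors' harness: the class name `PhaseProfiler`, its methods, and the choice of wall-clock time as the cost metric are all assumptions made for illustration. Only the three phase names (construction, retrieval, generation) come from the abstract.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Phase names taken from the abstract; everything else here is hypothetical.
PHASES = ("construction", "retrieval", "generation")


class PhaseProfiler:
    """Accumulates wall-clock time per phase (an illustrative sketch only)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the profiler can be tested deterministically.
        self._clock = clock
        self.seconds = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def phase(self, name):
        """Attribute the time spent inside the `with` block to `name`."""
        if name not in PHASES:
            raise ValueError(f"unknown phase: {name}")
        start = self._clock()
        try:
            yield
        finally:
            # Record the cost even if the wrapped operation raises.
            self.seconds[name] += self._clock() - start
            self.calls[name] += 1

    def breakdown(self):
        """Return each phase's share of the total measured time."""
        total = sum(self.seconds.values())
        return {p: (self.seconds[p] / total if total else 0.0) for p in PHASES}


if __name__ == "__main__":
    prof = PhaseProfiler()
    with prof.phase("construction"):
        memory = [f"fact {i}" for i in range(1000)]   # stand-in for the write path
    with prof.phase("retrieval"):
        hits = [m for m in memory if m.endswith("7")]  # stand-in for the read path
    with prof.phase("generation"):
        answer = " ".join(hits[:3])                    # stand-in for an LLM call
    print(prof.breakdown())
```

Wrapping each stage in its own context manager keeps write-path cost (construction) separate from read-path cost (retrieval and generation), which is the distinction the abstract draws. A fuller harness would presumably track further metrics such as token counts, which this sketch omits.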

Problem

Research questions and friction points this paper is trying to address.

Agent Memory

Long-Horizon Tasks

System Characterization

Stateful Workloads

Memory Systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Agent Memory

Systems Characterization

LLM Agents