Preference-Aware Memory Update for Long-Term LLM Agents

📅 2025-10-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM-based agents lack mechanisms to model the dynamic evolution of user preferences in long-term memory; in particular, they fail to balance short-term fluctuations against long-term tendencies during memory updates. To address this, we propose Preference-Aware Memory Updating (PAMU), a novel memory update mechanism that, for the first time, integrates sliding-window averaging with exponential weighted moving averaging, coupled with dense vector representations and similarity-based retrieval. PAMU enables fine-grained, adaptive updating of user preference memories without requiring additional training and can be seamlessly integrated into any LLM agent. Evaluated on five long-horizon interactive tasks from the LoCoMo benchmark, PAMU consistently improves generation quality across five diverse baseline models, demonstrating its effectiveness in sustaining response consistency and personalization throughout extended dialogues.

📝 Abstract
One of the key factors influencing the reasoning capabilities of LLM-based agents is their ability to leverage long-term memory. Integrating long-term memory mechanisms allows agents to make informed decisions grounded in historical interactions. While recent advances have significantly improved the storage and retrieval components (by encoding memory into dense vectors for similarity search or organizing memory as structured knowledge graphs), most existing approaches fall short in memory updating. In particular, they lack mechanisms for dynamically refining preference memory representations in response to evolving user behaviors and contexts. To address this gap, we propose a Preference-Aware Memory Update Mechanism (PAMU) that enables dynamic and personalized memory refinement. By integrating sliding window averages (SW) with exponential moving averages (EMA), PAMU constructs a fused preference-aware representation that captures both short-term fluctuations and long-term user tendencies. We conduct experiments on five task scenarios of the LoCoMo dataset, and the results show that our mechanism significantly improves the output quality of LLMs across five baselines, validating its effectiveness in long-term conversations.
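The SW + EMA fusion described in the abstract can be sketched roughly as below. The paper excerpt does not give concrete formulas, so the update rule, the fusion weight `lam`, the smoothing factor `alpha`, and the window size are illustrative assumptions, not the authors' exact method:

```python
import numpy as np

def sliding_window_avg(embeddings, window=5):
    """Short-term preference signal: mean of the most recent embeddings."""
    recent = embeddings[-window:]
    return np.mean(recent, axis=0)

def exponential_moving_avg(embeddings, alpha=0.3):
    """Long-term preference signal: EMA over the full interaction history."""
    ema = np.asarray(embeddings[0], dtype=float)
    for e in embeddings[1:]:
        ema = alpha * np.asarray(e, dtype=float) + (1 - alpha) * ema
    return ema

def fused_preference(embeddings, window=5, alpha=0.3, lam=0.5):
    """Fuse short-term (SW) and long-term (EMA) signals into one
    preference-aware memory vector; lam trades off the two."""
    sw = sliding_window_avg(embeddings, window)
    ema = exponential_moving_avg(embeddings, alpha)
    return lam * sw + (1 - lam) * ema
```

In a retrieval-augmented agent, the fused vector would replace the stored preference embedding after each interaction and then serve as the query for similarity-based memory retrieval.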
Problem

Research questions and friction points this paper is trying to address.

Dynamic memory updating for long-term LLM agents
Refining preference memory with evolving user behaviors
Capturing short-term fluctuations and long-term user tendencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic preference-aware memory update mechanism
Integrates sliding window and exponential moving averages
Captures short-term fluctuations and long-term user tendencies