Rosetta Memory: Adaptive Memory for Cross-LLM Agents

📅 2026-06-05
🤖 AI Summary
This work addresses the problem that, when users or pipelines switch between large language models (LLMs), memory written by an upstream model is often poorly used by the downstream model that reads it. The authors propose a memory-centric LLM adaptation framework that jointly trains two profile-conditioned operators, one for writing memory and one for reading it, to optimize how memory is stored and presented. To generalize across many LLMs, training uses a minimum-gain sampling curriculum that prioritizes the least-served models, together with a performance-gap reward that scores the operators against a naive memory baseline rather than rewarding the LLM's own capability. On HotpotQA, 2WikiMultihopQA, and MuSiQue, the method consistently outperforms baselines and remains robust when an unseen LLM is substituted.
📝 Abstract
Memory is the key component for transforming a stateless LLM into a persistent, evolving agent through experience accumulation, long-horizon planning, and continual self-improvement. Existing memory systems typically take the LLM as the center and design memory operations tailored to a specific backbone. In practice, however, users frequently switch between LLMs, for example using Claude for coding and GPT for writing across tasks, or routing different steps to different backbones within a single task for cost-effective trade-offs. As a result, memory written by one model often needs to be consumed by another. Making upstream memory effectively adapt to and activate downstream LLMs remains a critical yet underexplored problem. To bridge this gap, we shift the perspective from LLM-centric memory design to memory-centric LLM adaptation. Specifically, we approach the above upstream-downstream memory adaptation problem from both the write and read sides, and design two profile-conditioned operators that are jointly trained to optimize how memory is stored and presented for better task completion. To ensure the learned operators generalize across a broad set of LLMs, we propose a minimum-gain sampling curriculum that prioritizes the least-served LLMs during training. To better measure the operators' actual contribution rather than the LLM's own capability, we design a performance-gap reward that compares against a naive memory baseline. Experiments on HotpotQA, 2WikiMultihopQA, and MuSiQue demonstrate that our model consistently outperforms baselines and remains robust under unseen-model replacement.
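The abstract does not give formulas for the performance-gap reward or the minimum-gain sampling curriculum, so the following is only a minimal sketch of how the two ideas could fit together, not the authors' implementation. It assumes the reward is a plain difference between the task score with the learned operators and the score with a naive memory baseline, and that "prioritizing the least-served LLMs" is realized as a softmax over negative running mean gains. The function names, the `temperature` parameter, and the backbone names are all hypothetical.

```python
import math
import random
from typing import Dict


def performance_gap_reward(score_with_operators: float,
                           score_naive_baseline: float) -> float:
    """Reward = improvement over a naive memory baseline (assumed form).

    Subtracting the baseline score credits the memory operators only for
    the gain they add, not for the downstream LLM's own capability.
    """
    return score_with_operators - score_naive_baseline


def min_gain_sampling_probs(mean_gain: Dict[str, float],
                            temperature: float = 1.0) -> Dict[str, float]:
    """Turn per-LLM running mean gains into sampling probabilities.

    Assumed scheme: softmax over negative gains, so the LLM that currently
    benefits least from the operators is sampled most often.
    """
    if not mean_gain:
        raise ValueError("mean_gain must contain at least one LLM")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    lowest = min(mean_gain.values())
    # Shift by the lowest gain so the largest exponent is exp(0) = 1.
    weights = {name: math.exp(-(gain - lowest) / temperature)
               for name, gain in mean_gain.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def sample_backbone(mean_gain: Dict[str, float],
                    rng: random.Random,
                    temperature: float = 1.0) -> str:
    """Draw the next training backbone according to the curriculum."""
    probs = min_gain_sampling_probs(mean_gain, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


if __name__ == "__main__":
    # Hypothetical running mean gains for three placeholder backbones.
    gains = {"llm_a": 0.10, "llm_b": 0.02, "llm_c": 0.05}
    print(min_gain_sampling_probs(gains, temperature=0.05))
    print(sample_backbone(gains, random.Random(0), temperature=0.05))
    print(performance_gap_reward(0.62, 0.50))
```

In this sketch a lower `temperature` concentrates sampling on the least-served backbone, while a higher one approaches uniform sampling across backbones.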
Problem

Research questions and friction points this paper is trying to address.

cross-LLM memory
memory adaptation
LLM interoperability
adaptive memory
memory transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

memory-centric adaptation
cross-LLM memory
profile-conditioned operators
minimum-gain sampling
performance-gap reward