ExpWeaver: LLM Agents Learn from Experience via Latent RAG

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing experience learning methods work in explicit text space, which incurs high token overhead and decouples retrieval from generation. This work proposes ExpWeaver, an end-to-end trainable framework for implicit experience learning: it encodes experiences into the hidden states of a large language model (LLM) and, at each decoding step, retrieves relevant experiences in latent space and integrates them via cross-attention aggregation and gated residual connections. The pipeline needs no standalone RAG module and is optimized with reinforcement learning. ExpWeaver achieves state-of-the-art performance on 12 out of 13 tasks, surpassing the strongest baseline by over 6.8%. Its token efficiency is comparable to non-retrieval baselines, whereas text-based retrieval methods require 1.5–2× more tokens. It also outperforms the strongest baseline in cross-domain transfer by 16.32% in the zero-shot setting and 15.21% in the few-shot setting.

📝 Abstract

Experience learning has achieved promising results in enhancing LLM agent planning and reasoning by integrating past interactions as reusable knowledge. However, existing methods remain confined to explicit text space, retrieving experiences via semantic similarity and concatenating them into the context window, leading to substantial token overhead and a decoupled architecture that separates retrieval from generation. To address these limitations, we propose ExpWeaver, a framework that enables LLM agents to learn from experience via latent retrieval-augmented generation, without requiring a separate RAG module. ExpWeaver encodes experiences using the LLM's own hidden states, retrieves relevant experiences directly in latent space at each decoding step, and integrates them through cross-attention aggregation and gated residual mechanisms. The entire pipeline is optimized end-to-end with reinforcement learning, supporting both generative and ranking tasks. We evaluate ExpWeaver on 13 diverse tasks spanning question answering, reasoning, coding, scientific prediction, and recommendation. Results demonstrate that ExpWeaver achieves state-of-the-art performance on 12 out of 13 tasks, outperforming the strongest baseline by over 6.8%; maintains token efficiency comparable to non-retrieval baselines while text-based retrieval methods require 1.5 to 2 times more tokens; and exhibits superior cross-domain generalization, outperforming the strongest baseline by 16.32% under zero-shot transfer and 15.21% under few-shot transfer. Our code for ExpWeaver is released at https://github.com/ulab-uiuc/ExpWeaver.
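To make the latent integration step more concrete, the following is a minimal sketch of a single decoding step that attends over a bank of experience vectors and fuses the result through a gated residual. It is not the authors' implementation: the function name `latent_experience_step`, the projection matrices `W_q`, `W_k`, `W_v`, `W_g`, the bias `b_g`, and the specific gate form (a sigmoid over the concatenated hidden state and context) are all assumptions chosen for illustration, and the reinforcement learning training loop is omitted.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def latent_experience_step(h, memory, W_q, W_k, W_v, W_g, b_g):
    """One hypothetical fusion step.

    h      : current decoder hidden state, length d
    memory : list of n experience vectors, each length d (assumed to be
             hidden states produced by the same LLM)
    W_q, W_k, W_v : d x d projections (assumed)
    W_g    : d x 2d gate projection, b_g : length-d gate bias (assumed)
    """
    d = len(h)
    q = matvec(W_q, h)
    keys = [matvec(W_k, m) for m in memory]
    values = [matvec(W_v, m) for m in memory]
    # Scaled dot-product scores act as soft retrieval in latent space.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    # Cross-attention aggregation: weighted sum of experience values.
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
    # Gated residual: the gate decides how much experience to mix in.
    gate_in = matvec(W_g, h + context)  # list concatenation, length 2d
    gate = [sigmoid(g + b) for g, b in zip(gate_in, b_g)]
    fused = [hi + gi * ci for hi, gi, ci in zip(h, gate, context)]
    return fused, weights

def rand_matrix(rows, cols, scale=0.1):
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

# Toy usage with random values standing in for real hidden states.
random.seed(0)
d, n = 4, 3
h = [random.gauss(0.0, 1.0) for _ in range(d)]
memory = rand_matrix(n, d, scale=1.0)
W_q, W_k, W_v = rand_matrix(d, d), rand_matrix(d, d), rand_matrix(d, d)
W_g = rand_matrix(d, 2 * d)
b_g = [0.0] * d
fused, weights = latent_experience_step(h, memory, W_q, W_k, W_v, W_g, b_g)
```

In this sketch the fused state has the same dimension as the original hidden state, so no experience text is added to the context window, which is the intuition behind the token efficiency claim. With a strongly negative gate bias the step reduces to passing `h` through unchanged, so the gate can suppress experiences that are not relevant.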

Problem

Research questions and friction points this paper is trying to address.

experience learning

retrieval-augmented generation

latent space

token overhead

decoupled architecture

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent retrieval-augmented generation

experience learning

LLM agents