G-Long: Graph-Enhanced Memory Management for Efficient Long-Term Dialogue Agents

📅 2026-06-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of maintaining consistency in long-term dialogues, where large language models struggle because of limited long-context reasoning and the inefficiency of processing extensive raw text. The authors propose G-Long, a graph-enhanced memory management framework that constructs a knowledge graph by fine-tuning a small language model to extract structured triplets from dialogue history, which are then used for associative retrieval. They further introduce an attention-aware importance scoring mechanism that uses cross-attention signals from a T5 summarizer to prioritize and retain salient memories. The reported results show gains of up to 9.8% in response quality on the MSC dataset and up to 40.8% in retrieval recall on LME, with reduced computational overhead, outperforming the compared state-of-the-art methods.
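To make the idea of triplet-based memory with associative retrieval concrete, here is a minimal sketch. It is not the authors' implementation: the class name `TripletMemory`, the `hops` parameter, and the breadth-first expansion are all assumptions chosen for illustration, and the triplets are entered by hand rather than extracted by a fine-tuned small language model.

```python
from collections import defaultdict


class TripletMemory:
    """Toy graph memory (hypothetical): indexes (subject, relation, object) triplets by entity."""

    def __init__(self):
        # entity -> set of triplets that mention it as subject or object
        self._by_entity = defaultdict(set)

    def add(self, subject, relation, obj):
        triplet = (subject, relation, obj)
        self._by_entity[subject].add(triplet)
        self._by_entity[obj].add(triplet)

    def retrieve(self, query_entities, hops=1):
        """Collect triplets reachable within `hops` expansion steps of the query entities."""
        frontier = set(query_entities)
        seen_entities = set()
        found = set()
        for _ in range(hops):
            next_frontier = set()
            for entity in frontier - seen_entities:
                seen_entities.add(entity)
                # .get avoids creating empty index entries for unknown entities
                for s, r, o in self._by_entity.get(entity, ()):
                    found.add((s, r, o))
                    next_frontier.update((s, o))
            frontier = next_frontier
        return sorted(found)


memory = TripletMemory()
memory.add("user", "owns", "dog")
memory.add("dog", "named", "Biscuit")
memory.add("user", "works_as", "nurse")

# One hop from "dog" returns only facts that mention the dog directly;
# two hops also pull in other facts about the connected entity "user".
one_hop = memory.retrieve(["dog"], hops=1)
two_hop = memory.retrieve(["dog"], hops=2)
```

The point of the graph structure in this sketch is that a query about one entity can surface related facts that share no words with the query, which plain text matching over raw dialogue would miss.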

📝 Abstract

While Large Language Models (LLMs) have advanced open-domain dialogue systems, maintaining long-term consistency remains a challenge due to inherent limitations in long-context reasoning and the inefficiency of processing extensive raw text. Existing approaches typically rely on either unstructured memory storage, which is prone to information loss, or computationally expensive LLMs that incur high latency. To address these limitations, we propose G-Long, a graph-enhanced framework that utilizes a fine-tuned small Language Model (sLM) for structured triplet extraction and associative retrieval, significantly reducing operational costs. Furthermore, we introduce a novel attention-aware importance scoring mechanism that leverages the intrinsic cross-attention signals of a T5 summarizer to identify salient memories. Extensive experiments across diverse benchmarks demonstrate that G-Long achieves state-of-the-art performance in both response generation and memory retrieval, yielding performance gains of up to 9.8% in response quality on MSC and 40.8% in retrieval recall on LME, while significantly reducing computational overhead.
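The attention-aware scoring idea can be illustrated with a small sketch. The abstract does not specify how cross-attention signals are aggregated, so everything here is an assumption: the function names `memory_importance` and `top_k_memories` are hypothetical, the attention matrix is hand-written rather than taken from a T5 summarizer, and the aggregation (attention mass per memory, averaged over generated summary tokens) is one plausible choice rather than the paper's method.

```python
def memory_importance(cross_attention, token_to_memory, num_memories):
    """Aggregate attention mass per memory (assumed aggregation, not the paper's).

    cross_attention: list of rows, one per generated summary token; each row
        holds attention weights over the source tokens.
    token_to_memory: for each source token, the index of the memory it came from.
    """
    totals = [0.0] * num_memories
    for row in cross_attention:
        if len(row) != len(token_to_memory):
            raise ValueError("attention row length must match the source token count")
        for weight, memory_index in zip(row, token_to_memory):
            totals[memory_index] += weight
    # Average over summary tokens so scores do not grow with summary length
    steps = max(len(cross_attention), 1)
    return [total / steps for total in totals]


def top_k_memories(scores, k):
    """Return indices of the k highest-scoring memories, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical weights: 2 summary tokens attending over 4 source tokens,
# where source tokens 0-1 belong to memory 0, token 2 to memory 1, token 3 to memory 2.
attention = [
    [0.5, 0.25, 0.25, 0.0],
    [0.25, 0.25, 0.0, 0.5],
]
scores = memory_importance(attention, token_to_memory=[0, 0, 1, 2], num_memories=3)
kept = top_k_memories(scores, k=2)
```

Under these assumptions, memories whose tokens the summarizer attends to most while generating the summary receive the highest scores, and low-scoring memories become candidates for pruning.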

Problem

Research questions and friction points this paper is trying to address.

long-term dialogue

memory management

consistency

large language models

efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

graph-enhanced memory

structured triplet extraction

attention-aware scoring