🤖 AI Summary
This work addresses key challenges in long-term memory modeling for dialogue systems: poor adaptability, high computational overhead, and limited generalization across diverse queries. Drawing on event segmentation theory, the authors propose an approach that automatically detects event boundaries by monitoring shifts in four core elements: characters, time, location, and topic. This enables the construction of an interpretable hyperedge-based index structure, complemented by a boundary-guided memory writing mechanism and a query-adaptive retrieval strategy that dynamically adjusts both the content and depth of retrieval. Notably, the method requires no predefined query categories and balances interpretability, efficiency, and generalization, making it compatible with large language models of varying scales. Evaluated on the LOCOMO dataset, it achieves approximately a 20% improvement in question-answering accuracy over strong baselines while reducing token consumption by 68%, substantially outperforming existing approaches such as HippoRAG2.
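The boundary-guided writing mechanism described above might be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and method names are hypothetical, and extraction of the four elements from each turn (done by an LLM in the real system) is assumed to happen upstream.

```python
from dataclasses import dataclass, field

ELEMENTS = ("person", "time", "location", "topic")

@dataclass
class Segment:
    elements: dict                       # current values of the four elements
    turns: list = field(default_factory=list)

class BoundaryMemory:
    """Sketch: write the running segment whenever any core element shifts."""

    def __init__(self):
        self.segment = None
        self.hyperedges = []             # each entry links element values to a segment

    def observe(self, turn, elements):
        # `elements` is a dict keyed by person/time/location/topic,
        # assumed to be extracted upstream.
        if self.segment is None:
            self.segment = Segment(dict(elements))
        elif any(elements[k] != self.segment.elements[k] for k in ELEMENTS):
            self._write()                # boundary detected: persist segment
            self.segment = Segment(dict(elements))
        self.segment.turns.append(turn)

    def _write(self):
        # A hyperedge jointly indexes the segment under all four elements,
        # giving an interpretable handle for later retrieval.
        self.hyperedges.append((self.segment.elements, self.segment.turns))

    def flush(self):
        if self.segment and self.segment.turns:
            self._write()
            self.segment = None
```

Because writes happen only at boundaries, unchanged stretches of dialogue accrue into one segment instead of triggering repeated summarization, which is how redundant memory operations are avoided.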
📝 Abstract
Long-term memory is critical for dialogue systems that support continuous, sustainable, and personalized interactions. However, existing methods rely on continuous summarization or OpenIE-based graph construction paired with fixed Top-*k* retrieval, leading to limited adaptability across query categories and high computational overhead. In this paper, we propose HingeMem, a boundary-guided long-term memory that operationalizes event segmentation theory to build an interpretable indexing interface via boundary-triggered hyperedges over four elements: person, time, location, and topic. When any such element changes, HingeMem draws a boundary and writes the current segment, thereby reducing redundant operations and preserving salient context. To enable robust and efficient retrieval under diverse information needs, HingeMem introduces query-adaptive retrieval mechanisms that jointly decide (a) *what to retrieve*: determining the query-conditioned routing over the element-indexed memory; and (b) *how much to retrieve*: controlling the retrieval depth based on the estimated query type. Extensive experiments across LLM scales (from 0.6B to production-tier models, *e.g.*, Qwen3-0.6B to Qwen-Flash) on LOCOMO show that HingeMem achieves approximately a 20% relative improvement over strong baselines without query-category specification, while reducing computational cost (a 68% reduction in question-answering token cost compared to HippoRAG2). Beyond advancing memory modeling, HingeMem's adaptive retrieval makes it a strong fit for web applications requiring efficient and trustworthy memory over extended interactions.
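The two retrieval decisions, what to retrieve and how much, can be sketched together as a single planning step. This is a hedged illustration only: the cue words, query types, and depth values below are invented for the example and are not the paper's actual routing or depth-control rules.

```python
# Illustrative routing table: surface cues -> element indexes to query.
# Real query-conditioned routing would use an LLM or learned classifier.
ROUTES = {
    "person":   ("who", "whom", "whose"),
    "time":     ("when", "yesterday", "year"),
    "location": ("where", "place", "city"),
    "topic":    ("what", "why", "about"),
}

# Illustrative retrieval depths per estimated query type.
DEPTH = {"single_hop": 3, "multi_hop": 8}

def plan_retrieval(query: str):
    """Decide which element indexes to search and how deep to go."""
    words = query.lower().split()
    # (a) what to retrieve: route to the element indexes the query touches;
    # fall back to all four elements when no cue matches.
    elements = [e for e, cues in ROUTES.items()
                if any(c in words for c in cues)] or list(ROUTES)
    # (b) how much to retrieve: queries touching several elements are
    # treated as multi-hop and get a larger depth budget.
    qtype = "multi_hop" if len(elements) > 1 else "single_hop"
    return elements, DEPTH[qtype]
```

Bounding the depth per query type, rather than using a fixed Top-*k* everywhere, is what keeps simple lookups cheap while still allowing deeper search for multi-hop questions.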