Beyond Similarity: Trustworthy Memory Search for Personal AI Agents

📅 2026-06-04
📈 Citations: 0
Influential: 0

🤖 AI Summary
Current personal AI agents rely heavily on semantic similarity for long-term memory retrieval, which introduces trustworthiness risks such as cross-domain leakage, sycophancy, tool-call drift, and memory-induced jailbreaks. To address this, the work proposes MemGate, a lightweight, query-conditioned neural gate (9M parameters, 35.1MB) that turns raw similarity search into task-conditioned memory admission. MemGate sits between the vector memory store and the backbone LLM and requires no LLM modification, memory-database rewriting, or inference-time LLM judge. Evaluated on representative memory frameworks (A-Mem, Mem0, MemOS) and the real-world personal-agent environment OpenClaw, across diverse LLM backbones, MemGate reduces memory-induced threats while preserving long-term memory utility.
📝 Abstract
Personal AI agents increasingly rely on long-term memory to provide persistent personalization across sessions. However, existing memory pipelines are largely driven by semantic similarity: memory data close to the current query is retrieved and injected into the model context. This creates a critical trustworthiness gap, since a semantically related memory may still be contextually inappropriate, leading to threats such as cross-domain leakage, sycophancy, tool-call drift, or memory-induced jailbreaks. In this paper, we study memory search as a trust boundary in personal AI agents. We evaluate representative agentic memory frameworks, including A-Mem, Mem0, and MemOS, together with OpenClaw, a real-world personal-agent environment with persistent state and tool-use capability. Our results show that long-term memory is not merely a utility layer, but a durable control channel that can reshape how agents interpret tasks and execute actions, leaving them highly susceptible to the aforementioned threats. To mitigate these vulnerabilities, we propose MemGate, a lightweight and deployable memory plug-in for trustworthy memory search, with only 9M parameters and a 35.1MB footprint. MemGate is inserted between the vector memory store and the backbone LLM, requiring no LLM modification, memory-database rewriting, or inference-time LLM judge. It applies a query-conditioned neural gate to candidate memory representations, turning raw similarity search into task-conditioned memory admission. Across multiple mainstream memory frameworks, real-world agent settings, and diverse LLM backbones, MemGate reduces memory-induced threats while preserving long-term memory utility.
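The difference between raw similarity search and query-conditioned admission can be illustrated with a toy sketch. The abstract does not describe MemGate's architecture or training, so everything below is an assumption made for illustration: the `Memory` type, the function names, the three-dimensional embeddings, and the hand-set gate weights are all hypothetical, and a single logistic unit stands in for the learned neural gate.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Memory:
    text: str
    embedding: List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: Sequence[float], store: List[Memory], k: int) -> List[Memory]:
    """Baseline retrieval: the top-k memories by cosine similarity."""
    ranked = sorted(store, key=lambda m: cosine(query, m.embedding), reverse=True)
    return ranked[:k]


def gate_score(query: Sequence[float], memory: Sequence[float],
               weights: Sequence[float], bias: float) -> float:
    """Hypothetical gate: a logistic unit over query-memory interaction features."""
    z = bias + sum(w * q * m for w, q, m in zip(weights, query, memory))
    return 1.0 / (1.0 + math.exp(-z))


def gated_search(query: Sequence[float], store: List[Memory], k: int,
                 weights: Sequence[float], bias: float,
                 threshold: float = 0.5) -> List[Memory]:
    """Retrieve by similarity, then admit only candidates the gate scores above threshold."""
    candidates = similarity_search(query, store, k)
    return [m for m in candidates
            if gate_score(query, m.embedding, weights, bias) >= threshold]


# Toy embeddings (assumed): dims 0-1 encode topic, dim 2 encodes domain
# (positive = work context, negative = private context).
store = [
    Memory("work: quarterly budget deadline", [1.0, 0.0, 1.0]),
    Memory("private: personal debt details", [1.0, 0.0, -0.3]),
    Memory("unrelated: favourite hiking trail", [0.0, 1.0, 0.0]),
]
query = [1.0, 0.0, 1.0]      # a finance question asked in a work context
weights = [1.0, 1.0, 4.0]    # hand-set, not learned; emphasises domain agreement
bias = -1.0

baseline = [m.text for m in similarity_search(query, store, k=2)]
admitted = [m.text for m in gated_search(query, store, k=2, weights=weights, bias=bias)]
print("similarity only:", baseline)
print("after gating:   ", admitted)
```

In this toy setup the private memory is topically close to the query, so similarity search alone returns it, while the gate rejects it because its domain component disagrees with the query. A deployed gate would learn such interactions from data rather than rely on hand-picked weights.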
Problem

Research questions and friction points this paper is trying to address.

trustworthiness
memory search
personal AI agents
semantic similarity
contextual appropriateness
Innovation

Methods, ideas, or system contributions that make the work stand out.

trustworthy memory search
MemGate
personal AI agents
memory-induced threats
task-conditioned gating