🤖 AI Summary
Existing LLM-based agents lack post-deployment self-evolution: they rely heavily on frequent retraining, which entails prohibitive computational costs, and face an inherent trade-off between accuracy and reasoning efficiency. This paper proposes MOBIMEM, a memory-centric agent system that decouples agent evolution from model weights via three specialized, modular memory primitives: Profile, Experience, and Action. It integrates OS-level services—including a lightweight scheduler, action logging/replay, and context-aware anomaly recovery—to enable safe, autonomous evolution. Key innovations include DisGraph indexing for efficient memory retrieval, multi-level templated execution logic, fine-grained action sequence modeling, and the AgentRR mechanism for robust runtime adaptation. Experiments show MOBIMEM achieves 83.1% Profile alignment, reduces retrieval latency to 23.83 ms (280× faster than GraphRAG), improves task success rate by up to 50.3%, and cuts end-to-end latency to one-ninth of prior approaches.
📝 Abstract
Large Language Model (LLM) agents are increasingly deployed to automate complex workflows in mobile and desktop environments. However, current model-centric agent architectures struggle to self-evolve post-deployment: improving personalization, capability, and efficiency typically requires continuous model retraining/fine-tuning, which incurs prohibitive computational overheads and suffers from an inherent trade-off between model accuracy and inference efficiency.
To enable iterative self-evolution without model retraining, we propose MOBIMEM, a memory-centric agent system. MOBIMEM first introduces three specialized memory primitives to decouple agent evolution from model weights: (1) Profile Memory uses a lightweight distance-graph (DisGraph) structure to align with user preferences, resolving the accuracy-latency trade-off in user profile retrieval; (2) Experience Memory employs multi-level templates to instantiate execution logic for new tasks, ensuring capability generalization; and (3) Action Memory records fine-grained interaction sequences, reducing the reliance on expensive model inference. Building upon this memory architecture, MOBIMEM further integrates a suite of OS-inspired services to orchestrate execution: a scheduler that coordinates parallel sub-task execution and memory operations; an agent record-and-replay (AgentRR) mechanism that enables safe and efficient action reuse; and a context-aware exception-handling service that ensures graceful recovery from user interruptions and runtime errors.
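To make the three memory primitives concrete, the sketch below models them as simple Python data classes. This is an illustrative approximation, not the paper's actual implementation: the class names, fields, and the adjacency-map stand-in for the DisGraph index are all assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class ProfileMemory:
    """User-preference store indexed by a lightweight distance graph.

    The DisGraph structure is approximated here as an adjacency map whose
    edge weights stand in for preference 'distances' between entries.
    """
    entries: Dict[str, str] = field(default_factory=dict)   # key -> preference value
    edges: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def retrieve(self, key: str, hops: int = 1) -> Dict[str, str]:
        """Return the entry plus graph neighbors within `hops`, nearest first."""
        result: Dict[str, str] = {}
        frontier: List[Tuple[str, int]] = [(key, 0)]
        seen = set()
        while frontier:
            node, depth = frontier.pop(0)
            if node in seen or node not in self.entries:
                continue
            seen.add(node)
            result[node] = self.entries[node]
            if depth < hops:
                for neighbor, _dist in sorted(self.edges.get(node, []),
                                              key=lambda e: e[1]):
                    frontier.append((neighbor, depth + 1))
        return result

@dataclass
class ExperienceMemory:
    """Multi-level task templates instantiated with concrete slot values."""
    templates: Dict[str, str] = field(default_factory=dict)  # task type -> plan

    def instantiate(self, task_type: str, **slots: str) -> str:
        return self.templates[task_type].format(**slots)

@dataclass
class ActionMemory:
    """Fine-grained UI action sequences recorded for later replay."""
    traces: Dict[str, List[str]] = field(default_factory=dict)  # task id -> actions
```

A minimal usage example, with all values invented for illustration:

```python
profile = ProfileMemory(
    entries={"lang": "es", "theme": "dark"},
    edges={"lang": [("theme", 0.3)]},
)
print(profile.retrieve("lang", hops=1))  # {'lang': 'es', 'theme': 'dark'}

exp = ExperienceMemory(templates={"send_msg": "Open {app}, message {contact}"})
print(exp.instantiate("send_msg", app="WhatsApp", contact="Ana"))
```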
Evaluation on AndroidWorld and the top-50 apps shows that MOBIMEM achieves 83.1% profile alignment with 23.83 ms retrieval time (280× faster than GraphRAG baselines), improves task success rates by up to 50.3%, and reduces end-to-end latency by up to 9× on mobile devices.