🤖 AI Summary
This paper proposes Galaxy, a framework that addresses three open challenges for large language model (LLM)-based intelligent personal assistants (IPAs): proactive behavior, privacy preservation, and self-evolution. It introduces the Cognition Forest, a cognition-oriented semantic architecture that unifies cognitive modeling with system design. It then establishes a dual-agent collaboration mechanism: the generative agent KoRa handles proactive service delivery and responsive interaction, while the meta-agent Kernel uses metacognitive capabilities to support on-device self-evolution and privacy-preserving inference. The framework is optimized end to end for real-world deployment. Evaluation across multiple benchmarks shows that Galaxy significantly outperforms state-of-the-art IPA approaches. Ablation studies validate the contribution of each component, and real-world user interactions confirm its practical efficacy, robustness, and usability in privacy-sensitive scenarios.
📝 Abstract
Intelligent personal assistants (IPAs) such as Siri and Google Assistant are designed to enhance human capabilities and perform tasks on behalf of users. The emergence of LLM agents brings new opportunities for the development of IPAs. While responsive capabilities have been widely studied, proactive behaviors remain underexplored. Designing an IPA that is proactive, privacy-preserving, and capable of self-evolution remains a significant challenge, and meeting it depends on the cognitive architecture of the underlying LLM agents. This work proposes Cognition Forest, a semantic structure designed to align cognitive modeling with system-level design. Rather than treating cognitive architecture and system design separately, we unify them into a self-reinforcing loop. Based on this principle, we present Galaxy, a framework that supports multidimensional interactions and personalized capability generation. Two cooperative agents are implemented based on Galaxy: KoRa, a cognition-enhanced generative agent that supports both responsive and proactive skills; and Kernel, a meta-cognition-based meta-agent that enables Galaxy's self-evolution and privacy preservation. Experimental results show that Galaxy outperforms multiple state-of-the-art baselines. Ablation studies and real-world interaction cases validate the effectiveness of Galaxy.