🤖 AI Summary
Current service robots exhibit limited natural language understanding, rely heavily on predefined commands, and lack proactive collaboration awareness, which hinders their adaptability in dynamic office environments. To address this, we propose PPDR4X, a multi-agent architecture that integrates the LLaMA-3 large language model (LLM), embodied memory retrieval, collaborative intent recognition, and an active help-seeking mechanism, enabling the first LLM-driven, proactive intelligent assistant for real-world office settings. The system establishes a physical-virtual closed loop, supporting end-to-end autonomous collaborative task planning and execution. In live office deployments, it achieves a 92% success rate on complex collaborative tasks, an average response latency under 3.1 seconds, and 87% accuracy in initiating proactive collaboration requests, substantially outperforming baseline approaches. Our core contribution is the first deep integration of LLMs into the embodied robot collaboration loop, shifting from passive responsiveness to proactive human-robot coexistence and establishing a novel paradigm for active, context-aware assistance.
📝 Abstract
The increasing demand for intelligent assistants in human-populated environments has motivated significant research in autonomous robotic systems. Traditional service robots and virtual assistants, however, struggle with real-world task execution due to their limited capacity for dynamic reasoning and interaction, particularly when human collaboration is required. Recent developments in large language models (LLMs) have opened new avenues for improving these systems, enabling more sophisticated reasoning and natural interaction capabilities. In this paper, we introduce AssistantX, an LLM-powered proactive assistant designed to operate autonomously in a physical office environment. Unlike conventional service robots, AssistantX leverages a novel multi-agent architecture, PPDR4X, which provides advanced inference capabilities and comprehensive collaboration awareness. By effectively bridging the gap between virtual operations and physical interactions, AssistantX demonstrates robust performance in managing complex real-world scenarios. Our evaluation highlights the architecture's effectiveness, showing that AssistantX can respond to clear instructions, actively retrieve supplementary information from memory, and proactively seek collaboration from team members to ensure successful task completion. More details and videos can be found at https://assistantx-agent.github.io/AssistantX/.