🤖 AI Summary
Problem: Large language model (LLM)-based agents struggle to make autonomous decisions in real-time human-agent collaboration, both because of inherent inference latency and because human strategies vary and must be inferred.
Method: This paper proposes DPT-Agent, a language agent framework that integrates Dual Process Theory (DPT). It combines two cooperating components: System 1, which gives intuitive, rapid responses grounded in a finite-state machine (FSM) and code-as-policy, and System 2, which performs deliberative, slower reasoning using Theory of Mind (ToM) and asynchronous reflection. The framework decouples perception, decision-making, and execution, so the agent can infer a human partner's strategy dynamically without explicit instructions.
Contribution/Results: In experiments with rule-based agents and human collaborators, DPT-Agent shows significant improvements over mainstream LLM-based frameworks on real-time simultaneous collaboration tasks. To the authors' knowledge, it is the first language agent framework to achieve successful real-time simultaneous human-AI collaboration autonomously.
📝 Abstract
Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent System 1 and System 2 methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration. DPT-Agent's System 1 uses a finite-state machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent's System 2 integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and make reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. The code for DPT-Agent is available at https://github.com/sjtu-marl/DPT-Agent.
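The split between a fast System 1 and a slow, asynchronous System 2 can be illustrated with a minimal sketch. This is not the DPT-Agent implementation: the state names, the `TRANSITIONS` table, the `System1` class, and the heuristic in `infer_human_intent` (a stand-in for LLM-based Theory of Mind) are all hypothetical and chosen only to show the control flow. The idea is that System 1 keeps stepping a finite-state machine with whatever policy function it currently holds, while System 2 reflects in a background thread and swaps in a new policy when it finishes.

```python
import threading
from typing import Callable, Dict, List

# Hypothetical FSM: each state maps to the states it may move to.
TRANSITIONS: Dict[str, List[str]] = {
    "idle": ["prepare", "deliver"],
    "prepare": ["deliver", "idle"],
    "deliver": ["idle"],
}

Policy = Callable[[str, dict], str]


def default_policy(state: str, obs: dict) -> str:
    """Code-as-policy: an ordinary function that proposes the next state."""
    if state == "idle":
        return "prepare"
    if state == "prepare":
        return "deliver" if obs.get("item_ready") else "prepare"
    return "idle"


class System1:
    """Fast path: steps the FSM without waiting for any slow reasoning."""

    def __init__(self, policy: Policy) -> None:
        self.state = "idle"
        self._policy = policy
        self._lock = threading.Lock()

    def set_policy(self, policy: Policy) -> None:
        with self._lock:
            self._policy = policy

    def step(self, obs: dict) -> str:
        with self._lock:
            policy = self._policy
        proposed = policy(self.state, obs)
        # The FSM keeps the agent controllable: illegal proposals are ignored.
        if proposed == self.state or proposed in TRANSITIONS[self.state]:
            self.state = proposed
        return self.state


def infer_human_intent(history: List[dict]) -> str:
    """Stand-in for Theory of Mind (assumption: a simple majority heuristic)."""
    deliveries = sum(1 for h in history if h.get("human_action") == "deliver")
    return "human_delivers" if deliveries * 2 > len(history) else "human_prepares"


def reflect(history: List[dict]) -> Policy:
    """Slow path: choose a policy that complements the inferred human intent."""
    if infer_human_intent(history) == "human_delivers":
        def keep_preparing(state: str, obs: dict) -> str:
            return "prepare" if state in ("idle", "prepare") else "idle"
        return keep_preparing
    return default_policy


def system2_worker(agent: System1, history: List[dict], done: threading.Event) -> None:
    """Runs in the background and updates System 1 when reflection completes."""
    agent.set_policy(reflect(list(history)))
    done.set()


agent = System1(default_policy)
history = [{"human_action": "deliver"}] * 3
done = threading.Event()
worker = threading.Thread(target=system2_worker, args=(agent, history, done))
worker.start()
first = agent.step({"item_ready": False})  # System 1 acts without waiting
done.wait()
worker.join()
after = agent.step({"item_ready": True})   # now driven by the reflected policy
```

In this sketch the first step never blocks on reflection, and once System 2 has concluded that the human is handling deliveries, the agent stays in `prepare` even when an item is ready. A real system would replace the heuristic with LLM calls and run reflection continuously rather than once.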