🤖 AI Summary
Existing deep research systems rely on handcrafted prompts and static architectures, making their optimization fragile, costly, and inefficient. This work proposes a self-optimizing multi-agent system that autonomously improves its planning, retrieval, and synthesis capabilities on complex information needs through self-play training and dynamic prompt-ensemble search. By eliminating manual prompt engineering, the system continuously evolves its own effectiveness without human intervention. Experimental results show that the proposed approach matches or surpasses expert-designed prompts on deep research tasks, substantially improving both answer quality and development efficiency.
📝 Abstract
Given a user's complex information need, a multi-agent Deep Research system iteratively plans, retrieves, and synthesizes evidence across hundreds of documents to produce a high-quality answer. In one possible architecture, an orchestrator agent coordinates the process, while parallel worker agents execute tasks. Current Deep Research systems, however, often rely on hand-engineered prompts and static architectures, making improvement brittle, expensive, and time-consuming. We therefore explore various multi-agent optimization methods to show that enabling agents to self-play and explore different prompt combinations can produce high-quality Deep Research systems that match or outperform expert-crafted prompts.
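To make the "explore different prompt combinations" idea concrete, here is a minimal sketch of a prompt-ensemble search. Everything in it is an assumption for illustration: the prompt pool, role names, and the `evaluate` stub stand in for the paper's actual prompt library and self-play evaluation, which would run the full multi-agent pipeline on held-out queries and score the answers.

```python
import itertools
import random

# Hypothetical candidate prompt variants for each agent role.
# These strings are illustrative placeholders, not the paper's prompts.
PROMPT_POOL = {
    "planner": ["Decompose the query into sub-tasks.", "List retrieval steps in order."],
    "worker": ["Retrieve and quote supporting evidence.", "Summarize each document found."],
    "synthesizer": ["Merge all findings into one answer.", "Write a cited research report."],
}

def evaluate(combo, rng):
    """Stand-in for self-play evaluation: a real system would execute the
    orchestrator/worker pipeline with this prompt combination and score
    answer quality; here we just return a pseudo-random score."""
    return rng.random()

def search_prompt_ensemble(pool, seed=0):
    """Exhaustively score every prompt combination and return the best one."""
    rng = random.Random(seed)
    roles = sorted(pool)
    best_combo, best_score = None, float("-inf")
    # Cartesian product over per-role prompt variants (2*2*2 = 8 combos here).
    for variants in itertools.product(*(pool[r] for r in roles)):
        combo = dict(zip(roles, variants))
        score = evaluate(combo, rng)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score

best, score = search_prompt_ensemble(PROMPT_POOL)
```

In practice an exhaustive product is only feasible for small pools; a real system would likely use sampled or bandit-style search driven by self-play feedback rather than full enumeration.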