AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes AgentArk, a novel framework that addresses the deployment bottlenecks of multi-agent systems—such as high computational overhead and error propagation—by distilling their collaborative reasoning capabilities into a single large language model (LLM) during training. This approach endows a solitary LLM with implicit multi-agent-style reasoning and self-correction abilities without requiring runtime agent interaction. The method employs a three-tiered distillation strategy—reasoning-augmented fine-tuning, trajectory-augmented data construction, and process-aware distillation—evaluated across diverse models and tasks. Experimental results demonstrate that the distilled single-agent model achieves performance on complex tasks comparable to, or even surpassing, that of the original multi-agent system, while maintaining inference efficiency and exhibiting enhanced robustness and generalization.

📝 Abstract
While large language model (LLM) multi-agent systems achieve superior reasoning performance through iterative debate, practical deployment is limited by their high computational cost and error propagation. This paper proposes AgentArk, a novel framework to distill multi-agent dynamics into the weights of a single model, effectively transforming explicit test-time interactions into implicit model capabilities. This equips a single agent with the intelligence of multi-agent systems while remaining computationally efficient. Specifically, we investigate three hierarchical distillation strategies across various models, tasks, scales, and scenarios: reasoning-enhanced fine-tuning; trajectory-based augmentation; and process-aware distillation. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting the strong reasoning and self-correction performance of multiple agents. They further demonstrate enhanced robustness and generalization across diverse reasoning tasks. We hope this work can shed light on future research on efficient and robust multi-agent development. Our code is at https://github.com/AIFrontierLab/AgentArk.
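To make the "trajectory-based augmentation" idea concrete, the sketch below shows one plausible way a multi-agent debate log could be serialized into a supervised fine-tuning example for a single model. This is a minimal illustration under stated assumptions: the function name `debate_to_sft_example`, the turn format, and the prompt/completion record layout are all hypothetical, not the paper's actual data pipeline.

```python
# Hypothetical sketch: collapse a multi-agent debate trajectory into one
# single-agent SFT example, so the distilled model learns to reproduce the
# propose-critique-revise pattern implicitly. Record layout is illustrative.

def debate_to_sft_example(question, turns, final_answer):
    """Serialize debate turns (agent name, message) into a single reasoning
    trace paired with the question, in prompt/completion form."""
    trace_lines = []
    for round_idx, (agent, message) in enumerate(turns, start=1):
        trace_lines.append(f"[Round {round_idx} | {agent}] {message}")
    completion = "\n".join(trace_lines) + f"\nFinal answer: {final_answer}"
    return {"prompt": question, "completion": completion}

example = debate_to_sft_example(
    "What is 17 * 24?",
    [("Solver", "17 * 24 = 408."),
     ("Critic", "Check: 17 * 20 + 17 * 4 = 340 + 68 = 408. Correct.")],
    "408",
)
```

Fine-tuning on records like `example` shifts the debate from test-time orchestration into the model's weights, which is the efficiency trade the abstract describes; the actual AgentArk data construction may differ substantially.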
Problem

Research questions and friction points this paper is trying to address.

multi-agent systems
computational cost
error propagation
large language models
reasoning performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent distillation
single LLM agent
reasoning enhancement
process-aware distillation
computational efficiency