Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System

📅 2024-10-10

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

To address core limitations of LLM-based multi-agent systems (MAS), including inefficient inter-agent communication, poor scalability, and the lack of effective parameter-updating optimization methods, this paper proposes Optima, a framework built on an iterative generate-rank-select-train loop that jointly optimizes task performance, token efficiency, and communication readability. It introduces a multi-objective reward function and an MCTS-inspired DPO data generation method that treats conversation turns as tree nodes to explore diverse interaction paths. The authors compare SFT, DPO, and hybrid SFT-DPO training. Evaluated on information-asymmetric question answering and complex reasoning tasks with Llama 3 8B, Optima achieves up to a 2.8x performance gain over single-agent and vanilla MAS baselines while using less than 10% of the tokens on tasks requiring heavy information exchange, and it improves inference-time scaling behavior.
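
The generate-rank-select-train loop described above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the function names (`rank_and_select`, `run_iteration`), the `generate`, `reward_fn`, and `train` callables, and the top-k selection rule are hypothetical and are not taken from the paper's code.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]


def rank_and_select(
    trajectories: List[str],
    reward_fn: Callable[[str], float],
    top_k: int,
) -> List[Scored]:
    """Score every sampled trajectory and keep the top_k by reward."""
    scored = [(t, reward_fn(t)) for t in trajectories]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def run_iteration(
    generate: Callable[[str, int], str],    # hypothetical: samples one dialogue
    reward_fn: Callable[[str], float],      # hypothetical: scores a dialogue
    train: Callable[[List[Scored]], None],  # hypothetical: updates the model
    tasks: List[str],
    samples_per_task: int = 4,
    top_k: int = 1,
) -> List[Scored]:
    """One generate -> rank -> select -> train pass over a batch of tasks."""
    selected: List[Scored] = []
    for task in tasks:
        candidates = [generate(task, i) for i in range(samples_per_task)]
        selected.extend(rank_and_select(candidates, reward_fn, top_k))
    train(selected)  # the selected data would feed SFT or DPO in a real system
    return selected
```

In an iterative setup, `run_iteration` would be called repeatedly, with each round sampling from the model produced by the previous round.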

📝 Abstract

Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving, yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods. We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness in LLM-based MAS through LLM training. Optima employs an iterative generate, rank, select, and train paradigm with a reward function balancing task performance, token efficiency, and communication readability. We explore various RL algorithms, including Supervised Fine-Tuning, Direct Preference Optimization, and their hybrid approaches, providing insights into their effectiveness-efficiency trade-offs. We integrate Monte Carlo Tree Search-inspired techniques for DPO data generation, treating conversation turns as tree nodes to explore diverse interaction paths. Evaluated on common multi-agent tasks, including information-asymmetric question answering and complex reasoning, Optima shows consistent and substantial improvements over single-agent baselines and vanilla MAS based on Llama 3 8B, achieving up to 2.8x performance gain with less than 10% tokens on tasks requiring heavy information exchange. Moreover, Optima's efficiency gains open new possibilities for leveraging inference-compute more effectively, leading to improved inference-time scaling laws. By addressing fundamental challenges in LLM-based MAS, Optima shows the potential towards scalable, efficient, and effective MAS (https://chenweize1998.github.io/optima-project-page).
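
As a rough illustration of a reward that balances task performance, token efficiency, and communication readability, the sketch below combines the three terms linearly. The exact functional form, the weights `w_token` and `w_read`, the `max_tokens` normalizer, and the assumption that readability is already a score in [0, 1] are all illustrative choices, not the formula used in the paper.

```python
def combined_reward(
    task_score: float,      # task metric for the dialogue, e.g. F1 or accuracy in [0, 1]
    num_tokens: int,        # tokens exchanged between agents in the dialogue
    max_tokens: int,        # assumed normalizer, e.g. longest dialogue in the batch
    readability: float,     # assumed readability score in [0, 1], higher is better
    w_token: float = 0.5,   # hypothetical weight on the token penalty
    w_read: float = 0.1,    # hypothetical weight on the readability bonus
) -> float:
    """Reward task success, penalize token use, and reward readable messages."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    token_cost = min(num_tokens / max_tokens, 1.0)  # clamp the penalty to [0, 1]
    return task_score - w_token * token_cost + w_read * readability
```

With these illustrative weights, two dialogues that reach the same task score are ranked by brevity: `combined_reward(1.0, 20, 100, 1.0)` is higher than `combined_reward(1.0, 90, 100, 1.0)`.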

Problem

Research questions and friction points this paper is trying to address.

Low communication efficiency and poor scalability in LLM-based multi-agent systems.

Lack of effective parameter-updating optimization methods for improving task effectiveness.

Unclear effectiveness-efficiency trade-offs among SFT, DPO, and hybrid training approaches.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative generate-rank-select-train paradigm

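Reward function balancing task performance, token efficiency, and communication readability

Monte Carlo Tree Search-inspired DPO data generation, with conversation turns as tree nodes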
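
One way to picture the tree-based DPO data idea is sketched below: each conversation turn is a node, and sibling turns that share the same dialogue prefix but differ in reward form (chosen, rejected) preference pairs. The `TurnNode` class, the `min_gap` threshold, and the pairing rule are assumptions for illustration only. The sketch omits the search itself (node selection and expansion) and is not the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Pair = Tuple[Tuple[str, ...], str, str]  # (shared context, chosen turn, rejected turn)


@dataclass
class TurnNode:
    text: str                      # the message produced at this turn
    reward: float = 0.0            # assumed: reward estimate for dialogues through this turn
    children: List["TurnNode"] = field(default_factory=list)


def preference_pairs(
    node: TurnNode,
    min_gap: float = 0.1,          # hypothetical minimum reward gap for a usable pair
    prefix: Tuple[str, ...] = (),
) -> List[Pair]:
    """Collect preference pairs from sibling turns that share the same context."""
    context = prefix + (node.text,)
    pairs: List[Pair] = []
    ranked = sorted(node.children, key=lambda n: n.reward, reverse=True)
    for i, better in enumerate(ranked):
        for worse in ranked[i + 1:]:
            if better.reward - worse.reward >= min_gap:
                pairs.append((context, better.text, worse.text))
    for child in node.children:    # recurse so deeper turns also contribute pairs
        pairs.extend(preference_pairs(child, min_gap, context))
    return pairs
```

Because both turns in a pair share the same context, the preference signal isolates the effect of a single turn rather than of a whole dialogue.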

🔎 Similar Papers

Adaptive In-conversation Team Building for Language Model Agents

2024-05-29 · arXiv.org · Citations: 5

MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent System Without Predefined SOPs

2024-08-19 · Citations: 0