From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium

📅 2025-06-09
🤖 AI Summary
Multi-LLM collaborative reasoning often suffers from high computational overhead and a lack of convergence guarantees. This paper pioneers modeling multi-agent LLM coordination as an incomplete-information Bayesian game, grounded theoretically in Bayesian Nash Equilibrium (BNE), and proposes ECON, a hierarchical reinforcement learning framework that achieves provable convergence and a tight regret bound without requiring frequent inter-agent communication. ECON integrates probabilistic belief modeling, distributed policy optimization, and Bayesian game analysis, enabling flexible integration of heterogeneous LLMs. Evaluated on six challenging reasoning and planning benchmarks, ECON achieves an average improvement of 11.2% over state-of-the-art collaborative paradigms. The implementation is publicly available.

πŸ“ Abstract
Multi-agent frameworks can substantially boost the reasoning power of large language models (LLMs), but they typically incur heavy computational costs and lack convergence guarantees. To overcome these challenges, we recast multi-LLM coordination as an incomplete-information game and seek a Bayesian Nash equilibrium (BNE), in which each agent optimally responds to its probabilistic beliefs about the strategies of others. We introduce Efficient Coordination via Nash Equilibrium (ECON), a hierarchical reinforcement-learning paradigm that marries distributed reasoning with centralized final output. Under ECON, each LLM independently selects responses that maximize its expected reward, conditioned on its beliefs about co-agents, without requiring costly inter-agent exchanges. We mathematically prove that ECON attains a markedly tighter regret bound than non-equilibrium multi-agent schemes. Empirically, ECON outperforms existing multi-LLM approaches by 11.2% on average across six benchmarks spanning complex reasoning and planning tasks. Further experiments demonstrate ECON's ability to flexibly incorporate additional models, confirming its scalability and paving the way toward larger, more powerful multi-LLM ensembles. The code is publicly available at: https://github.com/tmlr-group/ECON.
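The core idea in the abstract, each agent best-responding to its beliefs about co-agents instead of exchanging messages, can be illustrated with a toy Bayesian game. The sketch below is not ECON and uses no LLMs or reinforcement learning: the two-answer action set, the prior, the reward values, and every function name are hypothetical choices made only for illustration. Two symmetric agents each receive a private signal ("A" or "B"), hold a shared prior over the other agent's signal, and repeatedly best-respond until the strategy stops changing, which is a fixed point of the best-response map and hence a symmetric Bayesian Nash equilibrium of this toy game.

```python
from typing import Dict, List

# Hypothetical toy setup (not from the paper).
ACTIONS: List[str] = ["A", "B"]           # candidate answers an agent can output
TYPES: List[str] = ["A", "B"]             # private signal: the answer an agent leans toward
PRIOR: Dict[str, float] = {"A": 0.7, "B": 0.3}  # belief over the co-agent's signal
AGREE_REWARD = 1.0                        # reward when both agents give the same answer
OWN_SIGNAL_BONUS = 0.3                    # extra reward for following one's own signal

Strategy = Dict[str, str]                 # maps private signal -> chosen answer


def expected_reward(action: str, own_type: str, other_strategy: Strategy) -> float:
    """Expected reward of `action`, averaging over beliefs about the co-agent's signal."""
    agree_prob = sum(p for t, p in PRIOR.items() if other_strategy[t] == action)
    bonus = OWN_SIGNAL_BONUS if action == own_type else 0.0
    return AGREE_REWARD * agree_prob + bonus


def best_response(other_strategy: Strategy) -> Strategy:
    """For each private signal, pick the answer with the highest expected reward."""
    return {
        t: max(ACTIONS, key=lambda a: expected_reward(a, t, other_strategy))
        for t in TYPES
    }


def find_symmetric_bne(start: Strategy, max_iters: int = 20) -> Strategy:
    """Iterate best responses until a fixed point (a symmetric BNE) is reached."""
    strategy = dict(start)
    for _ in range(max_iters):
        updated = best_response(strategy)
        if updated == strategy:
            return strategy
        strategy = updated
    raise RuntimeError("best-response iteration did not reach a fixed point")


if __name__ == "__main__":
    # Start from the "always follow your own signal" strategy.
    print(find_symmetric_bne({"A": "A", "B": "B"}))
```

Starting from the strategy that always follows the private signal, the iteration settles on answering "A" for both signals: with a 0.7 prior that the co-agent leans toward "A", agreeing on "A" is worth more than the 0.3 bonus for following a "B" signal. No messages are exchanged at any point; each update uses only the prior and the assumed co-agent strategy.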
Problem

Research questions and friction points this paper is trying to address.

Overcome high computational costs in multi-agent LLM frameworks
Provide convergence guarantees for multi-agent LLM reasoning, which existing frameworks lack
Enhance reasoning power via Bayesian Nash Equilibrium coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Nash equilibrium for multi-agent coordination
Hierarchical reinforcement learning with distributed reasoning
Tighter regret bound than non-equilibrium approaches
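
For readers unfamiliar with the two concepts named in the list above, the standard textbook forms are sketched below. The notation is assumed here and is not taken from the paper, and the paper's own regret definition and bound are not reproduced. A strategy profile is a Bayesian Nash equilibrium when every agent's strategy maximizes its expected utility under its beliefs about the other agents' types, and cumulative regret is commonly measured as the accumulated gap between an optimal value and the value actually achieved in each round.

```latex
% Assumed notation: agent i has private type \theta_i, action a_i, utility u_i,
% strategy \sigma_i, and belief p_i(\theta_{-i} \mid \theta_i) over the others' types.
\sigma_i^{*}(\theta_i) \in \arg\max_{a_i}
  \mathbb{E}_{\theta_{-i} \sim p_i(\cdot \mid \theta_i)}
  \left[ u_i\big(a_i, \sigma_{-i}^{*}(\theta_{-i});\, \theta_i, \theta_{-i}\big) \right]
  \quad \text{for all } i \text{ and all } \theta_i

% A common form of cumulative regret over T rounds, where V^{*} is the optimal
% value and V^{\pi_t} is the value of the joint policy used in round t.
R(T) = \sum_{t=1}^{T} \left( V^{*} - V^{\pi_t} \right)
```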
Authors
Xie Yi
Academy for Engineering and Technology, Fudan University
Zhanke Zhou
PhD@HKBU, Visiting@Stanford University
Machine Learning, Machine Reasoning
Chentao Cao
Ph.D. student, HKBU
Machine Learning, Machine Reasoning
Qiyu Niu
Academy for Engineering and Technology, Fudan University
Tongliang Liu
Director, Sydney AI Centre, University of Sydney & Mohamed bin Zayed University of AI
Machine Learning, Learning with Noisy Labels, Trustworthy Machine Learning
Bo Han
TMLR Group, Department of Computer Science, Hong Kong Baptist University