From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation

📅 2026-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the risk of irreversible errors in multi-agent debate systems driven by large language models, where erroneous consensus can compromise decision reliability. To mitigate this, the authors propose a novel “conformal social choice” framework that integrates conformal prediction into the post-debate decision layer. The approach aggregates heterogeneous agents’ probabilistic outputs via linear opinion pooling and employs split conformal prediction to construct prediction sets with guaranteed marginal coverage. A hierarchical action policy is introduced to determine whether to execute decisions automatically or escalate to human review. Evaluated on MMLU-Pro across eight domains, the method achieves target coverage within ±1–2% at significance level α=0.05, intercepts 81.9% of erroneous consensus cases, and attains single-element prediction set accuracies of 90.0–96.8%, substantially outperforming conventional consensus mechanisms while effectively balancing safety and automation.
📝 Abstract
Multi-agent debate improves LLM reasoning, yet agreement among agents is not evidence of correctness. When agents converge on a wrong answer through social reinforcement, consensus-based stopping commits that error to an automated action with no recourse. We introduce Conformal Social Choice, a post-hoc decision layer that converts debate outputs into calibrated act-versus-escalate decisions. Verbalized probability distributions from heterogeneous agents are aggregated via a linear opinion pool and calibrated with split conformal prediction, yielding prediction sets with a marginal coverage guarantee: the correct answer is included with probability ≥ 1 − α, without assumptions on individual model calibration. A hierarchical action policy maps singleton sets to autonomous action and larger sets to human escalation. On eight MMLU-Pro domains with three agents (Claude Haiku, DeepSeek-R1, Qwen-3 32B), coverage stays within 1–2 points of the target. The key finding is not that debate becomes more accurate, but that the conformal layer makes its failures actionable: 81.9% of wrong-consensus cases are intercepted at α = 0.05. Because the layer refuses to act on cases where debate is confidently wrong, the remaining conformal singletons reach 90.0–96.8% accuracy (up to 22.1 percentage points above consensus stopping) — a selection effect, not a reasoning improvement. This safety comes at the cost of automation, but the operating point is user-adjustable via α.
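A minimal sketch of the pipeline the abstract describes: pool the agents' verbalized probability vectors, calibrate a split conformal threshold on a held-out set, form a prediction set per question, and act only on singletons. The function names, the uniform agent weighting, and the `1 − p(true label)` nonconformity score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def linear_opinion_pool(agent_probs, weights=None):
    """Weighted average of per-agent probability vectors (linear opinion pool)."""
    P = np.asarray(agent_probs, dtype=float)   # shape: (n_agents, n_classes)
    w = np.full(len(P), 1.0 / len(P)) if weights is None else np.asarray(weights, dtype=float)
    pooled = w @ P
    return pooled / pooled.sum()               # renormalize against rounding drift

def conformal_threshold(cal_probs, cal_labels, alpha=0.05):
    """Split conformal: nonconformity score = 1 - pooled probability of the true label."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    return np.quantile(scores, level, method="higher")

def prediction_set(pooled, qhat):
    """All labels whose nonconformity score falls under the calibrated threshold."""
    return [k for k, p in enumerate(pooled) if 1.0 - p <= qhat]

def act_or_escalate(pred_set):
    """Hierarchical policy: act autonomously on singletons, escalate anything larger."""
    return ("act", pred_set[0]) if len(pred_set) == 1 else ("escalate", pred_set)
```

Lowering α tightens the miscoverage bound but inflates prediction sets, routing more cases to human review — the user-adjustable safety/automation trade-off the abstract describes.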
Problem

Research questions and friction points this paper is trying to address.

multi-agent debate
consensus error
safe automation
actionable failure
calibrated decision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Social Choice
multi-agent debate
conformal prediction
calibrated decision-making
human escalation
Mengdie Flora Wang
AWS Generative AI Innovation Center
Haochen Xie
AWS Generative AI Innovation Center
Guanghui Wang
AWS Generative AI Innovation Center
Aijing Gao
AWS Generative AI Innovation Center
Guang Yang
Applied Scientist, Amazon Web Services
Geostatistics, uncertainty quantification, sensitivity analysis, Bayesian methods
Ziyuan Li
Associate Professor, School of Optics and Photonics, Beijing Institute of Technology
Optoelectronics, semiconductors, nanowires, plasmonics, optical antennas
Qucy Wei Qiu
HSBC Holdings Plc., HSBC Technology Center, China
Fangwei Han
HSBC Holdings Plc., HSBC Technology Center, China
Hengzhi Qiu
HSBC Holdings Plc., HSBC Technology Center, China
Yajing Huang
HSBC Holdings Plc., HSBC Technology Center, China
Bing Zhu
HSBC Holdings Plc., HSBC Technology Center, China
Jae Oh Woo
GenAI Innovation Center - Amazon Web Services (AWS)
Data science, mathematical foundations of deep learning, information theory, stochastic geometry