SID: Multi-LLM Debate Driven by Self Signals

📅 2025-10-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing multi-LLM agent debate (MAD) methods rely on external structures—such as debate graphs or referee LLMs—while neglecting the intrinsic confidence signals inherent in generation, including token-level logit confidence and attention distributions, leading to computational redundancy and performance bottlenecks. This paper proposes SID, a confidence-signal-driven debate framework that integrates model-level confidence and token-level semantic attention into dynamic debate control: it enables high-confidence early exit based on logit confidence and identifies critical semantic units via attention to compress redundant debate content. SID is lightweight, general-purpose, and compatible with diverse large language models and multimodal models. Evaluated on multiple challenging benchmarks, SID reduces token consumption by 32% on average while improving accuracy by 2.1–4.7%, demonstrating that intrinsic confidence signals can effectively guide collaborative multi-agent reasoning.
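To make the two self signals concrete, here is a minimal sketch of a model-level confidence score derived from token logits. The aggregation (mean max-softmax probability) and the threshold value are illustrative assumptions, not SID's actual formulation:

```python
# Illustrative sketch (not SID's exact method): score an agent's answer by the
# mean max-softmax probability over its generated tokens, and let agents whose
# score clears a threshold exit the debate early.
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sequence_confidence(per_token_logits):
    """Average the max softmax probability across all generated tokens."""
    probs = [max(softmax(logits)) for logits in per_token_logits]
    return sum(probs) / len(probs)

def should_exit_early(per_token_logits, threshold=0.9):
    """A high-confidence agent skips further debate rounds."""
    return sequence_confidence(per_token_logits) >= threshold

# Toy example: two tokens with sharply peaked logits yield high confidence.
peaked = [[8.0, 0.1, 0.2], [0.0, 9.0, 1.0]]
print(should_exit_early(peaked))
```

In practice these per-token logits would come from the model's generation outputs rather than being supplied by hand.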

📝 Abstract
Large Language Models (LLMs) have exhibited impressive capabilities across diverse application domains. Recent work has explored Multi-LLM Agent Debate (MAD) as a way to enhance performance by enabling multiple LLMs to discuss and refine responses iteratively. Nevertheless, existing MAD methods predominantly focus on utilizing external structures, such as debate graphs or LLM-as-a-Judge, while neglecting the self signals, such as token logits and attention, that arise during generation. This omission leads to redundant computation and potential performance degradation. In this paper, we shift the focus to the self signals of multi-LLM debate and introduce Self-Signals Driven Multi-LLM Debate (SID), which leverages two types of self signals, model-level confidence and token-level semantic focus, to adaptively guide the debate process. Our approach enables high-confidence agents to exit early at the model level and compresses redundant debate content based on the attention mechanism. We evaluate our method on various LLMs and Multimodal LLMs across multiple challenging benchmarks. Experimental results demonstrate that our method not only outperforms existing MAD techniques in accuracy but also reduces token consumption, highlighting the effectiveness of utilizing self signals in enhancing both the performance and efficiency of multi-agent debate systems. Our code will be available at https://github.com/xuhang2019/SID.
Problem

Research questions and friction points this paper is trying to address.

Existing MAD methods ignore internal self signals such as token logits and attention
This oversight causes redundant computation and potential performance degradation
The paper introduces self signals to guide the debate process adaptively and efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses model-level confidence to let high-confidence agents exit the debate early
Applies token-level attention to compress redundant debate content
Leverages self signals to adaptively guide the debate process
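The second self signal, token-level semantic focus, can be illustrated with a simple sketch: rank the tokens of a debate message by the attention mass they receive and keep only the top fraction. The ranking rule and keep ratio here are illustrative assumptions; the paper's actual compression uses the model's own attention maps:

```python
# Illustrative sketch of attention-guided compression (not SID's exact method):
# keep only the tokens that receive the most attention, in their original order.
def compress_by_attention(tokens, attn_weights, keep_ratio=0.5):
    """Keep the top fraction of tokens ranked by received attention weight."""
    k = max(1, int(len(tokens) * keep_ratio))
    # indices of the k highest-attention tokens
    top = sorted(range(len(tokens)), key=lambda i: attn_weights[i], reverse=True)[:k]
    # restore original token order before returning
    return [tokens[i] for i in sorted(top)]

tokens = ["The", "answer", "is", "42", "because", "of", "arithmetic"]
weights = [0.05, 0.30, 0.05, 0.35, 0.10, 0.05, 0.10]
print(compress_by_attention(tokens, weights, keep_ratio=0.4))
```

In a real pipeline the weights would be aggregated from the model's attention distributions (e.g., averaged over heads and layers) rather than hand-specified.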