🤖 AI Summary
Existing bias detection methods struggle to identify latent political biases that are dynamically amplified during authentic interactions in multi-agent LLM systems—particularly failing within echo-chamber-style discussions on polarized topics. This paper introduces the first quantitative bias assessment framework tailored for multi-agent dialogue systems. It integrates role-initialized LLM pairing simulations, dynamic stance tracking, semantic consistency evaluation, and contrastive bias measurement. Our framework reveals, for the first time, systematic stance drift in LLMs during interaction—for instance, pronounced drift away from conservative stances even when agents are initialized with them, consistent with the liberal-leaning bias documented in many LLMs. Empirical analysis across mainstream models (Llama, Claude, GPT) demonstrates consistent amplification of latent political bias; moreover, conventional survey-based methods fail to detect these interaction-induced shifts. By unifying behavioral simulation with fine-grained linguistic and ideological analysis, this work establishes a novel paradigm for bias evaluation in multi-agent LLM systems.
📝 Abstract
Detecting biases in the outputs produced by generative models is essential to reduce the potential risks associated with their application in critical settings. However, most existing methodologies for identifying biases in generated text consider the models in isolation and neglect their contextual applications. In particular, the biases that may arise in multi-agent systems involving generative models remain under-researched. To address this gap, we present a framework designed to quantify biases within multi-agent systems of conversational Large Language Models (LLMs). Our approach involves simulating small echo chambers, where pairs of LLMs, initialized with aligned perspectives on a polarizing topic, engage in discussions. Contrary to expectations, we observe significant shifts in the stance expressed in the generated messages, particularly within echo chambers where all agents initially express conservative viewpoints, in line with the well-documented political bias of many LLMs toward liberal positions. Crucially, the bias observed in the echo-chamber experiment remains undetected by current state-of-the-art bias detection methods that rely on questionnaires. This highlights a critical need for the development of a more sophisticated toolkit for bias detection and mitigation in AI multi-agent systems. The code to perform the experiments is publicly available at https://anonymous.4open.science/r/LLMsConversationalBias-7725.
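The echo-chamber protocol described above can be sketched in a few lines. The sketch below is illustrative only: the stance scale, the stub agent, and the hard-coded drift rate are all assumptions made for demonstration; the paper's actual setup calls real LLMs (Llama, Claude, GPT) with persona prompts and classifies the stance of each generated message.

```python
def make_agent(initial_stance: float):
    """Stub conversational agent on a hypothetical stance scale
    from -1 (liberal) to +1 (conservative). A real agent would be
    an LLM initialized with a persona prompt; the fixed drift rate
    here only illustrates the measurement, not the paper's finding."""
    state = {"stance": initial_stance}

    def reply(message: str) -> str:
        # Hypothetical drift model: each reply moves slightly toward -1.
        state["stance"] = max(-1.0, state["stance"] - 0.05)
        return f"[stance={state['stance']:+.2f}] response to: {message}"

    return state, reply


def run_echo_chamber(initial_stance: float, turns: int = 10):
    """Pair two agents with aligned initial perspectives and record
    the stance expressed at every turn of their discussion."""
    (s1, a1), (s2, a2) = make_agent(initial_stance), make_agent(initial_stance)
    message, trajectory = "opening statement on a polarizing topic", []
    for t in range(turns):
        speaker_state, speaker = (s1, a1) if t % 2 == 0 else (s2, a2)
        message = speaker(message)          # agents alternate replies
        trajectory.append(speaker_state["stance"])
    return trajectory


# A chamber initialized with conservative stances (+1.0):
traj = run_echo_chamber(initial_stance=1.0)
drift = traj[-1] - traj[0]  # negative drift = shift toward the liberal pole
```

Comparing the stance trajectory of such a conservative-initialized chamber against the agents' initial positions is what exposes the drift that questionnaire-based probes miss.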