Byzantine Cheap Talk: Adversarial Resilience and Topology Effects in LLM Coordination Games

📅 2026-06-05

🤖 AI Summary

This study examines how robustly multiple large language model (LLM) agents coordinate through cheap talk in a four-player Stag Hunt, focusing on Byzantine attacks and communication-topology constraints. Drawing on 720 trials across six model families, the work shows that coordination failure stems primarily from agents' meta-reasoning about hidden information rather than from information loss itself. Two behavioral archetypes replicate across all model cohorts: Defection-Prone models, which switch permanently after betrayal, and Cooperation-Persistent models, which keep cooperating at significant individual cost. Byzantine agents that signal cooperation but defect can trigger sustained exploitation, and the game's unanimity payoff structure prevents groups from recovering coordination. Explicitly restricting communication topology collapses cooperation, whereas applying identical restrictions silently preserves near-perfect cooperation, revealing a latent vulnerability in current LLM-based multi-agent systems.

📝 Abstract

Multi-agent LLM systems increasingly rely on communication protocols for coordination, yet their robustness under adversarial and structural constraints remains poorly understood. Building on prior work showing that cheap-talk channels enable cooperation in LLM coordination games, we investigate two vulnerability classes in a 4-player Stag Hunt across six model families and 720 trials. First, when Byzantine agents signal cooperation but defect, non-Byzantine agents detect the betrayal within one round yet fail to adapt collectively: a substantial fraction continue cooperating despite repeated exploitation, unable to recover coordination due to the game's unanimity payoff structure. Second, explicitly restricting communication topology collapses cooperation, while applying identical restrictions silently preserves near-perfect cooperation. This establishes that coordination failure stems from agents' meta-reasoning about hidden information, not information loss itself. We identify two stable behavioral archetypes that replicate across all model cohorts: Defection-Prone models that switch permanently after betrayal, and Cooperation-Persistent models that continue cooperating at significant individual cost. These findings reveal concrete security vulnerabilities: communication channels can be exploited as adversarial injection vectors, and disclosing network topology to agents can degrade coordination even without any adversary present.
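The recovery problem the abstract attributes to the unanimity payoff structure can be illustrated with a minimal sketch. This is not the paper's implementation: the payoff values (`STAG_PAYOFF`, `HARE_PAYOFF`, `SUCKER_PAYOFF`) and the function names are hypothetical, and the rule that the stag payoff requires all four players to cooperate is an assumption inferred from the phrase "unanimity payoff structure".

```python
from typing import List

# Hypothetical payoff values chosen for illustration only;
# the abstract does not state the actual numbers.
STAG_PAYOFF = 4.0    # every player chose stag
HARE_PAYOFF = 2.0    # safe payoff for choosing hare
SUCKER_PAYOFF = 0.0  # chose stag, but at least one player did not


def payoffs(actions: List[str]) -> List[float]:
    """Per-player payoffs for one round; each action is 'stag' or 'hare'.

    Assumed unanimity rule: stag pays off only if all players choose it.
    """
    all_stag = all(a == "stag" for a in actions)
    result = []
    for a in actions:
        if a == "hare":
            result.append(HARE_PAYOFF)
        elif all_stag:
            result.append(STAG_PAYOFF)
        else:
            result.append(SUCKER_PAYOFF)
    return result


def play_round(honest_actions: List[str], byzantine_count: int) -> List[float]:
    """Byzantine agents announce 'stag' in cheap talk but always play 'hare'."""
    actions = list(honest_actions) + ["hare"] * byzantine_count
    return payoffs(actions)


if __name__ == "__main__":
    # No adversary: unanimous cooperation earns the stag payoff.
    print(payoffs(["stag"] * 4))
    # One Byzantine agent: every honest cooperator gets the sucker payoff.
    print(play_round(["stag"] * 3, byzantine_count=1))
    # An honest agent switching to hare helps only itself.
    print(play_round(["hare", "stag", "stag"], byzantine_count=1))
```

Under these assumptions, a single persistent defector makes the stag payoff unreachable no matter what the three honest agents do, so an agent that keeps cooperating pays the sucker cost every round, while an agent that switches to hare secures only the safe payoff.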

Problem

Research questions and friction points this paper is trying to address.

Byzantine agents

LLM coordination

cheap talk

communication topology

adversarial resilience

Innovation

Methods, ideas, or system contributions that make the work stand out.

Byzantine resilience

cheap talk

LLM coordination