🤖 AI Summary
In multi-agent systems, LLM-based error attribution suffers from inaccurate agent- and step-level fault localization and inconsistent results. To address these challenges, we propose ECHO—a novel error attribution algorithm featuring three key innovations: (1) a hierarchical context structure that disentangles interaction trajectories by positional encoding; (2) a goal-consistency evaluation criterion that models task completion rather than mere output correctness; and (3) a multi-agent consensus voting mechanism to mitigate bias under strongly coupled interactions. Extensive experiments across diverse complex multi-agent reasoning tasks demonstrate that ECHO significantly outperforms baselines—including one-shot evaluation, stepwise analysis, and binary search—achieving 27.4%–41.8% absolute gains in attribution accuracy for fine-grained reasoning errors and high-dependency scenarios. Moreover, ECHO exhibits strong cross-task generalization capability, validating its robustness and scalability in realistic multi-agent settings.
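The hierarchical context structure in innovation (1) can be illustrated with a minimal sketch. The summary does not specify the actual leveling scheme, so the fixed-size chunking and the `level_size` parameter below are purely hypothetical, meant only to show how positional tagging lets an evaluator reason over coarse segments of a trajectory before drilling into individual steps:

```python
def build_hierarchical_context(trajectory, level_size=3):
    """Group a flat interaction trajectory into positional levels.

    Each message is tagged with its absolute position, then chunked into
    fixed-size levels. Both the chunking strategy and `level_size` are
    illustrative assumptions, not ECHO's published scheme.
    """
    indexed = [(i, msg) for i, msg in enumerate(trajectory)]
    return [indexed[i:i + level_size] for i in range(0, len(indexed), level_size)]

# A toy five-step trajectory split into two levels.
trajectory = ["plan", "search", "summarize", "verify", "answer"]
levels = build_hierarchical_context(trajectory)
# levels[0] == [(0, "plan"), (1, "search"), (2, "summarize")]
```

Retaining absolute positions inside each level is the point of the exercise: a step keeps its global index even when the evaluator only sees one segment at a time.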
📝 Abstract
Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge for debugging and improving collaborative AI systems. Current approaches to pinpointing agent- and step-level failures in interaction traces, whether all-at-once evaluation, step-by-step analysis, or binary search, fall short when analyzing complex failure patterns, struggling with both accuracy and consistency. We present ECHO (Error attribution through Contextual Hierarchy and Objective consensus analysis), a novel algorithm that combines hierarchical context representation, objective analysis-based evaluation, and consensus voting to improve error attribution accuracy. Our approach applies position-based leveling to structure contextual understanding while maintaining objective evaluation criteria, and reaches final conclusions through a consensus mechanism. Experimental results demonstrate that ECHO outperforms existing methods across a range of multi-agent interaction scenarios, showing particular strength in cases involving subtle reasoning errors and complex interdependencies. Our findings suggest that structured, hierarchical context representation, combined with consensus-based objective decision-making, provides a more robust framework for error attribution in multi-agent systems.
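The consensus mechanism the abstract describes can be sketched as a majority vote over independent attribution passes. The abstract does not give the actual aggregation rule, so the `(agent, step)` tuple representation and the simple plurality vote below are assumptions for illustration only:

```python
from collections import Counter

def consensus_attribution(attributions):
    """Aggregate independent fault attributions by plurality vote.

    `attributions` is a list of (agent_name, step_index) tuples, one per
    evaluator pass. Returns the most frequently blamed (agent, step)
    pair together with its agreement ratio. This is a hypothetical
    sketch of a consensus step, not ECHO's published procedure.
    """
    if not attributions:
        raise ValueError("need at least one attribution")
    counts = Counter(attributions)
    (winner, votes), = counts.most_common(1)
    return winner, votes / len(attributions)

# Three independent evaluator passes over the same trajectory; two of
# three blame the "planner" agent at step 2.
votes = [("planner", 2), ("planner", 2), ("executor", 5)]
winner, agreement = consensus_attribution(votes)
# winner == ("planner", 2), agreement == 2/3
```

Reporting the agreement ratio alongside the winner is a natural way to surface the consistency problem the paper targets: a low ratio flags trajectories where single-pass evaluation would have been unreliable.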