🤖 AI Summary
Problem: LLM-driven multi-agent systems (MAS) exhibit critical security vulnerabilities, including susceptibility to adversarial attacks, misinformation propagation, and unintended behaviors, which hinder their deployment in safety-critical applications.
Method: This paper proposes a topology-aware proactive defense framework that models inter-agent dialogues as directed graphs, combining graph neural networks with the MAS interaction topology to enable interpretable anomaly detection and structured intervention. It introduces a topology-guided intervention mechanism and a prompt-injection remediation technique, designed for cross-architecture compatibility and plug-and-play integration.
Contribution/Results: Extensive experiments demonstrate that the framework is effective under diverse attack strategies, recovering over 40% of performance under prompt-injection attacks. It remains compatible with mainstream LLM backbones and large-scale MAS deployments, improving system security and deployability without compromising functional integrity.
📝 Abstract
Large Language Model (LLM)-based Multi-agent Systems (MAS) have demonstrated remarkable capabilities in various complex tasks, ranging from collaborative problem-solving to autonomous decision-making. However, as these systems become increasingly integrated into critical applications, their vulnerability to adversarial attacks, misinformation propagation, and unintended behaviors has raised significant concerns. To address this challenge, we introduce G-Safeguard, a topology-guided security lens and treatment for robust LLM-MAS, which leverages graph neural networks to detect anomalies on the multi-agent utterance graph and employs topological intervention for attack remediation. Extensive experiments demonstrate that G-Safeguard: (I) exhibits significant effectiveness under various attack strategies, recovering over 40% of the performance for prompt injection; (II) is highly adaptable to diverse LLM backbones and large-scale MAS; (III) can seamlessly combine with mainstream MAS with security guarantees. The code is available at https://github.com/wslong20/G-safeguard.
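To make the detect-then-intervene idea concrete, here is a minimal sketch of the general pattern (not the paper's actual model): agents are nodes in a directed utterance graph, each agent is scored by one GNN-style hop of message passing over its own and incoming utterance embeddings, and the intervention severs all outgoing edges of agents flagged as anomalous so their messages stop propagating. The aggregation scheme, the linear scorer `w`, and the `defend` function are illustrative assumptions; G-Safeguard trains a real GNN detector on the utterance graph.

```python
import numpy as np

def defend(adj, feats, w, threshold=0.5):
    """Toy topology-guided defense on a multi-agent utterance graph.

    adj[i, j] = 1 means agent i sends a message to agent j.
    feats[i]  = embedding of agent i's utterance (assumed given).
    w         = weights of a stand-in linear anomaly scorer.
    """
    # One message-passing hop: each agent combines its own utterance
    # with the mean of the utterances it receives (its in-neighbors).
    indeg = adj.sum(axis=0, keepdims=True).T.clip(min=1)
    h = feats + (adj.T @ feats) / indeg
    # Sigmoid score per agent; a trained GNN head would replace this.
    scores = 1.0 / (1.0 + np.exp(-(h @ w)))
    flagged = np.where(scores.ravel() > threshold)[0]
    # Topological intervention: cut every outgoing edge of flagged
    # agents so their (potentially malicious) messages cannot spread.
    pruned = adj.copy()
    pruned[flagged, :] = 0
    return flagged.tolist(), pruned

# Three agents: 0 -> 1, 0 -> 2, 1 -> 2; agent 0 carries a
# suspicious utterance embedding along the scorer's first dimension.
adj = np.array([[0, 1, 1], [0, 0, 1], [0, 0, 0]])
feats = np.array([[4.0, 0.0], [-5.0, 0.0], [-5.0, 0.0]])
w = np.array([[1.0], [0.0]])
flagged, pruned = defend(adj, feats, w)
# Agent 0 is flagged and its outgoing edges are removed; the
# benign edge 1 -> 2 is left intact.
```

The key property this illustrates is that the remedy is structural rather than textual: instead of rewriting a compromised agent's output, the defense edits the communication topology itself, which is what makes it pluggable into existing MAS pipelines.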