SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

📅 2026-03-24
🤖 AI Summary
This work addresses the novel security threats introduced by agentic AI systems—arising from their integration of tool use, retrieval-augmented generation, and multi-agent collaboration—which extend beyond the scope of traditional AI safety concerns. It presents the first systematic characterization of the attack surface and threat model for such systems, establishing a taxonomy encompassing prompt injection, knowledge-base poisoning, tool misuse, and cross-agent manipulation. The study further introduces quantifiable security metrics, including Unsafe Action Rate and Privilege Escalation Distance. Through a systematic review of over 20 studies from 2023 to 2025, it evaluates the efficacy of existing defenses such as input sanitization, sandboxing, and access control, exposing critical limitations. Building on these insights, the paper proposes a holistic, full-lifecycle security framework spanning design, runtime, and incident response, accompanied by a phased deployment roadmap.

📝 Abstract
Recent AI systems combine large language models with tools, external knowledge via retrieval-augmented generation (RAG), and even autonomous multi-agent decision loops. This agentic AI paradigm greatly expands capabilities, but also vastly enlarges the attack surface. In this systematization, we map out the trust boundaries and security risks of agentic LLM-based systems. We develop a comprehensive taxonomy of attacks spanning prompt-level injections, knowledge-base poisoning, tool/plug-in exploits, and multi-agent emergent threats. Through a detailed literature review, we synthesize evidence from 2023-2025, including more than 20 peer-reviewed and archival studies, industry reports, and standards. We find that agentic systems introduce new vectors for indirect prompt injection, code execution exploits, RAG index poisoning, and cross-agent manipulation that go beyond traditional AI threats. We define attacker models and threat scenarios, and propose metrics (e.g., Unsafe Action Rate, Privilege Escalation Distance) to evaluate security posture. Our survey examines defenses such as input sanitization, retrieval filters, sandboxes, access control, and "AI guardrails," assessing their effectiveness and pointing out the areas where protection is still lacking. To assist practitioners, we outline defensive controls and provide a phased security checklist for deploying agentic AI (covering design-time hardening, runtime monitoring, and incident response). Finally, we outline open research challenges in secure autonomous AI (robust tool APIs, verifiable agent behavior, supply-chain safeguards) and discuss ethical and responsible disclosure practices. We systematize recent findings to help researchers and engineers understand and mitigate security risks in agentic AI.
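The paper's exact metric definitions are not reproduced on this page, but a minimal sketch illustrates how metrics like Unsafe Action Rate and Privilege Escalation Distance might be computed over a logged agent action trace. All names and fields below (`Action`, `privilege_level`, the policy flag `unsafe`) are illustrative assumptions, not the paper's formalization:

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    privilege_level: int  # 0 = the agent's granted baseline; higher = more privileged
    unsafe: bool          # flagged by some policy oracle (assumed available)

def unsafe_action_rate(trace: list[Action]) -> float:
    """Fraction of actions in a trace flagged as unsafe (0.0 for an empty trace)."""
    if not trace:
        return 0.0
    return sum(a.unsafe for a in trace) / len(trace)

def privilege_escalation_distance(trace: list[Action], baseline: int = 0) -> int:
    """Largest privilege level reached above the agent's granted baseline."""
    return max((a.privilege_level - baseline for a in trace), default=0)

# Example trace: one unsafe shell execution two privilege levels above baseline.
trace = [
    Action("read_file", 0, False),
    Action("shell_exec", 2, True),
    Action("send_email", 1, False),
]
print(unsafe_action_rate(trace))             # 1/3 ≈ 0.333
print(privilege_escalation_distance(trace))  # 2
```

In practice both metrics depend on a policy that labels actions and privilege levels; the sketch treats that labeling as given, since defining it is part of what the survey's threat-model discussion addresses.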
Problem

Research questions and friction points this paper is trying to address.

Agentic AI
Attack Surface
Security Risks
LLM-based Systems
Autonomous Agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic AI
Attack Surface
Prompt Injection
Retrieval-Augmented Generation (RAG)
Multi-Agent Security