🤖 AI Summary
This work addresses a security gap in LLM-driven data agents, whose integration of database access, analytical tools, and multi-step reasoning creates composite vulnerabilities that neither conventional database security nor general-purpose LLM-agent security captures on its own. The study proposes a layered vulnerability framework that identifies eight data-agent-specific risks across interpretation, execution, and policy layers, and introduces an attack taxonomy covering three goals, seven tactics, and fourteen techniques. Using attack payloads generated by an LLM-driven pipeline grounded in real database schemas, the authors evaluate four open-source data agents and two production cloud analytics services. The experiments reveal substantial security vulnerabilities across these systems and yield four key takeaways.
📝 Abstract
Data agents integrate LLM-driven reasoning with relational data access, executable analytical tools, and multi-step workflow orchestration, making them increasingly central to enterprise analytics. This integration introduces new security vulnerabilities across data resources, database execution, and agent reasoning, recombining concerns from database security and general-purpose LLM-agent security into failure modes that neither line of work captures on its own. To address this gap, we present a systematic security study of data agents. Our contributions are threefold. First, we develop a layered vulnerability framework that identifies eight data-agent-specific risks across interpretation, execution, and policy layers. Second, we introduce an attack taxonomy organized by adversary goal, tactic, and technique, covering three goals, seven tactics, and fourteen techniques, and pair it with an LLM-driven payload generation pipeline grounded in real database schemas. Third, we evaluate these attacks on six systems: four open-source data agents and two production cloud analytics services. Our experiments reveal substantial security vulnerabilities across current systems and yield four key takeaways.