Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems

📅 2026-06-07
🤖 AI Summary
This work addresses a security gap in LLM-driven data agents. By combining database access, analytical tools, and multi-step reasoning, these agents exhibit composite vulnerabilities that neither conventional database security nor general-purpose LLM security mechanisms address on their own. The study proposes a layered vulnerability framework that identifies eight data agent-specific risks across interpretation, execution, and policy layers, and introduces an attack taxonomy covering three adversary goals, seven tactics, and fourteen techniques. Using layered security modeling, LLM-generated attack payloads grounded in real database schemas, and red-team evaluations, the authors systematically assess four open-source data agents and two production cloud analytics services. Their analysis reveals substantial security vulnerabilities across current systems and distills four key takeaways for securing data agents.
📝 Abstract
Data agents integrate LLM-driven reasoning with relational data access, executable analytical tools, and multi-step workflow orchestration, making them increasingly central to enterprise analytics. This integration introduces new security vulnerabilities across data resources, database execution, and agent reasoning, recombining concerns from database security and general-purpose LLM-agent security into failure modes that neither line of work captures on its own. To address this gap, we present a systematic security study of data agents. Our contributions are threefold. First, we develop a layered vulnerability framework that identifies eight data agent-specific risks across interpretation, execution, and policy layers. Second, we introduce an attack taxonomy organized by adversary goal, tactic, and technique, covering three goals, seven tactics, and fourteen techniques, and pair it with an LLM-driven payload generation pipeline grounded in real database schemas. Third, we evaluate these attacks on six systems, including four open-source data agents and two production cloud analytics services. Our experiments reveal substantial security vulnerabilities across current systems and yield four key takeaways.
Problem

Research questions and friction points this paper is trying to address.

data agents
security vulnerabilities
LLM-driven systems
database security
adversarial attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

data agents
LLM-driven reasoning
security vulnerabilities
attack taxonomy
payload generation
Kuncan Wang
Nanyang Technological University, Singapore
Ziting Wang
Nanyang Technological University, Singapore
Peizhuo Lv
Research Fellow, Nanyang Technological University
AI Security
Haoyang Li
The Hong Kong Polytechnic University
Guoliang Li
Professor, Tsinghua University
Database, Big Data, Crowdsourcing, Data Cleaning & Integration
Gao Cong
Nanyang Technological University
Data Management, Databases, Data Mining, Spatial Databases
Wei Dong
Nanyang Technological University, Singapore