Adversarial threat vectors and risk mitigation for retrieval-augmented generation systems

📅 2025-05-28
🏛️ Assurance and Security for AI-enabled Systems 2025
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
RAG systems face three critical security threats: prompt injection, data poisoning, and adversarial query manipulation. To address these, we propose the first structured threat taxonomy specifically designed for RAG, coupled with a risk-quantification–driven defense prioritization mechanism—advancing RAG security from empirical practice to measurable assurance. Methodologically, we integrate input validation, adversarial training, real-time monitoring, and knowledge provenance into an end-to-end RAG security enhancement architecture. Evaluations on mainstream RAG benchmarks demonstrate an average 76.3% reduction in attack success rate, significantly improving system robustness and deployability. Our core contributions are: (1) a systematic, RAG-specific threat model; (2) a risk-driven defensive control checklist; and (3) a lightweight, production-ready security enhancement framework.

Technology Category

Application Category

📝 Abstract
Retrieval-Augmented Generation (RAG) systems, which integrate Large Language Models (LLMs) with external knowledge sources, are vulnerable to a range of adversarial attack vectors. This paper examines the importance of RAG systems through recent industry adoption trends and identifies the prominent attack vectors for RAG: prompt injection, data poisoning, and adversarial query manipulation. We analyze these threats under risk management lens, and propose robust prioritized control list that includes risk-mitigating actions like input validation, adversarial training, and real-time monitoring.
Problem

Research questions and friction points this paper is trying to address.

Identifies adversarial threats in RAG systems
Analyzes prompt injection and data poisoning risks
Proposes mitigation strategies like adversarial training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Input validation to counter adversarial attacks
Adversarial training for enhanced robustness
Real-time monitoring for threat detection
🔎 Similar Papers
No similar papers found.