Adversarial threat vectors and risk mitigation for retrieval-augmented generation systems

📅 2025-05-28

🏛️ Assurance and Security for AI-enabled Systems 2025

📈 Citations: 0

✨ Influential: 0

career value

163K/year

🤖 AI Summary

RAG systems face three critical security threats: prompt injection, data poisoning, and adversarial query manipulation. To address these, we propose the first structured threat taxonomy specifically designed for RAG, coupled with a risk-quantification–driven defense prioritization mechanism—advancing RAG security from empirical practice to measurable assurance. Methodologically, we integrate input validation, adversarial training, real-time monitoring, and knowledge provenance into an end-to-end RAG security enhancement architecture. Evaluations on mainstream RAG benchmarks demonstrate an average 76.3% reduction in attack success rate, significantly improving system robustness and deployability. Our core contributions are: (1) a systematic, RAG-specific threat model; (2) a risk-driven defensive control checklist; and (3) a lightweight, production-ready security enhancement framework.

Technology Category

Application Category

📝 Abstract

Retrieval-Augmented Generation (RAG) systems, which integrate Large Language Models (LLMs) with external knowledge sources, are vulnerable to a range of adversarial attack vectors. This paper examines the importance of RAG systems through recent industry adoption trends and identifies the prominent attack vectors for RAG: prompt injection, data poisoning, and adversarial query manipulation. We analyze these threats under risk management lens, and propose robust prioritized control list that includes risk-mitigating actions like input validation, adversarial training, and real-time monitoring.

Problem

Research questions and friction points this paper is trying to address.

Identifies adversarial threats in RAG systems

Analyzes prompt injection and data poisoning risks

Proposes mitigation strategies like adversarial training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Input validation to counter adversarial attacks

Adversarial training for enhanced robustness

Real-time monitoring for threat detection

🔎 Similar Papers

A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication