🤖 AI Summary
Current agentic RAG systems lack a unified theoretical framework, resulting in architectural fragmentation, inconsistent evaluation protocols, and insufficiently characterized reliability risks. This work addresses these challenges by formally modeling agentic RAG as a finite-horizon partially observable Markov decision process. It introduces a modular architecture and a systematic taxonomy encompassing core components such as planning, retrieval coordination, memory paradigms, and tool invocation. By analyzing the limitations of static evaluation and identifying dynamic risks inherent in autonomous loops—particularly hallucination propagation and memory contamination—the study establishes a theoretical foundation for agentic RAG and proposes reliability-oriented evaluation criteria. Furthermore, it outlines key directions for future research, including adaptive retrieval strategies, cost-aware coordination mechanisms, and effective supervision frameworks.
📝 Abstract
Retrieval-Augmented Generation (RAG) systems are increasingly evolving into agentic architectures where large language models autonomously coordinate multi-step reasoning, dynamic memory management, and iterative retrieval strategies. Despite rapid industrial adoption, current research lacks a systematic understanding of agentic RAG as a sequential decision-making system, leading to highly fragmented architectures, inconsistent evaluation methodologies, and unresolved reliability risks. This Systematization of Knowledge (SoK) paper provides the first unified framework for understanding these autonomous systems. We formalize agentic retrieval-generation loops as finite-horizon partially observable Markov decision processes, explicitly modeling their control policies and state transitions. Building upon this formalization, we develop a comprehensive taxonomy and modular architectural decomposition that categorizes systems by their planning mechanisms, retrieval orchestration, memory paradigms, and tool-invocation behaviors. We further analyze the critical limitations of traditional static evaluation practices and identify severe systemic risks inherent to autonomous loops, including compounding hallucination propagation, memory poisoning, retrieval misalignment, and cascading tool-execution vulnerabilities. Finally, we outline key open research directions spanning stable adaptive retrieval, cost-aware orchestration, formal trajectory evaluation, and oversight mechanisms, providing a roadmap for building reliable, controllable, and scalable agentic retrieval systems.
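To make the finite-horizon POMDP framing concrete, the following is a minimal, hedged sketch of an agentic retrieval-generation loop: the agent's memory plays the role of a belief state, a policy selects among retrieval and generation actions, and each step yields a partial observation (retrieved passages) of an underlying corpus. The action names, the toy corpus, and the stopping rule are illustrative assumptions for exposition, not the paper's concrete system.

```python
# Illustrative POMDP-style agentic RAG loop (toy sketch, not a real system).

# Toy corpus standing in for the unobserved world state.
CORPUS = {
    "pomdp": "A POMDP is a partially observable Markov decision process.",
    "rag": "RAG augments generation with retrieved evidence.",
}

def policy(belief, horizon_left):
    """Control policy: choose the next action from the current belief (memory)."""
    if horizon_left == 0 or belief.get("evidence"):
        return "generate"   # terminate: answer from accumulated evidence
    return "retrieve"       # otherwise, gather more evidence

def step(belief, action, query):
    """State transition + observation: update the belief with the action's result."""
    if action == "retrieve":
        # Observation: retrieved passages give only a partial view of the corpus.
        hits = [doc for key, doc in CORPUS.items() if key in query.lower()]
        return {**belief, "evidence": hits}, None
    # Terminal generate action: emit an answer grounded in the gathered evidence.
    answer = " ".join(belief.get("evidence", [])) or "No evidence found."
    return belief, answer

def run_episode(query, horizon=3):
    """Roll out the retrieval-generation loop for a bounded horizon."""
    belief, answer = {}, None
    for t in range(horizon, -1, -1):
        action = policy(belief, t)
        belief, answer = step(belief, action, query)
        if answer is not None:
            break
    return answer
```

The bounded `for` loop is what makes the horizon finite: even a policy that never accumulates evidence is forced into the terminal generate action, which is one simple way the formalization caps runaway autonomous loops.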