🤖 AI Summary
To address insufficient evidence coverage and low answer accuracy in Retrieval-Augmented Generation (RAG) for government document question answering within legal and regulatory domains, this paper proposes two synergistic optimization strategies. First, a One-SHOT retrieval method with adaptive token budgeting improves recall of critical information chunks. Second, an iterative retrieval framework built upon a Reasoning Agentic RAG architecture integrates dynamic query generation, progressive context refinement, and result evaluation with feedback—effectively mitigating query drift and retrieval inertia. Experimental results demonstrate substantial improvements: +28.6% in evidence coverage and +14.3 BLEU points in answer accuracy. These advances establish a novel paradigm for high-precision, interpretable legal intelligent question answering.
📝 Abstract
Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs) is a powerful solution to understand and query the industry's closed-source documents. However, basic RAG often struggles with complex QA tasks in legal and regulatory domains, particularly when dealing with numerous government documents. The top-$k$ strategy frequently misses golden chunks, leading to incomplete or inaccurate answers. To address these retrieval bottlenecks, we explore two strategies to improve evidence coverage and answer quality. The first is a One-SHOT retrieval method that adaptively selects chunks based on a token budget, allowing as much relevant content as possible to be included within the model's context window. Additionally, we design modules to further filter and refine the chunks. The second is an iterative retrieval strategy built on a Reasoning Agentic RAG framework, where a reasoning LLM dynamically issues search queries, evaluates retrieved results, and progressively refines the context over multiple turns. We identify query drift and retrieval laziness issues and further design two modules to tackle them. Through extensive experiments on a dataset of government documents, we aim to offer practical insights and guidance for real-world applications in legal and regulatory domains.