PruneRAG: Confidence-Guided Query Decomposition Trees for Efficient Retrieval-Augmented Generation

📅 2026-01-16
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the challenges of evidence forgetting and inefficiency in retrieval-augmented generation (RAG) systems during multi-hop reasoning, which stem from unordered query expansion. To mitigate these issues, we propose a confidence-guided query decomposition tree approach that integrates adaptive node expansion, confidence-driven pruning, and fine-grained retrieval anchored at the entity level. This method preserves critical evidence while substantially reducing retrieval overhead. We further introduce evidence forgetting rate as a novel evaluation metric to better assess reasoning fidelity. Experimental results demonstrate that our approach consistently outperforms state-of-the-art methods across multiple multi-hop question answering benchmarks, achieving gains in both reasoning accuracy and computational efficiency.
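The decomposition-tree loop described above can be sketched as follows. The paper does not publish its implementation, so everything here is an illustrative assumption: the `Node` class, the confidence threshold, the depth/width caps, and the stubbed `decompose`, `retrieve`, and `answer_with_confidence` functions (which an LLM and a retriever would back in a real system).

```python
from dataclasses import dataclass, field

CONF_THRESHOLD = 0.7  # assumed pruning threshold, not from the paper
MAX_DEPTH = 3         # assumed cap on tree depth
MAX_CHILDREN = 2      # assumed cap on tree width

@dataclass
class Node:
    question: str
    depth: int = 0
    answer: str = ""
    confidence: float = 0.0
    children: list = field(default_factory=list)

def answer_with_confidence(question, evidence):
    """Stub: an LLM would return (answer, confidence) given retrieved evidence."""
    return f"answer({question})", 0.9 if "capital" in question else 0.3

def retrieve(question):
    """Stub: fine-grained retrieval anchored on entities in the question."""
    return [f"passage about {question}"]

def decompose(question, k):
    """Stub: an LLM splits the question into at most k sub-questions."""
    return [f"{question} / sub-{i}" for i in range(k)]

def solve(node: Node) -> Node:
    evidence = retrieve(node.question)
    node.answer, node.confidence = answer_with_confidence(node.question, evidence)
    # Confidence-guided decision: accept a reliable answer, or stop at max depth.
    if node.confidence >= CONF_THRESHOLD or node.depth >= MAX_DEPTH:
        return node
    # Adaptive expansion: only uncertain nodes spawn sub-questions.
    for sub_q in decompose(node.question, MAX_CHILDREN):
        child = solve(Node(sub_q, depth=node.depth + 1))
        if child.confidence >= CONF_THRESHOLD:
            node.children.append(child)  # low-confidence branches are pruned
    # Re-answer the parent from surviving child evidence (stub: concatenate).
    if node.children:
        node.answer = " + ".join(c.answer for c in node.children)
        node.confidence = max(c.confidence for c in node.children)
    return node

root = solve(Node("What is the capital of the country where X was born?"))
print(root.answer, root.confidence)
```

The key control-flow idea is that confidence gates both termination (a high-confidence node is accepted without expansion) and pruning (low-confidence children never enter the tree), which is how retrieval overhead stays bounded.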

📝 Abstract
Retrieval-augmented generation (RAG) has become a powerful framework for enhancing large language models in knowledge-intensive and reasoning tasks. However, as reasoning chains deepen or search trees expand, RAG systems often face two persistent failures: evidence forgetting, where retrieved knowledge is not effectively used, and inefficiency, caused by uncontrolled query expansions and redundant retrieval. These issues reveal a critical gap between retrieval and evidence utilization in current RAG architectures. We propose PruneRAG, a confidence-guided query decomposition framework that builds a structured query decomposition tree to perform stable and efficient reasoning. PruneRAG introduces three key mechanisms: adaptive node expansion that regulates tree width and depth, confidence-guided decisions that accept reliable answers and prune uncertain branches, and fine-grained retrieval that extracts entity-level anchors to improve retrieval precision. Together, these components preserve salient evidence throughout multi-hop reasoning while significantly reducing retrieval overhead. To better analyze evidence misuse, we define the Evidence Forgetting Rate as a metric to quantify cases where golden evidence is retrieved but not correctly used. Extensive experiments across various multi-hop QA benchmarks show that PruneRAG achieves superior accuracy and efficiency over state-of-the-art baselines.
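The abstract defines the Evidence Forgetting Rate as quantifying cases where golden evidence is retrieved but not correctly used. One plausible reading, sketched below, normalizes over questions whose golden evidence was retrieved; the exact normalization and the per-question record schema are assumptions, not taken from the paper.

```python
def evidence_forgetting_rate(records):
    """Fraction of questions that were answered wrong despite the golden
    evidence being retrieved. Each record is a dict with boolean fields
    `gold_retrieved` and `answer_correct` (illustrative schema)."""
    retrieved = [r for r in records if r["gold_retrieved"]]
    if not retrieved:
        return 0.0
    forgotten = sum(1 for r in retrieved if not r["answer_correct"])
    return forgotten / len(retrieved)

runs = [
    {"gold_retrieved": True,  "answer_correct": True},   # evidence used
    {"gold_retrieved": True,  "answer_correct": False},  # forgotten
    {"gold_retrieved": False, "answer_correct": False},  # retrieval failure: excluded
    {"gold_retrieved": True,  "answer_correct": False},  # forgotten
]
print(evidence_forgetting_rate(runs))  # → 2/3 ≈ 0.667
```

Excluding retrieval failures from the denominator is what separates this metric from plain accuracy: it isolates evidence *utilization* errors from retrieval errors.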
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
evidence forgetting
inefficiency
query expansion
redundant retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

PruneRAG
confidence-guided pruning
query decomposition tree
evidence forgetting
fine-grained retrieval
Shuguang Jiao
Harbin Institute of Technology, Shenzhen
Xinyu Xiao
Harbin Institute of Technology, Shenzhen
Yunfan Wei
South China University of Technology
Shuhan Qi
Harbin Institute of Technology, Shenzhen and Leanplans
Chengkai Huang
UNSW Sydney
Recommender Systems, LLM, LLM Agent
Quan Z. Sheng
Macquarie University
Lina Yao
Science Lead at CSIRO Data61 & Professor at University of New South Wales, Australia
Machine Learning, Reinforcement Learning, Recommender Systems, LLM Agent, Brain Computer Interface