Grounded Decoding: Retrieval-Anchored Probability Fusion for Faithful RAG

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a failure mode of retrieval-augmented generation (RAG): large language models can produce factually inconsistent outputs when their internal (parametric) knowledge conflicts with the retrieved evidence. The authors propose a training-free decoding framework that builds two next-token distributions at each generation step, one from the full RAG prompt (query, retrieved documents, and generated prefix) and one conditioned only on the retrieved evidence and the same prefix, and fuses them into a normalized geometric mean obtained as the solution of a KL-barycenter objective. A conflict-aware adaptive weighting scheme sets the grounding strength from the disagreement between the two distributions and the retriever's confidence. On the ALCE, Natural Questions, and FActScore benchmarks, the method consistently improves factual accuracy and citation quality over standard RAG and competitive decoding-time baselines while maintaining fluency.
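The fusion rule can be written out as a short worked equation. The notation here is an assumption, not taken from the paper: $p_{\text{full}}$ is the full RAG distribution, $p_{\text{ret}}$ the retrieval-only distribution, $\lambda \in [0,1]$ the grounding weight, and $V$ the vocabulary. A weighted KL barycenter with the fused distribution in the first argument of each KL term has a normalized geometric mean as its unique minimizer, which matches the description above; the exact parameterization used by the authors may differ.

```latex
\begin{aligned}
q^\star &= \arg\min_{q \in \Delta^{|V|-1}} \;
  (1-\lambda)\,\mathrm{KL}\!\left(q \,\|\, p_{\text{full}}\right)
  + \lambda\,\mathrm{KL}\!\left(q \,\|\, p_{\text{ret}}\right), \\[4pt]
q^\star(v) &= \frac{p_{\text{full}}(v)^{1-\lambda}\; p_{\text{ret}}(v)^{\lambda}}
  {\sum_{u \in V} p_{\text{full}}(u)^{1-\lambda}\; p_{\text{ret}}(u)^{\lambda}} .
\end{aligned}
```

Setting $\lambda = 0$ gives $q^\star = p_{\text{full}}$ (standard RAG), and $\lambda = 1$ gives $q^\star = p_{\text{ret}}$.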

📝 Abstract

As retrieval-augmented generation (RAG) systems scale, it becomes increasingly challenging to ensure faithful grounding in external evidence. Large language models may still prioritize parametric knowledge over retrieved information when conflicts arise. We propose a novel training-free decoding framework, Grounded Decoding, designed to improve factual consistency in RAG without modifying model parameters. Unlike standard approaches that rely on a single conditional distribution, our method constructs two matched-prompt distributions at every generation step: (1) a full RAG distribution conditioned on the query, retrieved documents, and generated prefix, and (2) a retrieval-only distribution conditioned solely on retrieved evidence and the same prefix. The final next-token distribution is derived as the unique solution to a KL-barycenter objective over the probability simplex, yielding a normalized geometric fusion of the two distributions. This formulation naturally recovers standard RAG when the grounding weight is zero and smoothly shifts probability mass toward retrieved evidence as grounding strength increases. We further introduce a conflict-aware adaptive weighting scheme that dynamically adjusts grounding based on distributional disagreement and retriever confidence. Experiments on ALCE, Natural Questions, and FActScore demonstrate consistent improvements in factual accuracy and citation quality over standard RAG and competitive decoding-time baselines, while maintaining fluency. Our results indicate that probability-level fusion provides a strong and efficient alternative to logit-level intervention methods for faithful RAG decoding.
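A minimal sketch of one decoding step may make the fusion concrete. It assumes the two distributions are available as finite log-probability vectors over the same vocabulary. The function names (`fuse`, `adaptive_weight`) and the parameter `lam_max` are hypothetical, and the adaptive rule shown (retriever confidence times normalized Jensen-Shannon divergence) is only an illustrative stand-in: the abstract does not give the authors' weighting formula.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of finite logits."""
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))


def fuse(logp_full, logp_ret, lam):
    """Normalized geometric fusion in log space.

    Returns log q with q(v) proportional to p_full(v)^(1-lam) * p_ret(v)^lam.
    lam = 0 recovers the full RAG distribution; lam = 1 the retrieval-only one.
    """
    mixed = (1.0 - lam) * logp_full + lam * logp_ret
    return log_softmax(mixed)  # renormalize onto the probability simplex


def js_divergence(logp, logq):
    """Jensen-Shannon divergence in nats (assumes strictly positive probabilities)."""
    p, q = np.exp(logp), np.exp(logq)
    log_m = np.log(0.5 * (p + q))
    return 0.5 * np.sum(p * (logp - log_m)) + 0.5 * np.sum(q * (logq - log_m))


def adaptive_weight(logp_full, logp_ret, retriever_conf, lam_max=1.0):
    """HYPOTHETICAL conflict-aware weight, not the paper's formula.

    Grows with disagreement between the two distributions (JS divergence
    scaled to [0, 1]) and with the retriever confidence (assumed in [0, 1]).
    """
    disagreement = js_divergence(logp_full, logp_ret) / np.log(2.0)
    return lam_max * retriever_conf * disagreement


# Toy three-token vocabulary where the two distributions disagree.
logp_full = log_softmax(np.array([2.0, 1.0, 0.0]))
logp_ret = log_softmax(np.array([0.0, 1.0, 2.0]))

lam = adaptive_weight(logp_full, logp_ret, retriever_conf=0.9)
next_token_probs = np.exp(fuse(logp_full, logp_ret, lam))
```

Working in log space keeps the geometric mean a simple weighted sum followed by one renormalization, so the extra cost over standard RAG decoding is dominated by the second forward pass for the retrieval-only prompt.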

Problem

Research questions and friction points this paper is trying to address.

retrieval-augmented generation

faithful grounding

factual consistency

evidence conflict

RAG

Innovation

Methods, ideas, or system contributions that make the work stand out.

Grounded Decoding

Retrieval-Augmented Generation

Probability Fusion