🤖 AI Summary
Existing RAG methods often suffer from factual inconsistency due to misalignment between the generated reasoning path and the retrieved evidence. To address this, we propose AlignRAG, a test-time iterative framework built on Critique-Driven Alignment (CDA) that introduces retrieval-aware reasoning, a paradigm enabling fine-grained alignment diagnosis. AlignRAG constructs contrastive preference-trajectory corpora and fine-tunes a dedicated Critic Language Model (CLM) to detect and correct reasoning deviations dynamically during generation. Crucially, it requires no modification to the base model or training pipeline, supporting zero-intrusion, plug-and-play integration. Evaluated across diverse benchmarks, AlignRAG significantly improves factual consistency (+12.3%) and evidence support rate (+15.6%), outperforming state-of-the-art RAG approaches. This work establishes a new paradigm for trustworthy knowledge-augmented generation.
📝 Abstract
Retrieval-augmented generation (RAG) has emerged as a foundational paradigm for knowledge-grounded text generation. However, existing RAG pipelines often fail to ensure that their reasoning trajectories align with the evidential constraints imposed by retrieved content. In this paper, we reframe RAG as a problem of retrieval-aware reasoning and identify a core challenge: reasoning misalignment, the mismatch between a model's reasoning trajectory and the retrieved evidence. To address this challenge, we propose AlignRAG, a novel test-time framework that mitigates reasoning misalignment through iterative Critique-Driven Alignment (CDA) steps. In contrast to prior approaches that rely on static training or post-hoc selection, AlignRAG actively refines reasoning trajectories during inference by enforcing fine-grained alignment with evidence. Our framework introduces a new paradigm for retrieval-aware reasoning by: (1) constructing context-rich training corpora; (2) generating contrastive critiques from preference-aware reasoning trajectories; (3) training a dedicated Critic Language Model (CLM) to identify reasoning misalignments; and (4) applying CDA steps to optimize reasoning trajectories iteratively. Empirical results demonstrate that AlignRAG consistently outperforms all baselines and can be integrated as a plug-and-play module into existing RAG pipelines without further modification. By reconceptualizing RAG as a structured reasoning trajectory and establishing a test-time framework for correcting reasoning misalignments in RAG, AlignRAG provides practical advancements for retrieval-aware generation.
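The test-time loop described in step (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the callables `retriever`, `generator`, and `critic` are hypothetical stand-ins for the retriever, the base LLM, and the Critic Language Model (CLM), and their signatures are invented for this sketch.

```python
# Hypothetical sketch of the iterative Critique-Driven Alignment (CDA) loop.
# `retriever`, `generator`, and `critic` are placeholder callables, not the
# paper's real interfaces.

def align_rag(query, retriever, generator, critic, max_iters=3):
    """Iteratively refine a reasoning trajectory against retrieved evidence."""
    evidence = retriever(query)                    # fetch supporting passages
    trajectory = generator(query, evidence, None)  # initial reasoning trajectory
    for _ in range(max_iters):
        # The CLM diagnoses fine-grained misalignment between the trajectory
        # and the evidence; returning None signals the trajectory is aligned.
        critique = critic(query, evidence, trajectory)
        if critique is None:
            break
        # CDA refinement step: regenerate the trajectory conditioned on the
        # critique, enforcing alignment with the retrieved evidence.
        trajectory = generator(query, evidence, critique)
    return trajectory
```

Because the loop only wraps the generator's inputs and outputs, it leaves the base model untouched, which is what makes the plug-and-play claim plausible.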