🤖 AI Summary
To address the vulnerability of large language models (LLMs) to retrieval noise and their inefficient use of retrieved knowledge in retrieval-augmented generation (RAG), this paper proposes RankCoT, a ranking-enhanced chain-of-thought method. RankCoT explicitly incorporates retrieval re-ranking signals into the CoT generation process, enabling end-to-end knowledge filtering, and further introduces a CoT-level self-reflection mechanism to dynamically refine reasoning paths. Its core innovations are twofold: (1) integrating re-ranking supervision directly into CoT generation rather than relying on post-hoc correction, and (2) fine-grained, learnable refinement of reasoning chains. Evaluated across multiple RAG benchmarks, RankCoT achieves significant improvements in answer accuracy while producing shorter, more precise, higher-quality reasoning outputs, consistently outperforming existing knowledge refinement approaches.
📝 Abstract
Retrieval-Augmented Generation (RAG) enhances the performance of Large Language Models (LLMs) by incorporating external knowledge. However, LLMs still struggle to effectively utilize knowledge from retrieved documents and are often misled by irrelevant or noisy information. To address this issue, we introduce RankCoT, a knowledge refinement method that incorporates reranking signals when generating a CoT-based summarization for knowledge refinement from the given query and all retrieved documents. During training, RankCoT prompts the LLM to generate Chain-of-Thought (CoT) candidates based on the query and individual documents. It then fine-tunes the LLM to directly reproduce the best CoT from these candidates given all retrieved documents, which requires the LLM to filter out irrelevant documents while generating the CoT-style summarization. Additionally, RankCoT incorporates a self-reflection mechanism that further refines the CoT outputs, resulting in higher-quality training data. Our experiments demonstrate the effectiveness of RankCoT, showing its superior performance over other knowledge refinement models. Further analysis reveals that RankCoT produces shorter but more effective refinement results, enabling the generator to give more accurate answers. All code and data are available at https://github.com/NEUIR/RankCoT.
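The training-data construction described above (generate a CoT candidate per retrieved document, then use the best candidate as the fine-tuning target paired with the full document set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `generate_cot` and `answer_is_correct` are hypothetical stubs standing in for real LLM calls and answer checking, and candidate selection is simplified to picking a correct candidate by score.

```python
import random

def generate_cot(query, document, seed):
    """Stub standing in for an LLM call: produce one CoT candidate
    from the query and a SINGLE retrieved document."""
    random.seed(seed)
    relevant = query.lower() in document.lower()
    # Toy quality score; relevant documents tend to yield better CoTs.
    quality = random.random() + (1.0 if relevant else 0.0)
    return {"cot": f"Reasoning over: {document[:30]}...", "quality": quality}

def answer_is_correct(cot, gold_answer):
    """Stub reward: in the paper a generator answers from the CoT and is
    checked against the gold answer; here we just threshold the score."""
    return cot["quality"] > 1.0

def build_training_pair(query, documents, gold_answer):
    """Construct one (input, target) fine-tuning pair:
    input  = query plus ALL retrieved documents,
    target = the best per-document CoT candidate, so the model must learn
    to filter irrelevant documents while producing the CoT summarization."""
    candidates = [generate_cot(query, d, seed=i) for i, d in enumerate(documents)]
    winners = [c for c in candidates if answer_is_correct(c, gold_answer)]
    if not winners:
        return None  # no usable candidate; the example is dropped
    best = max(winners, key=lambda c: c["quality"])
    model_input = query + "\n" + "\n".join(documents)
    return model_input, best["cot"]

docs = ["Paris is the capital of France.", "Bananas are yellow."]
pair = build_training_pair("capital of France", docs, "Paris")
```

The self-reflection step would add a further pass that asks the model to critique and rewrite `best["cot"]` before it is used as a training target; that refinement loop is omitted here for brevity.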