Beyond Correctness: Rewarding Faithful Reasoning in Retrieval-Augmented Generation

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing reinforcement learning (RL)-based retrieval-augmented generation (RAG) methods overly prioritize final-answer correctness while neglecting consistency among intermediate reasoning steps, retrieved evidence, and the generated answer, which leads to unfaithful chain-of-thought (CoT) reasoning. Method: VERITAS integrates three fine-grained faithfulness metrics (information-to-reasoning, reasoning-to-answer, and reasoning-to-search) into the RL reward mechanism, making the agent's search-and-reasoning trace verifiable. Contribution/Results: Evaluated on seven open-domain QA benchmarks, VERITAS significantly improves reasoning-path faithfulness while maintaining answer accuracy comparable to state-of-the-art (SOTA) methods, empirically demonstrating that faithfulness and performance are jointly attainable.
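The reward mechanism described above can be sketched as follows. This is a minimal illustration only: the function name, the weighting scheme, and the equal-weight averaging of the three faithfulness scores are assumptions for exposition, not the paper's actual reward design.

```python
def veritas_style_reward(answer_correct: bool,
                         info_to_reasoning: float,
                         reasoning_to_answer: float,
                         reasoning_to_search: float,
                         alpha: float = 0.5) -> float:
    """Blend an outcome reward with faithfulness scores.

    Each faithfulness score is assumed to lie in [0, 1], and `alpha`
    (a hypothetical hyperparameter) controls how much weight the
    faithfulness terms receive relative to answer correctness.
    """
    outcome = 1.0 if answer_correct else 0.0
    # Equal-weight average of the three fine-grained faithfulness metrics
    faithfulness = (info_to_reasoning
                    + reasoning_to_answer
                    + reasoning_to_search) / 3.0
    return (1 - alpha) * outcome + alpha * faithfulness
```

The key design point this sketch illustrates is that a correct answer alone no longer earns full reward; the reasoning trace must also be entailed by the retrieved evidence and consistent with the search actions and final answer.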

📝 Abstract
Inspired by the success of reinforcement learning (RL) in Large Language Model (LLM) training for domains like math and code, recent works have begun exploring how to train LLMs to use search engines more effectively as tools for retrieval-augmented generation. Although these methods achieve performance improvements across QA benchmarks, many prioritize final-answer correctness while overlooking the quality of intermediate reasoning steps, which can lead to chain-of-thought unfaithfulness. In this paper, we first introduce a comprehensive framework for evaluating RL-based search agents, covering three distinct faithfulness metrics: information-think faithfulness, think-answer faithfulness, and think-search faithfulness. Our evaluations reveal that a prototypical RL-based search agent, Search-R1, has significant room for improvement in this regard. To foster faithful reasoning, we introduce VERITAS (Verifying Entailed Reasoning through Intermediate Traceability in Agentic Search), a novel framework that integrates fine-grained faithfulness rewards into the reinforcement learning process. Our experiments show that models trained with VERITAS not only significantly improve reasoning faithfulness but also achieve comparable task performance across seven QA benchmarks.
Problem

Research questions and friction points this paper is trying to address.

Training LLMs to improve reasoning faithfulness in retrieval-augmented generation
Addressing chain-of-thought unfaithfulness in RL-based search agents
Integrating fine-grained faithfulness metrics into reinforcement learning training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates fine-grained faithfulness rewards into reinforcement learning
Verifies entailed reasoning through intermediate traceability
Improves reasoning faithfulness while maintaining task performance