Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Generative retrieval (GR) has traditionally relied solely on the generative capacity of large language models (LLMs), overlooking their latent reasoning capabilities. Method: We propose Reason-for-Retrieval (R4R), the first GR framework to explicitly integrate structured, iterative reasoning chains into the retrieval process: an instruction-tuned LLM dynamically constructs and refines a query-aligned reasoning path before generating document identifiers, using constrained decoding and alternating reasoning-retrieval optimization, without auxiliary models or additional training. Contribution/Results: Evaluated on Natural Questions, MS MARCO, and a real-world e-commerce search benchmark, R4R consistently outperforms state-of-the-art GR methods, demonstrating that controllable, interpretable reasoning significantly improves retrieval relevance across diverse domains.

📝 Abstract
Generative retrieval (GR) is an emerging paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers (docids) relevant to a given query. Prior works have focused on leveraging the generative capabilities of LLMs to improve GR, while overlooking that their reasoning capabilities could likewise help. This raises a key question: Can explicit reasoning benefit GR? To investigate, we first conduct a preliminary study where an LLM is prompted to generate free-form chain-of-thought (CoT) reasoning before performing constrained docid decoding. Although this method outperforms standard GR, the generated reasoning tends to be verbose and poorly aligned with the docid space. These limitations motivate the development of a reasoning mechanism better tailored to GR. Therefore, we propose Reason-for-Retrieval (R4R), a reasoning-augmented framework for GR that converts free-form CoT reasoning into a compact, structured format, and iteratively refines the reasoning during the retrieval process. R4R augments an existing GR method by leveraging a reasoning-capable LLM that has been instruction-tuned for GR. At inference time, R4R first uses the LLM to generate an initial structured reasoning; then the same LLM alternates between (i) constrained decoding with the chosen GR method to produce candidate docids and (ii) updating the reasoning based on retrieval results to improve the next round. R4R does not require additional models or training, and instead a single LLM serves as both the reasoning generator and the retriever. Extensive experiments on Natural Questions, MS MARCO, and a real-world item-search benchmark validate the effectiveness of R4R.
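The inference procedure described in the abstract can be sketched as a simple loop: generate an initial structured reasoning, then alternate constrained docid decoding with reasoning refinement, using one LLM throughout. The sketch below is a minimal illustration of that control flow only; the function names (`generate_reasoning`, `constrained_decode`, `refine_reasoning`) and the prompt strings are hypothetical stand-ins, not the paper's actual API, and the real system would use GR-specific constrained decoding over the docid space.

```python
# Hypothetical sketch of R4R's alternating inference loop, following the
# abstract. `llm` is any callable prompt -> output; all helper names and
# prompts here are illustrative assumptions, not the paper's interface.

def generate_reasoning(llm, query):
    # Step 1: the instruction-tuned LLM produces an initial compact,
    # structured reasoning for the query.
    return llm(f"reason about: {query}")

def constrained_decode(llm, query, reasoning):
    # Step 2 (i): constrained decoding with the chosen GR method,
    # restricting generation to valid docids (stubbed as a ranked list).
    return llm(f"retrieve: {query} | {reasoning}")

def refine_reasoning(llm, query, reasoning, docids):
    # Step 2 (ii): update the reasoning based on the retrieval results
    # to improve the next round.
    return llm(f"refine: {reasoning} given {docids} for {query}")

def r4r_retrieve(llm, query, rounds=3):
    """A single LLM serves as both reasoner and retriever; no extra
    models or training, matching the abstract's description."""
    reasoning = generate_reasoning(llm, query)
    docids = []
    for _ in range(rounds):
        docids = constrained_decode(llm, query, reasoning)
        reasoning = refine_reasoning(llm, query, reasoning, docids)
    return docids
```

Note the design point the abstract emphasizes: the same `llm` object appears in every step, so no auxiliary reranker or reasoning model is introduced.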
Problem

Research questions and friction points this paper is trying to address.

Enhancing generative retrieval using large language models' reasoning capabilities
Converting free-form reasoning into structured format for retrieval alignment
Iteratively refining reasoning during retrieval without additional training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured reasoning replaces free-form chain-of-thought
Iteratively refines reasoning during retrieval process
Single LLM serves as both reasoner and retriever