🤖 AI Summary
Document reranking in information retrieval suffers from poor interpretability, heavy reliance on proprietary large language models (LLMs), and limited reproducibility. To address these issues, we propose Reason-to-Rank (R2R), a framework built on two kinds of LLM-generated reasoning, *direct relevance reasoning* and *pairwise comparative reasoning*, combined with structured prompt engineering and multi-objective supervised knowledge distillation. R2R transfers these reasoning capabilities from teacher LLMs to lightweight student models (e.g., BERT, DeBERTa). The result is an open-source, fully reproducible reranking distillation approach that targets both effectiveness and transparency. Evaluated on the MSMARCO and BRIGHT benchmarks, R2R achieves competitive performance while generating human-readable justifications for its ranking decisions, which makes the model easier to trust and to debug.
📝 Abstract
Reranking documents based on their relevance to a given query is a critical task in information retrieval. Traditional reranking methods often lack transparency and rely on proprietary models, hindering reproducibility and interpretability. We propose Reason-to-Rank (R2R), a novel open-source reranking approach that enhances transparency by generating two types of reasoning: direct relevance reasoning, which explains how a document addresses the query, and comparison reasoning, which justifies the relevance of one document over another. We leverage large language models (LLMs) as teacher models to generate these explanations and distill this knowledge into smaller, openly available student models. Our student models are trained to generate meaningful reasoning and rerank documents, achieving competitive performance across multiple datasets, including MSMARCO and BRIGHT. Experiments demonstrate that R2R not only improves reranking accuracy but also provides valuable insights into the decision-making process. By offering a structured and interpretable solution with openly accessible resources, R2R aims to bridge the gap between effectiveness and transparency in information retrieval, fostering reproducibility and further research in the field.