🤖 AI Summary
Listwise reranking in information retrieval suffers from limited performance, poor interpretability, and heavy reliance on large-scale annotated data. Method: This paper proposes REARANK-7B, a listwise reasoning reranking agent built on Qwen2.5-7B that combines reinforcement learning with lightweight data augmentation to generalize from only 179 annotated samples. Its core idea is a "reason-then-rerank" paradigm: the model explicitly reasons over list-level relevance relationships and its decision logic before emitting a ranking, improving both accuracy and interpretability. Results: REARANK-7B substantially outperforms baseline models across popular IR benchmarks. It matches GPT-4 on in-domain and out-of-domain tasks, and even surpasses GPT-4 on the reasoning-intensive BRIGHT benchmark, demonstrating efficiency and robustness in reasoning-heavy retrieval scenarios.
📝 Abstract
We present REARANK, a large language model (LLM)-based listwise reasoning reranking agent. REARANK explicitly reasons before reranking, significantly improving both performance and interpretability. Leveraging reinforcement learning and data augmentation, REARANK achieves substantial improvements over baseline models across popular information retrieval benchmarks, notably requiring only 179 annotated samples. Built on top of Qwen2.5-7B, our REARANK-7B demonstrates performance comparable to GPT-4 on both in-domain and out-of-domain benchmarks and even surpasses GPT-4 on the reasoning-intensive BRIGHT benchmark. These results underscore the effectiveness of our approach and highlight how reinforcement learning can enhance LLM reasoning capabilities in reranking.
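To make the listwise "reason-then-rerank" setup concrete, here is a minimal sketch of how such an agent is typically driven: the LLM is prompted with a query and a numbered window of candidate passages, asked to reason first and then output a permutation like `[2] > [1] > [3]`, and the permutation is parsed out of the free-text answer. The function names (`parse_ranking`, `rerank`), the prompt wording, the sliding-window sizes, and the `generate` callable standing in for the LLM are all illustrative assumptions, not the paper's actual implementation.

```python
import re


def parse_ranking(output: str, num_docs: int) -> list[int]:
    """Extract a permutation like '[2] > [1] > [3]' from the model's
    answer, skipping any free-text reasoning that precedes it.

    Returns 0-based indices; candidates the model omits keep their
    original relative order at the end."""
    seen: list[int] = []
    for m in re.finditer(r"\[(\d+)\]", output):
        idx = int(m.group(1)) - 1
        if 0 <= idx < num_docs and idx not in seen:
            seen.append(idx)
    seen += [i for i in range(num_docs) if i not in seen]
    return seen


def rerank(query: str, docs: list[str], generate, window: int = 20,
           stride: int = 10) -> list[str]:
    """Sliding-window listwise rerank: score `window` docs at a time,
    moving from the tail of the list toward the head so that strong
    candidates can bubble up past the window boundary.

    `generate` is any callable mapping a prompt string to the LLM's
    text answer (hypothetical stand-in for a real model call)."""
    docs = list(docs)
    start = max(len(docs) - window, 0)
    while True:
        chunk = docs[start:start + window]
        prompt = (
            f"Query: {query}\n"
            + "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(chunk))
            + "\nReason step by step about relevance, then output a "
              "ranking of the passages like [2] > [1] > [3]."
        )
        order = parse_ranking(generate(prompt), len(chunk))
        docs[start:start + window] = [chunk[i] for i in order]
        if start == 0:
            break
        start = max(start - stride, 0)
    return docs
```

Separating the parse step from the generation step is what makes RL training tractable here: the emitted permutation can be scored against relevance labels (e.g. via NDCG) to form a reward, while the reasoning text before it remains free-form.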