🤖 AI Summary
Existing neural rerankers achieve strong retrieval effectiveness but remain resource-intensive at query time, even after heavy optimization. This paper introduces Rank-K, a listwise passage reranking model that leverages the reasoning capability of a reasoning language model at query time, providing test-time scalability so that harder queries can receive more computation. Experiments show that Rank-K outperforms RankZephyr, the state-of-the-art listwise reranker, by 23% in nDCG@10 when reranking a BM25 initial ranked list, and by 19% when reranking strong retrieval results from SPLADE-v3. Because Rank-K is inherently multilingual, it ranks passages for queries in languages other than the passage language as effectively as it does in monolingual retrieval, without language-specific adaptation.
📝 Abstract
Retrieve-and-rerank is a popular retrieval pipeline because it makes slow but effective rerankers efficient enough at query time by reducing the number of comparisons. Recent neural rerankers take advantage of large language models for their capability to reason between queries and passages, and have achieved state-of-the-art retrieval effectiveness. However, such rerankers are resource-intensive, even after heavy optimization. In this work, we introduce Rank-K, a listwise passage reranking model that leverages the reasoning capability of a reasoning language model at query time, providing test-time scalability to serve hard queries. We show that Rank-K improves retrieval effectiveness by 23% over RankZephyr, the state-of-the-art listwise reranker, when reranking a BM25 initial ranked list, and by 19% when reranking strong retrieval results from SPLADE-v3. Since Rank-K is inherently a multilingual model, we find that it ranks passages based on queries in different languages as effectively as it does in monolingual retrieval.
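The two-stage pipeline the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `fast_score` is a toy term-overlap stand-in for BM25 or SPLADE-v3, and `slow_rerank` is a toy stand-in for a listwise LLM reranker such as Rank-K. The point is the structure — a cheap scorer over the whole corpus, then an expensive reranker over only the top-k shortlist.

```python
def fast_score(query: str, passage: str) -> int:
    # Cheap first-stage scorer: raw term overlap (stand-in for BM25/SPLADE).
    return len(set(query.lower().split()) & set(passage.lower().split()))

def slow_rerank(query: str, passages: list[str]) -> list[str]:
    # Expensive second-stage listwise reranker (stand-in for an LLM reranker).
    # Toy proxy: overlap normalized by passage length.
    return sorted(
        passages,
        key=lambda p: fast_score(query, p) / (1 + len(p.split())),
        reverse=True,
    )

def retrieve_and_rerank(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Stage 1: score every passage cheaply, keep only the top-k candidates.
    shortlist = sorted(corpus, key=lambda p: fast_score(query, p), reverse=True)[:k]
    # Stage 2: run the slow but effective reranker on the shortlist only,
    # so its cost is bounded by k rather than by the corpus size.
    return slow_rerank(query, shortlist)

corpus = [
    "the cat sat on the mat",
    "dogs chase cats",
    "reranking with language models",
    "neural reranking of passages",
]
ranked = retrieve_and_rerank("neural reranking", corpus, k=3)
```

Because the reranker only ever sees `k` passages, its per-query cost stays fixed even as the corpus grows — which is why a slow listwise model remains practical in this pipeline.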