🤖 AI Summary
Existing neural rerankers achieve strong retrieval effectiveness but remain resource-intensive at query time, even after heavy optimization. This paper introduces Rank-K, a listwise passage reranking model that leverages the reasoning capability of a reasoning language model at query time, providing test-time scalability so that harder queries can receive more computation. Experiments show that Rank-K outperforms RankZephyr, the state-of-the-art listwise reranker, by 23% in nDCG@10 when reranking a BM25 initial ranked list, and by 19% when reranking strong retrieval results from SPLADE-v3. Because Rank-K is inherently multilingual, it ranks passages for queries in languages other than the passage language as effectively as it does in monolingual retrieval, without language-specific adaptation.
📝 Abstract
Retrieve-and-rerank is a popular retrieval pipeline because it makes slow but effective rerankers efficient enough at query time by reducing the number of comparisons. Recent neural rerankers take advantage of large language models for their capability to reason between queries and passages, and have achieved state-of-the-art retrieval effectiveness. However, such rerankers are resource-intensive, even after heavy optimization. In this work, we introduce Rank-K, a listwise passage reranking model that leverages the reasoning capability of a reasoning language model at query time, providing test-time scalability to serve hard queries. We show that Rank-K improves retrieval effectiveness by 23% over RankZephyr, the state-of-the-art listwise reranker, when reranking a BM25 initial ranked list, and by 19% when reranking strong retrieval results from SPLADE-v3. Since Rank-K is inherently a multilingual model, we find that it ranks passages based on queries in different languages as effectively as it does in monolingual retrieval.
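The two-stage pipeline the abstract describes can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: `fast_score` is a toy term-overlap stand-in for BM25 or SPLADE-v3, and `slow_rerank` is a toy stand-in for a listwise LLM reranker such as Rank-K. The point is the structure — a cheap scorer over the whole corpus, then an expensive reranker over only the top-k shortlist.

```python
def fast_score(query: str, passage: str) -> int:
    # Cheap first-stage scorer: raw term overlap (stand-in for BM25/SPLADE).
    return len(set(query.lower().split()) & set(passage.lower().split()))

def slow_rerank(query: str, passages: list[str]) -> list[str]:
    # Expensive second-stage listwise reranker (stand-in for an LLM reranker).
    # Toy proxy: overlap normalized by passage length.
    return sorted(
        passages,
        key=lambda p: fast_score(query, p) / (1 + len(p.split())),
        reverse=True,
    )

def retrieve_and_rerank(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Stage 1: score every passage cheaply, keep only the top-k candidates.
    shortlist = sorted(corpus, key=lambda p: fast_score(query, p), reverse=True)[:k]
    # Stage 2: run the slow but effective reranker on the shortlist only,
    # so its cost is bounded by k rather than by the corpus size.
    return slow_rerank(query, shortlist)

corpus = [
    "the cat sat on the mat",
    "dogs chase cats",
    "reranking with language models",
    "neural reranking of passages",
]
ranked = retrieve_and_rerank("neural reranking", corpus, k=3)
```

Because the reranker only ever sees `k` passages, its per-query cost stays fixed even as the corpus grows — which is why a slow listwise model remains practical in this pipeline.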