🤖 AI Summary
This work addresses the limitations of traditional Pairwise Ranking Prompting (PRP) methods under constrained LLM call budgets, where noise, order sensitivity, and non-transitivity in pairwise preference judgments degrade top-K ranking quality. The authors reformulate PRP-based reranking as an active learning problem under noisy pairwise comparisons and propose a noise-robust active ranking framework. Its randomized-direction oracle requires only a single LLM call per comparison and transforms systematic position bias into zero-mean noise, enabling unbiased aggregation without the overhead of bidirectional querying. The authors report that active rankers improve NDCG@10 per LLM call in the call-constrained regime and can serve as drop-in replacements for the sorting step in LLM-based ranking systems.
📝 Abstract
Pairwise Ranking Prompting (PRP) elicits pairwise preference judgments from an LLM, which are then aggregated into a ranking, usually via classical sorting algorithms. However, judgments are noisy, order-sensitive, and sometimes intransitive, so sorting assumptions do not match the setting. Because sorting aims to recover a full permutation, truncating it to meet a call budget does not produce a dependable top-K. We thus reframe PRP reranking as active learning from noisy pairwise comparisons and show that active rankers are drop-in replacements that improve NDCG@10 per call in the call-constrained regime. Our noise-robust framework also introduces a randomized-direction oracle that uses a single LLM call per pair. This approach converts systematic position bias into zero-mean noise, enabling unbiased aggregate ranking without the cost of bidirectional calls.
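To make the randomized-direction idea concrete, here is a minimal simulation sketch, not the authors' implementation. It assumes a hypothetical judge (`simulated_judge`) whose preference for the first-shown item is a logistic function of a latent score gap plus an additive `position_bias`; the function names, the additive bias model, and the parameter values are all illustrative assumptions. The sketch compares an oracle that always shows item i first with one that picks the presentation order by a fair coin flip and still makes only one call.

```python
import math
import random


def simulated_judge(score_first, score_second, position_bias, rng):
    """Hypothetical stand-in for one LLM call.

    Returns True if the item shown first wins. The win probability is a
    logistic function of the score gap plus an additive position bias
    (an assumed noise model, not taken from the paper).
    """
    p_first = 1.0 / (1.0 + math.exp(-(score_first - score_second)))
    p_first = min(1.0, max(0.0, p_first + position_bias))
    return rng.random() < p_first


def fixed_direction_oracle(s_i, s_j, position_bias, rng):
    """Always shows item i first; returns +1 if i wins, -1 otherwise."""
    return 1 if simulated_judge(s_i, s_j, position_bias, rng) else -1


def randomized_direction_oracle(s_i, s_j, position_bias, rng):
    """Picks the presentation order by a fair coin flip, then makes one call.

    The result is always reported from item i's point of view.
    """
    if rng.random() < 0.5:
        # Item i is shown first.
        return 1 if simulated_judge(s_i, s_j, position_bias, rng) else -1
    # Item j is shown first, so a win for the first-shown item is a loss for i.
    return -1 if simulated_judge(s_j, s_i, position_bias, rng) else 1


def mean_outcome(oracle, s_i, s_j, position_bias, n_calls, seed):
    """Average oracle outcome over n_calls simulated single-call comparisons."""
    rng = random.Random(seed)
    total = sum(oracle(s_i, s_j, position_bias, rng) for _ in range(n_calls))
    return total / n_calls


if __name__ == "__main__":
    # Two equally relevant items: any nonzero mean is pure position bias.
    for oracle in (fixed_direction_oracle, randomized_direction_oracle):
        mean = mean_outcome(oracle, 0.0, 0.0, 0.2, 20000, seed=0)
        print(f"{oracle.__name__}: mean outcome = {mean:+.3f}")
```

Under this assumed model, and as long as the biased probability is not clipped, the coin flip cancels the additive bias in expectation: the fixed-direction oracle favors the first-shown item even when both items are equally relevant, while the randomized-direction oracle's expected outcome depends only on the score gap. Averaging such single-call outcomes is therefore unbiased, at the price of extra variance per call.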