🤖 AI Summary
In large language model (LLM)-driven pairwise ranking prompting (PRP), conventional evaluation paradigms based on comparison counts break down when LLM inference cost dominates overall computational expense. Method: We propose an evaluation framework centered on LLM inference overhead, redefining algorithmic complexity for PRP and showing that classical O(n log n) sorting algorithms can underperform O(n²) alternatives under high inference costs. Our approach combines batched processing and response caching with dynamic batch scheduling and fine-grained modeling of LLM inference. Contribution/Results: Experiments demonstrate up to a 47% reduction in LLM invocations and substantial throughput improvements in PRP systems. This work establishes both theoretical foundations and practical engineering pathways for LLM-native ranking systems.
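The response-caching idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: `llm_compare` is a hypothetical stand-in that simulates a pairwise ranking prompt with a numeric comparison, and `functools.lru_cache` plays the role of the response cache, so re-ranking the same items issues no additional inferences.

```python
import functools

llm_calls = 0  # number of simulated LLM inferences issued

def llm_compare(a, b):
    # Hypothetical stand-in for a pairwise ranking prompt: a real PRP
    # system would ask an LLM which of the two items ranks higher.
    global llm_calls
    llm_calls += 1
    return a < b

@functools.lru_cache(maxsize=None)
def cached_compare(a, b):
    # Response caching: each distinct ordered pair hits the "LLM" at most once.
    return llm_compare(a, b)

def prp_bubble_sort(items):
    # An O(n^2)-comparison sort driven by the cached comparator.
    items = list(items)
    for i in range(len(items)):
        for j in range(len(items) - 1 - i):
            if not cached_compare(items[j], items[j + 1]):
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

ranked = prp_bubble_sort([4, 1, 3, 5, 2])
calls_first = llm_calls
prp_bubble_sort([4, 1, 3, 5, 2])  # re-ranking: same comparisons, all cache hits
assert llm_calls == calls_first   # no new inferences on the second pass
```

The cache key preserves pair order, which matters because PRP prompts can be order-sensitive: swapping the two items in the prompt may change the LLM's answer, so (a, b) and (b, a) are cached separately.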
📝 Abstract
We introduce a framework for analyzing sorting algorithms in pairwise ranking prompting (PRP) that re-centers the cost model on LLM inferences rather than raw pairwise comparisons. Classical comparison-count metrics have traditionally been used to gauge efficiency, but our analysis shows that expensive LLM inferences overturn their predictions; accordingly, the framework motivates strategies such as batching and caching to mitigate inference cost. We show that algorithms optimal in the classical setting can become suboptimal once LLM inferences dominate the cost and such optimizations are applied.
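The claim that a classically suboptimal algorithm can win under batching has a simple intuition, sketched below with odd-even transposition sort: although it performs O(n²) comparisons in total, every comparison within a round is independent, so an entire round can be issued as a single batched inference, whereas mergesort's comparisons are sequentially dependent. The snippet is illustrative only; `llm_compare_batch` is a hypothetical batched PRP call simulated by numeric comparison.

```python
batch_calls = 0  # number of simulated batched LLM inference rounds

def llm_compare_batch(pairs):
    # Hypothetical batched PRP call: one inference round judges many
    # independent pairs at once (simulated by numeric comparison).
    global batch_calls
    batch_calls += 1
    return [a < b for a, b in pairs]

def odd_even_sort(items):
    # Odd-even transposition sort: O(n^2) comparisons overall, but the
    # comparisons within each of its n rounds are independent, so each
    # round costs exactly ONE batched inference.
    items = list(items)
    n = len(items)
    for rnd in range(n):
        idx = list(range(rnd % 2, n - 1, 2))  # disjoint adjacent pairs
        results = llm_compare_batch([(items[i], items[i + 1]) for i in idx])
        for i, in_order in zip(idx, results):
            if not in_order:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items

ranked = odd_even_sort([4, 1, 6, 3, 5, 2])
# n = 6 items -> 6 batched inference rounds, versus O(n log n)
# strictly sequential comparator calls for mergesort.
```

When per-call latency dominates, n batched rounds can finish well before O(n log n) dependent round-trips, which is the intuition behind the summary's O(n²)-beats-O(n log n) result.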