SCOPE: Cost-Efficient Model Selection for Compound AI Systems under Quality Constraints

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of choosing an LLM for each module of a compound AI system by proposing SCOPE, a search algorithm that minimizes average per-query cost subject to a quality constraint. Instead of relying on costly dataset-level evaluations, the method uses per-query results to estimate system quality and cost quickly, and it builds confidence bounds to steer the search toward promising LLM combinations. The algorithm comes with theoretical guarantees that the returned configuration meets the user-specified quality threshold and achieves near-optimal average cost. In experiments on three data processing tasks against seven baselines, under the same search budget and quality constraint, SCOPE finds solutions up to 20× cheaper than the best competitor during the search and returns a final solution that is up to 6× cheaper.

📝 Abstract

A compound AI system consists of multiple LLM modules, together handling complex and multi-step tasks that exceed the capabilities of a single model. Existing systems often use a single expensive LLM across all modules to improve the result quality of the whole system. However, this configuration incurs prohibitive costs, particularly for data management and analytics tasks at scale, such as data manipulation. To this end, we formalize the problem of constrained LLM selection for compound AI systems, leveraging the diverse pricing and capabilities of different LLMs to achieve competitive quality at lower cost. Given a query dataset and a user-specified quality threshold, we aim to select an LLM for each module to minimize the system's average cost while ensuring that overall quality meets the required threshold. To solve this problem, we propose SCOPE, a cost-efficient optimization algorithm. Unlike existing approaches that rely on expensive dataset-level evaluations, SCOPE exploits per-query results to rapidly estimate the system's cost and quality, and constructs confidence bounds to guide the search for promising LLM combinations. Furthermore, SCOPE provides theoretical guarantees for meeting the quality threshold and achieving near-optimal average cost. We evaluate SCOPE against 7 baselines on three data processing tasks, demonstrating that it outperforms all baselines. Under the same search budget and quality constraint, it finds solutions with up to $20\times$ lower cost than the best competitor during the search and achieves up to $6\times$ lower final cost in the returned solution.
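
The constrained selection problem described in the abstract can be illustrated with a small sketch. This is not SCOPE itself, whose search procedure and bounds are not given here. It is a brute-force stand-in that enumerates every module-to-LLM assignment, evaluates each one query by query, and uses a Hoeffding interval on mean quality to stop early. The names `select_configuration`, `hoeffding_radius`, `run_system`, and the toy prices are all hypothetical, and the sketch assumes per-query quality scores lie in [0, 1].

```python
import itertools
import math


def hoeffding_radius(n, delta=0.05):
    """Half-width of a two-sided Hoeffding interval for n samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def select_configuration(models, n_modules, queries, run_system, threshold, delta=0.05):
    """Return (mean_cost, config) for the cheapest configuration judged feasible, or None.

    run_system(config, query) -> (quality in [0, 1], cost) is a hypothetical callback.
    Simplification: the interval is checked after every query with a fixed delta,
    without the union-bound correction a rigorous sequential test would need.
    """
    best = None
    for config in itertools.product(models, repeat=n_modules):
        quality_sum = cost_sum = 0.0
        n = 0
        feasible = False
        for query in queries:
            quality, cost = run_system(config, query)
            quality_sum += quality
            cost_sum += cost
            n += 1
            mean_quality = quality_sum / n
            radius = hoeffding_radius(n, delta)
            if mean_quality + radius < threshold:
                break  # upper bound is below the threshold: stop paying for this config
            if mean_quality - radius >= threshold:
                feasible = True
                break  # lower bound clears the threshold: enough evidence
        if feasible:
            mean_cost = cost_sum / n
            if best is None or mean_cost < best[0]:
                best = (mean_cost, config)
    return best


# Toy two-module system with made-up prices: only the first module needs the large model.
PRICES = {"small": 1.0, "large": 10.0}


def toy_run_system(config, query):
    quality = 1.0 if config[0] == "large" else 0.0
    cost = sum(PRICES[m] for m in config)
    return quality, cost


result = select_configuration(["small", "large"], 2, range(100), toy_run_system, threshold=0.8)
```

In the toy setup, configurations with a small first module are discarded after a handful of queries, which is the kind of saving that per-query evaluation makes possible compared with scoring every configuration on the full dataset. Exhaustive enumeration grows exponentially with the number of modules, so a practical method would need to prioritize which configurations to sample.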

Problem

Research questions and friction points this paper is trying to address.

compound AI systems

LLM selection

cost efficiency

quality constraints

model selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

compound AI systems

cost-efficient LLM selection

quality-constrained optimization