CUPID in the Model Zoo: Online Matchmaking for Selecting Your Dream LLM

📅 2026-05-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the problem of efficiently selecting a suitable large language model (LLM) from a rapidly growing pool of models whose characteristics are often opaque, which makes it hard for users to identify options that match their implicit preferences. The authors propose an interaction-efficient active learning framework that combines a dueling bandit algorithm with Bayesian preference modeling. A belief-aware upper confidence bound strategy balances exploration and exploitation, iteratively refining model recommendations under user-specified time and cost budgets. Experiments on LLMs and human studies indicate that the method matches users with well-aligned LLMs at a lower cost.
📝 Abstract
Users increasingly face the challenge of selecting an appropriate LLM for a given task from a rapidly growing pool of LLMs, each with distinct but often opaque latent properties. Compounding this challenge, users may lack the vocabulary or awareness to explicitly articulate the characteristics they value in an LLM's responses or deployment. We propose an interaction-efficient active learning framework in which a dueling bandit algorithm iteratively selects pairs of LLMs, collects user feedback about their responses, and updates its belief about the user's latent preferences. We introduce a novel belief-aware upper confidence bound strategy that balances exploration of the model pool with exploitation of inferred preferences, enabling efficient alignment between user needs and LLM capabilities under user-specified cost and time budgets. Through diverse experiments on LLMs and human studies, we verify that our framework can efficiently match well-aligned LLMs to users at a lower cost.
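The loop described in the abstract (pick a pair of LLMs, ask the user which response they prefer, update a belief over latent preferences, stop when the budget runs out) can be illustrated with a small sketch. This is not the paper's algorithm: the particle-based belief, the Bradley-Terry style likelihood, the per-model feature vectors, the cost table, and every name below (`select_llm`, `ask_user`, `MODELS`, `COSTS`, `TRUE_W`) are assumptions chosen only to make the idea concrete.

```python
import math
import random


def utility(w, features):
    """Linear utility of a model's (hypothetical) feature vector under preference w."""
    return sum(wi * fi for wi, fi in zip(w, features))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def posterior_stats(particles, weights, features):
    """Posterior mean and standard deviation of one model's utility."""
    utils = [utility(w, features) for w in particles]
    mean = sum(p * u for p, u in zip(weights, utils))
    var = sum(p * (u - mean) ** 2 for p, u in zip(weights, utils))
    return mean, math.sqrt(max(var, 0.0))


def update_belief(particles, weights, f_win, f_lose):
    """Bayes update with an assumed Bradley-Terry likelihood for 'winner beat loser'."""
    new = [p * sigmoid(utility(w, f_win) - utility(w, f_lose))
           for w, p in zip(particles, weights)]
    total = sum(new)
    return [x / total for x in new]


def select_llm(models, costs, ask_user, budget, beta=1.0, n_particles=500, seed=0):
    """Run duels until the next one would exceed the budget; return (best, spent)."""
    rng = random.Random(seed)
    names = list(models)
    dim = len(models[names[0]])
    # Belief over the user's latent preference vector: weighted samples from a N(0, I) prior.
    particles = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    spent = 0.0
    while True:
        stats = {m: posterior_stats(particles, weights, models[m]) for m in names}
        # Optimistic score: posterior mean plus beta times posterior uncertainty.
        ucb = {m: mu + beta * sd for m, (mu, sd) in stats.items()}
        a, b = sorted(names, key=lambda m: ucb[m], reverse=True)[:2]
        duel_cost = costs[a] + costs[b]
        if spent + duel_cost > budget:
            break
        spent += duel_cost
        winner = ask_user(a, b)
        loser = b if winner == a else a
        weights = update_belief(particles, weights, models[winner], models[loser])
    # Recommend the model with the highest posterior mean utility.
    best = max(names, key=lambda m: stats[m][0])
    return best, spent


# Toy setup: three hypothetical models described by two made-up attributes.
MODELS = {"model_a": [0.9, 0.1], "model_b": [0.2, 0.8], "model_c": [0.5, 0.5]}
COSTS = {"model_a": 3.0, "model_b": 1.0, "model_c": 2.0}
TRUE_W = [1.0, -0.5]  # hidden from the algorithm; used only to simulate feedback


def simulated_user(a, b):
    """Stand-in for a human: prefers the model with higher true utility."""
    return a if utility(TRUE_W, MODELS[a]) >= utility(TRUE_W, MODELS[b]) else b


best, spent = select_llm(MODELS, COSTS, simulated_user, budget=20.0)
print(best, spent)
```

In this sketch the pair to compare is simply the two models with the highest optimistic scores, and `beta` controls how strongly uncertainty is rewarded. A real system would replace `simulated_user` with actual user feedback on model responses and would likely use a richer belief model than a fixed set of weighted samples.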
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Model Selection
User Preferences
Latent Properties
Active Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

active learning
dueling bandits
LLM selection
preference elicitation
belief-aware UCB