Mastering Multiple-Expert Routing: Realizable $H$-Consistency and Strong Guarantees for Learning to Defer

📅 2025-06-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper studies learning to defer with multiple experts: routing input instances among a predictor and several experts so as to balance prediction accuracy against computational cost, with applications in natural language generation, image processing, and medical diagnosis. The authors introduce new surrogate loss functions and efficient algorithms with strong theoretical guarantees, addressing open questions on realizable *H*-consistency, *H*-consistency bounds, and Bayes-consistency in both the single-stage setting (jointly learning the predictor and deferral function) and the two-stage setting (learning only the deferral function with fixed experts). They further establish enhanced guarantees under low-noise assumptions for both settings, and report experiments comparing the proposed surrogate losses against existing baselines.

📝 Abstract
The problem of learning to defer with multiple experts consists of optimally assigning input instances to experts, balancing the trade-off between their accuracy and computational cost. This is a critical challenge in natural language generation, but also in other fields such as image processing and medical diagnostics. Recent studies have proposed surrogate loss functions to optimize deferral, but challenges remain in ensuring their consistency properties. This paper introduces novel surrogate loss functions and efficient algorithms with strong theoretical learning guarantees. We address open questions regarding realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for both single-stage (jointly learning the predictor and deferral function) and two-stage (learning only the deferral function with a fixed expert) learning scenarios. For single-stage deferral, we introduce a family of new realizable $H$-consistent surrogate losses and further prove $H$-consistency for a selected member. For two-stage deferral, we derive new surrogate losses that achieve realizable $H$-consistency, $H$-consistency bounds, and Bayes-consistency for the two-expert scenario and, under natural assumptions, the multiple-expert scenario. Additionally, we provide enhanced theoretical guarantees under low-noise assumptions for both scenarios. Finally, we report the results of experiments using our proposed surrogate losses, comparing their performance against existing baselines.
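The deferral decision the abstract describes can be illustrated with a minimal sketch: at inference time, each option (the base predictor or one of the experts) is scored by its estimated error plus a per-option cost, and the instance is routed to the minimizer. This is only an illustration of the accuracy/cost trade-off, not the paper's surrogate losses or algorithms; all names and numbers below are hypothetical.

```python
# Illustrative cost-sensitive routing rule for learning to defer.
# Option 0 is the base predictor (cost 0); options 1..n are experts
# with increasing query cost. Values are made up for the example.

def route(estimated_error, costs):
    """Return the index j minimizing estimated_error[j] + costs[j].

    estimated_error[j]: estimated probability that option j errs on x.
    costs[j]: computational cost charged for querying option j.
    """
    scores = [e + c for e, c in zip(estimated_error, costs)]
    return min(range(len(scores)), key=scores.__getitem__)

# Base model errs often (0.30); expert 1 is accurate and cheap enough
# that its total score 0.10 + 0.05 = 0.15 wins over both alternatives.
choice = route(estimated_error=[0.30, 0.10, 0.05], costs=[0.0, 0.05, 0.25])
```

In the two-stage setting the experts (and hence `estimated_error` for each) are fixed and only the routing rule is learned; in the single-stage setting the base predictor is trained jointly with it.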
Problem

Research questions and friction points this paper is trying to address.

Optimizing expert assignment balancing accuracy and cost
Ensuring consistency in surrogate loss functions
Developing learning guarantees for single-stage and two-stage deferral
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel surrogate loss functions for deferral
Efficient algorithms with theoretical guarantees
Realizable H-consistency for multiple experts