🤖 AI Summary
This work addresses the inefficiency of static model allocation in graph-structured multi-agent systems, which wastes computational resources on simple subtasks and struggles to balance cost against performance. To this end, we propose CASTER—a lightweight, context-aware routing strategy that dynamically assesses task difficulty by fusing semantic embeddings and graph-structural meta-features through a dual-signal router, thereby selecting a model of appropriate capability for each subtask. CASTER employs a self-optimizing training paradigm that evolves from cold start to iterative refinement, leveraging LLM-as-a-Judge evaluation and a self-supervised negative-feedback mechanism to continuously improve routing decisions. Experiments across software engineering, data analysis, scientific discovery, and cybersecurity demonstrate that CASTER achieves success rates comparable to full-capability model baselines while reducing inference costs by up to 72.4%, significantly outperforming heuristic routing and FrugalGPT.
📝 Abstract
Graph-based Multi-Agent Systems (MAS) enable complex cyclic workflows but suffer from inefficient static model allocation, where deploying strong models uniformly wastes computation on trivial sub-tasks. We propose CASTER (Context-Aware Strategy for Task Efficient Routing), a lightweight router for dynamic model selection in graph-based MAS. CASTER employs a Dual-Signal Router that combines semantic embeddings with structural meta-features to estimate task difficulty. During training, the router self-optimizes through a Cold Start to Iterative Evolution paradigm, learning from its own routing failures via on-policy negative feedback. Experiments using LLM-as-a-Judge evaluation across Software Engineering, Data Analysis, Scientific Discovery, and Cybersecurity demonstrate that CASTER reduces inference cost by up to 72.4% compared to strong-model baselines while matching their success rates, and consistently outperforms both heuristic routing and FrugalGPT across all domains.
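To make the dual-signal idea concrete, here is a minimal sketch of a router that fuses a sub-task's semantic embedding with graph-structural meta-features into a difficulty score and picks a model tier accordingly. All names, features, weights, and thresholds below are illustrative assumptions, not the paper's actual architecture (which is a learned router, not a hand-weighted one).

```python
# Hypothetical dual-signal routing sketch: fuse a semantic embedding of the
# sub-task with graph-structural meta-features, estimate difficulty, and route
# easy sub-tasks to a cheap model tier and hard ones to a strong tier.
# All weights and thresholds here are made up for illustration.
from dataclasses import dataclass


@dataclass
class SubTask:
    embedding: list[float]  # semantic embedding of the sub-task prompt
    in_degree: int          # graph meta-feature: number of upstream agents
    depth: int              # graph meta-feature: distance from the root node
    retries: int            # graph meta-feature: prior failures on this node


def difficulty_score(task: SubTask) -> float:
    """Fuse both signals into one scalar difficulty estimate."""
    semantic = sum(task.embedding) / len(task.embedding)  # crude semantic signal
    structural = 0.2 * task.in_degree + 0.1 * task.depth + 0.5 * task.retries
    return semantic + structural


def route(task: SubTask, threshold: float = 1.0) -> str:
    """Pick a model tier; hard sub-tasks go to the strong (expensive) model."""
    return "strong-model" if difficulty_score(task) >= threshold else "cheap-model"


easy = SubTask(embedding=[0.1, 0.2, 0.1], in_degree=1, depth=1, retries=0)
hard = SubTask(embedding=[0.6, 0.7, 0.8], in_degree=3, depth=4, retries=1)
print(route(easy))  # cheap-model
print(route(hard))  # strong-model
```

In CASTER proper, the fusion and scoring are learned (cold start followed by iterative refinement on routing failures) rather than hand-weighted as above, but the interface is the same: sub-task signals in, model tier out.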