🤖 AI Summary
This work addresses the open problem of efficiently dispatching requests in the Internet of Agents under latency, privacy, and cost constraints. The authors propose AgentGate, a lightweight routing engine that formulates agent routing as a constrained structured decision problem rather than unrestricted text generation. It uses a two-stage design: the first stage decides on an action, and the second grounds that action in an executable structured output. To adapt compact models to resource-constrained deployment, the method adds a routing-oriented fine-tuning scheme with candidate-aware supervision and hard negative examples. On a curated routing benchmark, open-weight models with 3B to 7B parameters achieve competitive routing performance, with differences between models appearing mainly in action prediction, candidate selection, and structured grounding quality.
📝 Abstract
The rapid development of AI agent systems is leading to an emerging Internet of Agents, where specialized agents operate across local devices, edge nodes, private services, and cloud platforms. Although recent efforts have improved agent naming, discovery, and interaction, efficient request dispatch remains an open systems problem under latency, privacy, and cost constraints. In this paper, we present AgentGate, a lightweight structured routing engine for candidate-aware agent dispatch. Instead of treating routing as unrestricted text generation, AgentGate formulates it as a constrained decision problem and decomposes it into two stages: action decision and structural grounding. The first stage determines whether a query should trigger single-agent invocation, multi-agent planning, direct response, or safe escalation, while the second stage instantiates the selected action into executable outputs such as target agents, structured arguments, or multi-step plans. To adapt compact models to this setting, we further develop a routing-oriented fine-tuning scheme with candidate-aware supervision and hard negative examples. Experiments on a curated routing benchmark with several 3B to 7B open-weight models show that compact models can provide competitive routing performance in constrained settings, and that model differences are mainly reflected in action prediction, candidate selection, and structured grounding quality. These results indicate that structured routing is a feasible design point for efficient and privacy-aware agent systems, especially when routing decisions must be made under resource-constrained deployment conditions.
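To make the two-stage decomposition concrete, the following is a minimal illustrative sketch, not the authors' implementation. The four action labels follow the abstract; everything else (the names `Action`, `RoutingDecision`, and `ground`, the output fields, and the rule of falling back to escalation when a proposed agent is outside the candidate set) is an assumption made for illustration. The first stage, a model that predicts the action, is represented here only by the `action` argument.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    # The four stage-1 outcomes named in the abstract.
    SINGLE_AGENT = "single_agent"
    MULTI_AGENT_PLAN = "multi_agent_plan"
    DIRECT_RESPONSE = "direct_response"
    ESCALATE = "escalate"


@dataclass
class RoutingDecision:
    # Hypothetical structured output; field names are assumptions.
    action: Action
    targets: list[str] = field(default_factory=list)  # agent names, in plan order
    arguments: dict = field(default_factory=dict)      # structured call arguments


def ground(action: Action, proposed: list[str], candidates: set[str]) -> RoutingDecision:
    """Stage 2 (sketch): turn a predicted action into a validated decision.

    `proposed` stands in for the agent names a model would emit; the check
    against `candidates` is what keeps the output constrained.
    """
    # Actions that need no target agent pass through unchanged.
    if action in (Action.DIRECT_RESPONSE, Action.ESCALATE):
        return RoutingDecision(action)
    # Reject empty proposals and any agent outside the candidate set.
    if not proposed or any(name not in candidates for name in proposed):
        return RoutingDecision(Action.ESCALATE)
    # Single-agent invocation must name exactly one target.
    if action is Action.SINGLE_AGENT and len(proposed) != 1:
        return RoutingDecision(Action.ESCALATE)
    return RoutingDecision(action, targets=list(proposed))
```

For example, `ground(Action.SINGLE_AGENT, ["calendar"], {"calendar", "mail"})` yields a single-agent decision targeting `calendar`, while proposing an agent that is not among the candidates yields an escalation. Restricting outputs to a closed action set and a known candidate list is one simple way to read "constrained decision problem"; the paper's actual grounding procedure may differ.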