🤖 AI Summary
This paper addresses dynamic scheduling of parallel servers under a fluid model, where task-size information is leveraged to minimize average delay. For stochastic, discrete task arrivals, we formulate a fluid approximation and apply optimal control theory combined with variational methods to rigorously characterize the optimal size-aware routing structure: the optimal trajectory evolves along a specific piecewise-smooth manifold in high-dimensional state space, embodying a synergistic interplay between size prioritization and load balancing. We derive an explicit mathematical characterization of the optimal policy. Numerical experiments demonstrate that this policy significantly reduces average delay compared to classical heuristics—by 18%–35% relative to Join-the-Shortest-Queue (JSQ) and Shortest-Remaining-Processing-Time (SRPT). The work establishes a provably optimal theoretical framework and design principles for real-time, size-aware scheduling in large-scale distributed service systems.
📝 Abstract
We develop a fluid-flow model for routing problems, where fluid consists of different size particles and the task is to route the incoming fluid to $n$ parallel servers using the size information in order to minimize the mean latency. The problem corresponds to the dispatching problem of (discrete) jobs arriving according to a stochastic process. In the fluid model the problem reduces to finding an optimal path to empty the system in $n$-dimensional space. We use the calculus of variation to characterize the structure of optimal policies. Numerical examples shed further light on the fluid routing problem and the optimal control of large distributed service systems.