🤖 AI Summary
This study addresses the problem of scheduling delay-sensitive tasks on spot and on-demand cloud instances under an average latency constraint, with the goal of minimizing average cost. Modeling the system with queueing theory and stochastic processes, and drawing on convex optimization and knapsack analysis, the work characterizes the optimal scheduling structure in both the low- and high-latency regimes: it proves that a queue length of one is optimal in the former, and designs a knapsack-based policy with an approximation guarantee in the latter. An adaptive scheduling algorithm is further proposed to dynamically exploit the full allowable latency window. Experimental results show that the algorithm achieves cost close to the theoretical optimum while balancing latency constraints against resource expenditure. This work provides the first analytical treatment of scheduling across hybrid spot and on-demand instances under latency guarantees.
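To make the knapsack structure mentioned above concrete, here is a toy sketch (not the paper's actual policy): routing a job to a cheaper spot instance saves cost but adds waiting time, and the average-delay constraint acts as a shared wait budget, which yields a 0/1 knapsack. All numbers, names, and the `pick_spot_jobs` helper are illustrative assumptions.

```python
def pick_spot_jobs(savings, waits, wait_budget):
    """0/1 knapsack: maximize cost savings subject to a total-wait budget.

    savings[i]  - cost saved if job i runs on a spot instance
    waits[i]    - extra (integer) wait incurred by that choice
    wait_budget - total extra wait the delay constraint allows
    Returns (best_savings, chosen_indices).
    """
    n = len(savings)
    # dp[b] = best savings achievable with wait budget b
    dp = [0] * (wait_budget + 1)
    # choice[i][b] records whether job i improved dp[b] at its iteration
    choice = [[False] * (wait_budget + 1) for _ in range(n)]
    for i in range(n):
        for b in range(wait_budget, waits[i] - 1, -1):
            cand = dp[b - waits[i]] + savings[i]
            if cand > dp[b]:
                dp[b] = cand
                choice[i][b] = True
    # Backtrack to recover which jobs went to spot instances.
    chosen, b = [], wait_budget
    for i in range(n - 1, -1, -1):
        if choice[i][b]:
            chosen.append(i)
            b -= waits[i]
    return dp[wait_budget], sorted(chosen)

# Example: 4 jobs whose spot placement saves (6, 10, 12, 13) at extra
# waits (1, 2, 3, 4); the delay constraint leaves a wait budget of 5.
best, chosen = pick_spot_jobs([6, 10, 12, 13], [1, 2, 3, 4], 5)
print(best, chosen)  # → 22 [1, 2]  (savings 10 + 12 at wait 2 + 3)
```

The remaining jobs (here 0 and 3) would run on on-demand instances, paying full cost but incurring no extra wait.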
📝 Abstract
We study the problem of scheduling delay-sensitive jobs on spot and on-demand cloud instances to minimize average cost while meeting an average delay constraint. Jobs arrive according to a general stochastic process and incur different costs depending on the instance type. This work provides the first analytical treatment of this problem using tools from queueing theory, stochastic processes, and optimization. We derive cost expressions for general policies, prove that a queue length of one is optimal for low target delays, and characterize the optimal wait-time distribution. For high target delays, we identify a knapsack structure and design a scheduling policy that exploits it. An adaptive algorithm is proposed to fully utilize the allowed delay, and empirical results confirm its near-optimality.