🤖 AI Summary
This work studies the trade-off between cost and reliability for deadline-constrained jobs in hybrid cloud environments, where jobs can run on inexpensive but unreliable spot instances or on costlier but dependable on-demand instances. The authors first prove that existing, predominantly deterministic policies have a worst-case competitive ratio of Ω(K), where K is the cost ratio between on-demand and spot instances. They then propose ROSS, a randomized online scheduling algorithm that achieves a provably optimal competitive ratio of √K under reasonable deadlines. In evaluations on real-world spot market traces from Azure and AWS, ROSS reduces cost by up to 30% relative to state-of-the-art approaches across diverse market conditions while preserving deadline guarantees.
📝 Abstract
This paper addresses deadline-aware online scheduling in hybrid cloud environments, where jobs with hard deadlines may run on either cost-effective but unreliable spot instances or more expensive on-demand instances. We first establish a fundamental limit for existing (predominantly) deterministic policies, proving a worst-case competitive ratio of $\Omega(K)$, where $K$ is the cost ratio between on-demand and spot instances. We then present a novel randomized scheduling algorithm, ROSS, that achieves a provably optimal competitive ratio of $\sqrt{K}$ under reasonable deadlines, significantly improving upon existing approaches. Extensive evaluations on real-world trace data from Azure and AWS demonstrate that ROSS effectively balances cost optimization and deadline guarantees, outperforming the state of the art by up to $30\%$ in cost savings across diverse spot market conditions.
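To give a rough sense of the gap between the two bounds, the sketch below compares $K$ with $\sqrt{K}$ for a few cost ratios. It is an illustration only, not the ROSS algorithm: the function name `bound_gap` and the sample values of $K$ are hypothetical, and the constants hidden by the asymptotic notation are assumed to be 1, which the abstract does not state.

```python
import math


def bound_gap(cost_ratio: float) -> dict:
    """Compare the two bounds for a given on-demand/spot cost ratio K.

    Assumption (not from the paper): hidden constants are set to 1,
    so the Omega(K) lower bound is treated as exactly K.
    """
    if cost_ratio < 1:
        raise ValueError("K is assumed to be at least 1")
    deterministic = float(cost_ratio)    # Omega(K) bound with constant 1
    randomized = math.sqrt(cost_ratio)   # sqrt(K) ratio stated for ROSS
    return {
        "K": cost_ratio,
        "deterministic": deterministic,
        "randomized": randomized,
        "gap": deterministic / randomized,  # equals sqrt(K) under this assumption
    }


# Hypothetical cost ratios, chosen only for illustration.
for k in (4, 9, 16):
    print(bound_gap(k))
```

Under these assumptions the gap between the bounds is itself $\sqrt{K}$, so it widens as on-demand instances become more expensive relative to spot instances.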