🤖 AI Summary
This work addresses the challenge of scheduling graph-structured workflows with dynamically arriving tasks and heterogeneous deadlines onto time-varying cloud resources. The authors propose DEFT, a deep reinforcement learning (DRL) scheduler based on a Mixture-of-Experts (MoE) architecture. DEFT combines deadline-aware expert specialization with a graph-adaptive gating strategy that uses graph neural networks and cross-attention to route each decision to the most suitable experts, enabling fine-grained, deadline-sensitive scheduling. The authors describe DEFT as, to their knowledge, the first approach to integrate MoE into dynamic cloud workflow scheduling. On dynamic cloud workflow benchmarks, it significantly reduces both execution cost and deadline violations, outperforming multiple state-of-the-art DRL baselines.
📝 Abstract
Workflow scheduling in cloud computing requires allocating dynamically arriving, graph-structured workflows with varying deadlines onto continually changing virtual machine (VM) resources. However, existing deep reinforcement learning (DRL) schedulers are limited by rigid, single-path inference architectures that struggle to handle diverse scheduling scenarios. We introduce DEFT (Deadline-pErceptive Mixture-oF-ExperTs), a DRL policy architecture built on a specialized mixture of experts, each trained to manage a different level of deadline tightness. To our knowledge, DEFT is the first to introduce and validate a Mixture-of-Experts architecture for dynamic cloud workflow scheduling. By adaptively routing decisions through the most appropriate experts, DEFT can meet a broad spectrum of deadline requirements that no single expert can satisfy alone. Central to DEFT is a graph-adaptive gating mechanism that encodes workflow deadlines and DAGs, task states, and VM conditions, and uses cross-attention to guide expert activation in a fine-grained, deadline-sensitive manner. Experiments on dynamic cloud workflow benchmarks demonstrate that DEFT significantly reduces execution cost and deadline violations, outperforming multiple state-of-the-art DRL baselines.
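To make the routing idea concrete, the following is a minimal NumPy sketch of deadline-aware expert gating with cross-attention. It is an illustration under stated assumptions, not the DEFT implementation: the function names (`cross_attention`, `moe_vm_scores`), the scalar `slack` feature standing in for deadline tightness, the bilinear experts, the top-k routing, and all dimensions are hypothetical, and the graph neural network that would produce the task and VM embeddings is replaced by random vectors.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention of one query over n keys."""
    scores = keys @ query / np.sqrt(query.shape[0])  # shape (n,)
    weights = softmax(scores)                        # shape (n,)
    return weights @ values                          # shape (d,)

def moe_vm_scores(task_emb, vm_embs, slack, gate_w, expert_ws, top_k=2):
    """Return (VM selection probabilities, gate probabilities over experts).

    task_emb:  (d,)      embedding of the ready task (assumed to come from a GNN)
    vm_embs:   (n, d)    embeddings of the candidate VMs
    slack:     float     hypothetical deadline-tightness feature
    gate_w:    (E, 2d+1) gating weights (assumed, untrained here)
    expert_ws: (E, d, d) one bilinear scoring matrix per expert (assumed)
    """
    # The task attends over the VMs to summarize current resource conditions.
    context = cross_attention(task_emb, vm_embs, vm_embs)
    # The gate sees the task, the VM context, and the deadline feature.
    gate_in = np.concatenate([task_emb, context, [slack]])
    gate_probs = softmax(gate_w @ gate_in)                 # shape (E,)
    # Keep only the top-k experts and renormalize their weights.
    top = np.argsort(gate_probs)[-top_k:]
    weights = gate_probs[top] / gate_probs[top].sum()
    # Each selected expert scores every VM for this task.
    expert_scores = np.stack(
        [vm_embs @ (expert_ws[e] @ task_emb) for e in top]
    )                                                      # shape (k, n)
    mixed = weights @ expert_scores                        # shape (n,)
    return softmax(mixed), gate_probs

# Toy usage with random, untrained parameters.
rng = np.random.default_rng(0)
d, n_vms, n_experts = 8, 5, 4
task_emb = rng.normal(size=d)
vm_embs = rng.normal(size=(n_vms, d))
gate_w = rng.normal(size=(n_experts, 2 * d + 1))
expert_ws = rng.normal(size=(n_experts, d, d))

vm_probs, gate_probs = moe_vm_scores(task_emb, vm_embs, 0.3, gate_w, expert_ws)
print("gate over experts:", np.round(gate_probs, 3))
print("VM selection probs:", np.round(vm_probs, 3))
```

In this sketch the deadline feature enters only through the gate, so a change in tightness changes which experts are activated rather than how each expert scores VMs. That is one plausible reading of "deadline-sensitive" routing; the abstract does not specify how the gate input is constructed or how many experts are activated per decision.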