🤖 AI Summary
Manually scheduling parallel workflows across the heterogeneous resources of the compute continuum (the convergence of IoT, edge, cloud, and HPC) is inefficient, and interoperability between these domains is poor. To address this, the paper proposes a unified, workflow-driven modeling and scheduling framework for the compute continuum. The approach models the system and the workload together, abstracts and maps resources across domains, and combines a mixed-integer linear programming (MILP) solver with lightweight heuristic algorithms. For small-scale workflows, the MILP formulation yields optimal schedules and minimal makespan; for large-scale workflows, the heuristics produce schedule estimates up to 99% faster, with solution quality within a 5–10% deviation from optimal. The reported experiments show reduced execution times and improved resource utilization, addressing a gap in automated modeling and joint optimization for collaborative cloud–HPC scheduling.
📝 Abstract
The convergence of IoT, Edge, Cloud, and HPC technologies creates a compute continuum that merges cloud scalability and flexibility with HPC's computational power and specialized optimizations. However, integrating cloud and HPC resources often introduces latency and communication overhead, which can hinder the performance of tightly coupled parallel applications. Additionally, achieving seamless interoperability between cloud and on-premises HPC systems requires advanced scheduling, resource management, and data transfer protocols. In the absence of an automated scheduling mechanism, users must manually allocate complex workloads across heterogeneous resources, which leads to suboptimal task placement and reduced efficiency. To overcome these challenges, we introduce a comprehensive framework based on rigorous system and workload modeling for the compute continuum. Our method employs established tools and techniques to optimize workload mapping and scheduling, enabling the automatic orchestration of tasks across both cloud and HPC infrastructures. Experimental evaluations show that our approach improves scheduling efficiency, reduces execution times, and enhances resource utilization. Specifically, our MILP-based solution achieves optimal scheduling and makespan for small-scale workflows, while heuristic methods offer up to 99% faster estimations for large-scale workflows, albeit with a 5–10% deviation from optimal results. Our primary contribution is a robust system and workload modeling framework that addresses critical gaps in existing tools, paving the way for fully automated orchestration in HPC-compute continuum environments.
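To make the trade-off between exact and heuristic scheduling concrete, the following is a minimal sketch of one common lightweight heuristic: greedy earliest-finish-time list scheduling of a task graph on heterogeneous resources. It is not the paper's algorithm. The function name `list_schedule`, its parameters, the fixed `transfer` delay between resources, and the three-task workflow are all assumptions made for illustration. A heuristic of this kind visits each task once, which is why such methods tend to scale to large workflows, but it carries no optimality guarantee, unlike a MILP formulation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def list_schedule(durations, deps, transfer=0.0):
    """Greedy earliest-finish-time placement of a task DAG (illustrative only).

    durations: {task: {resource: runtime}} (hypothetical runtime estimates)
    deps:      {task: set of predecessor tasks}
    transfer:  assumed fixed delay when a predecessor ran on another resource
    Returns (placement, makespan); placement maps task -> (resource, start, finish).
    """
    resource_free = {}  # resource -> time at which it becomes idle
    placement = {}
    graph = {task: deps.get(task, ()) for task in durations}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        best = None
        for res, runtime in durations[task].items():
            # Inputs are ready once each predecessor has finished,
            # plus the transfer delay if it ran on a different resource.
            ready = max(
                (
                    placement[p][2] + (transfer if placement[p][0] != res else 0.0)
                    for p in deps.get(task, ())
                ),
                default=0.0,
            )
            start = max(ready, resource_free.get(res, 0.0))
            finish = start + runtime
            if best is None or finish < best[2]:
                best = (res, start, finish)
        placement[task] = best
        resource_free[best[0]] = best[2]
    makespan = max(finish for _, _, finish in placement.values())
    return placement, makespan


# Hypothetical three-stage workflow with made-up runtimes on two resources.
durations = {
    "prep": {"cloud": 2.0, "hpc": 1.0},
    "sim": {"cloud": 8.0, "hpc": 3.0},
    "post": {"cloud": 1.0, "hpc": 2.0},
}
deps = {"sim": {"prep"}, "post": {"sim"}}

placement, makespan = list_schedule(durations, deps, transfer=0.5)
print(placement, makespan)
```

In this toy instance the heuristic keeps the first two stages on the HPC resource and moves the last stage to the cloud, because the shorter cloud runtime outweighs the assumed transfer delay. An exact solver would explore all assignments jointly instead of committing task by task.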