🤖 AI Summary
This work addresses the resource scheduling challenges faced by hybrid quantum-classical applications in heterogeneous and dynamic computing environments, where existing high-performance computing (HPC) schedulers lack awareness of application semantics and cannot adapt at runtime. To overcome these limitations, the authors propose a four-layer middleware architecture together with a set of execution motifs for hybrid applications, the Pilot-Quantum dynamic scheduling framework, and the Q-Dreamer performance modeling toolkit. Together these enable application-aware, adaptive resource management and informed quantum circuit partitioning. The system supports coordinated scheduling across CPU, GPU, and quantum processing unit (QPU) backends and was evaluated on Perlmutter and NVIDIA DGX platforms. In the reported experiments, Q-Dreamer predicts optimal circuit-cutting configurations with up to 82% accuracy.
📝 Abstract
Hybrid quantum-classical applications pose significant resource management challenges due to heterogeneity and dynamism in both infrastructure and workloads. Quantum-HPC environments integrate quantum processing units (QPUs) with diverse classical resources (CPUs, GPUs), while applications span coupling patterns from tightly coupled execution to loosely coupled task parallelism with varying resource requirements. Traditional HPC schedulers lack visibility into application semantics and cannot respond to fluctuating resource availability at runtime. This paper presents a middleware-based approach for adaptive resource, workload, and task management in hybrid quantum-HPC systems. We make four contributions: (i) a conceptual four-layer middleware architecture that decomposes management across workflow, workload, task, and resource levels, enabling application-aware scheduling over heterogeneous quantum-HPC resources; (ii) a set of execution motifs capturing interaction and coupling characteristics of hybrid applications, realized as quantum mini-apps for systematic workload characterization; (iii) Pilot-Quantum, a middleware framework built on the pilot abstraction that enables late binding and dynamic resource allocation, adapting to resource and workload dynamics at runtime; and (iv) Q-Dreamer, a performance modeling toolkit providing reusable components for informed workload partitioning, including a circuit-cutting optimizer that analytically derives optimal partitioning strategies. Evaluation on heterogeneous HPC platforms (Perlmutter, NVIDIA DGX with H100/B200 GPUs) demonstrates efficient multi-backend orchestration across CPUs, GPUs, and QPUs for diverse execution motifs. Q-Dreamer predicts optimal circuit-cutting configurations with up to 82% accuracy.
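The late binding that the pilot abstraction provides can be illustrated with a minimal sketch. This is not the Pilot-Quantum API: the `Task` and `Pilot` classes, their fields, and the slot-counting model are hypothetical names chosen for illustration. The sketch assumes only the general idea of a pilot, namely a placeholder allocation that holds resources and decides at runtime, rather than at submission time, which queued task runs on which backend.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record: a name and the backend kind it needs."""
    name: str
    kind: str  # "cpu", "gpu", or "qpu"


@dataclass
class Pilot:
    """Hypothetical pilot: holds slots already acquired from the system scheduler."""
    free_slots: dict                      # backend kind -> number of free slots
    queue: deque = field(default_factory=deque)
    placements: list = field(default_factory=list)

    def submit(self, task):
        # Submission only enqueues; no resource is chosen yet (late binding).
        self.queue.append(task)

    def schedule(self):
        """Bind queued tasks to slots that are free right now; return how many were bound."""
        still_waiting = deque()
        bound = 0
        while self.queue:
            task = self.queue.popleft()
            if self.free_slots.get(task.kind, 0) > 0:
                self.free_slots[task.kind] -= 1
                self.placements.append((task.name, task.kind))
                bound += 1
            else:
                still_waiting.append(task)  # keep waiting for a matching slot
        self.queue = still_waiting
        return bound

    def release(self, kind):
        # A slot becomes available, e.g. a task finished or the pilot grew.
        self.free_slots[kind] = self.free_slots.get(kind, 0) + 1
```

In this toy model, a task that needs a GPU simply stays queued while no GPU slot is free and is bound on the next `schedule()` call after `release("gpu")`. A real middleware would additionally handle task execution, failures, and resource acquisition from the batch system, none of which is modeled here.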