🤖 AI Summary
Dynamic job scheduling for fault-tolerant quantum cloud services faces critical challenges—including unpredictable quantum program submissions, severe QPU resource fragmentation, and stringent sub-millisecond response requirements. Method: This work proposes the first online, real-time scheduling framework based on polycube approximation. It integrates lattice-surgery-aware resource modeling, geometric approximation of polycubes to cuboids, dynamic resource allocation, and online memory defragmentation. Contribution/Results: The framework achieves ultra-low-latency scheduling (<10 ms) while significantly improving system throughput (+32.7%) and average QPU utilization (+41.5%) over conventional approaches. It is the first to jointly optimize high responsiveness and high resource efficiency—establishing a foundational scheduling mechanism for scalable, fault-tolerant quantum cloud computing.
📝 Abstract
Fault-tolerant quantum computers are expected to be offered as cloud services due to their significant resource and infrastructure requirements. Quantum multiprogramming, which runs multiple quantum jobs in parallel, is a promising approach to maximize the utilization of such systems. A key challenge in this setting is the need for an online scheduler capable of handling jobs submitted dynamically while other programs are already running. In this study, we formulate the online job scheduling problem for fault-tolerant quantum computing systems based on lattice surgery and propose an efficient scheduler to address it. To meet the responsiveness required in an online environment, our scheduler approximates lattice surgery programs, originally represented as polycubes, by using simpler cuboid representations. This approximation enables efficient scheduling while improving overall throughput. In addition, we incorporate a defragmentation mechanism into the scheduling process, demonstrating that it can further enhance QPU utilization.