🤖 AI Summary
Problem: In geographically distributed, multi-organizational scientific computing environments, logically centralized task schedulers (e.g., Kubernetes) struggle with cross-domain collaboration, infrastructure dynamism, and tight coupling between workflows and platforms. Method: We propose a decentralized control plane that uses semantic naming to bind computational requests to Kubernetes endpoints automatically and independently of location, with no need for pre-configuration or location awareness. Our approach integrates lightweight service discovery with cross-cluster resource orchestration to support dynamic, adaptive scheduling across organizational boundaries. Contribution/Results: Experiments demonstrate that, without a global controller, our system improves scheduling flexibility and cross-cluster workflow portability. It establishes a distributed scheduling paradigm for scientific computing characterized by high adaptability, low platform coupling, and support for heterogeneous, evolving infrastructures.
📝 Abstract
Scientific communities are increasingly using geographically distributed computing platforms. Current compute placement methods predominantly rely on logically centralized controllers such as Kubernetes (K8s) to match tasks to available resources. However, this centralized approach is unsuitable for multi-organizational collaborations. Furthermore, workflows often depend on manual configurations tailored to a single platform and cannot adapt to dynamic changes across the infrastructure. Our work introduces a decentralized control plane for placing computations on geographically dispersed compute clusters using semantic names. We assign semantic names to computations to match requests with named K8s service endpoints. We show that this approach provides multiple benefits. First, it makes the placement of computational jobs independent of location, enabling any cluster with sufficient resources to execute the computation. Second, it facilitates dynamic compute placement without requiring prior knowledge of cluster locations or predefined configurations.
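To make the idea of name-based placement concrete, the following is a minimal sketch and not the authors' implementation. It assumes hierarchical, slash-separated semantic names and a longest-prefix match against a table of named service endpoints. The name layout (`/compute/align/...`), the `Endpoint` fields, the `REGISTRY` contents, and the `resolve` helper are all hypothetical and chosen only for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Endpoint:
    cluster: str    # hypothetical cluster label
    service: str    # hypothetical K8s service name
    free_cpus: int  # CPUs the cluster currently advertises (assumed metric)


def components(name: str) -> Tuple[str, ...]:
    """Split a slash-separated semantic name into its components."""
    return tuple(part for part in name.split("/") if part)


def resolve(request_name: str, cpus_needed: int,
            registry: Dict[str, List[Endpoint]]) -> Optional[Endpoint]:
    """Return an endpoint for the longest matching name prefix that has
    at least one cluster with enough free CPUs, or None if none fits."""
    req = components(request_name)
    best: Optional[Endpoint] = None
    best_len = -1
    for prefix, endpoints in registry.items():
        pre = components(prefix)
        if req[:len(pre)] != pre:
            continue  # this named service does not cover the request
        candidates = [e for e in endpoints if e.free_cpus >= cpus_needed]
        if candidates and len(pre) > best_len:
            # Among capable clusters, pick the one with the most headroom.
            best = max(candidates, key=lambda e: e.free_cpus)
            best_len = len(pre)
    return best


# Illustrative table only: names, clusters, and capacities are made up.
REGISTRY = {
    "/compute": [Endpoint("cluster-a", "generic-svc", 2)],
    "/compute/align": [
        Endpoint("cluster-b", "align-svc", 8),
        Endpoint("cluster-c", "align-svc", 16),
    ],
}

if __name__ == "__main__":
    # The request names what to compute, not where to compute it.
    print(resolve("/compute/align/sample42", 4, REGISTRY))
```

In this sketch the requester never names a cluster: any cluster that registers an endpoint under a matching name and has enough capacity can be selected, which mirrors the location independence described in the abstract. A real decentralized control plane would distribute this table across organizations instead of holding it in one dictionary.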