🤖 AI Summary
In the dynamic edge-cloud-space continuum—particularly one involving satellites and mobile devices—client selection for federated learning must simultaneously ensure real-time responsiveness, accommodate GPU heterogeneity (e.g., NVIDIA vs. AMD), and provide stringent service-level objective (SLO) guarantees. This paper proposes a lightweight client selection framework that requires neither historical data nor continuous monitoring. It is the first to jointly integrate analytical modeling and probabilistic prediction to explicitly characterize the latency-energy-accuracy trade-offs, and their uncertainties, of GPU-accelerated training. The framework enables adaptive selection across architectures and workloads under user-specified SLO constraints. Experiments show that, compared to baseline methods, the approach improves SLO compliance by 13.77% on average and reduces computational resource waste by 72.5%.
📝 Abstract
The integration of edge, cloud, and space devices into a unified 3D continuum poses significant challenges for client selection in federated learning systems. Traditional approaches rely on continuous monitoring and historical data collection, which becomes impractical in dynamic environments where satellites and mobile devices frequently change operational conditions. Furthermore, existing solutions primarily consider CPU-based computation, failing to capture the complex characteristics of the GPU-accelerated training that is prevalent across the 3D continuum. This paper introduces ProbSelect, a novel approach that uses analytical modeling and probabilistic forecasting for client selection on GPU-accelerated devices, without requiring historical data or continuous monitoring. We model client selection within user-defined SLOs. Extensive evaluation across diverse GPU architectures and workloads demonstrates that ProbSelect improves SLO compliance by 13.77% on average while reducing computational waste by 72.5% compared to baseline approaches.
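To make the idea of probabilistic, SLO-driven client selection concrete, here is a minimal sketch. It is not ProbSelect's actual model (the paper's analytical and forecasting details are not given here); it simply assumes each client's predicted training latency follows a Gaussian distribution and admits a client only if its probability of meeting the latency SLO clears a threshold. All client names and numbers are hypothetical.

```python
import math

def compliance_prob(mean_s: float, std_s: float, slo_s: float) -> float:
    """P(latency <= SLO) under an assumed Gaussian latency model,
    computed via the standard normal CDF (using math.erf)."""
    z = (slo_s - mean_s) / (std_s * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def select_clients(clients: dict, slo_s: float, min_prob: float = 0.9) -> list:
    """Keep clients whose predicted probability of meeting the latency SLO
    is at least min_prob. `clients` maps id -> (mean, std) in seconds,
    e.g. produced by an analytical model of the device's GPU."""
    return sorted(
        cid for cid, (mu, sigma) in clients.items()
        if compliance_prob(mu, sigma, slo_s) >= min_prob
    )

# Hypothetical per-client latency estimates (mean, std) in seconds:
clients = {
    "jetson-orin": (8.0, 1.0),
    "rtx-4090": (2.5, 0.3),
    "mi210": (3.0, 0.8),
    "cubesat-gpu": (14.0, 2.0),
}
print(select_clients(clients, slo_s=10.0, min_prob=0.9))
# The cubesat-gpu client is excluded: its predicted latency makes the
# 10 s SLO unlikely to hold.
```

Selecting on a compliance *probability* rather than a point estimate is what lets such a scheme trade off SLO risk against resource waste: raising `min_prob` excludes marginal clients whose rounds would likely be discarded for missing the deadline.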