🤖 AI Summary
To address the high dynamism and stringent SLO requirements of Kubernetes-based serverless workloads, this paper proposes the first predictive autoscaling framework leveraging weakly supervised learning. The method uses weak supervision to automatically classify over 300,000 workload time windows into four canonical patterns, identified here for the first time: periodic, bursty, ramp-up, and steady-noise. It also integrates uncertainty quantification to enable uncertainty-aware scaling decisions. The framework unifies time-series feature extraction, weakly supervised classification, and predictive modeling, and is natively embedded into the Kubernetes controller. Evaluated on real-world Azure Functions traces, it reduces SLO violations by up to 50% and shortens average response time by 40%, at the cost of a 2–8× increase in resource consumption under spike-heavy loads. The result is a substantial performance gain obtained by trading away resource efficiency.
📝 Abstract
High-performance extreme computing (HPEC) platforms increasingly adopt serverless paradigms, yet face challenges in efficiently managing highly dynamic workloads while maintaining service-level objectives (SLOs). We propose **AAPA**, an archetype-aware predictive autoscaling system that leverages weak supervision to automatically classify 300,000+ workload windows into four archetypes (PERIODIC, SPIKE, RAMP, STATIONARY_NOISY) with 99.8% accuracy. Evaluation on publicly available Azure Functions traces shows that AAPA reduces SLO violations by up to 50% and improves response time by 40%, albeit with a 2–8× increase in resource cost under spike-heavy loads.
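
To make the idea of weak supervision concrete, the following is a minimal sketch of how noisy heuristic "labeling functions" could vote on the archetype of a single workload window. It is not AAPA's implementation: the abstract does not describe the actual labeling functions, so every function name (`lf_spike`, `lf_ramp`, `lf_periodic`, `lf_stationary`, `weak_label`), every threshold, and the majority-vote aggregation are assumptions chosen for illustration only.

```python
from collections import Counter
from statistics import mean, pstdev

ABSTAIN = None  # a labeling function returns this when it has no opinion


def lf_spike(w):
    # Hypothetical rule: vote SPIKE when the peak is far above the mean.
    m = mean(w)
    return "SPIKE" if m > 0 and max(w) > 5 * m else ABSTAIN


def lf_ramp(w):
    # Hypothetical rule: vote RAMP when the second half of the window
    # carries more than twice the average load of the first half.
    half = len(w) // 2
    first, second = mean(w[:half]), mean(w[half:])
    return "RAMP" if second > 2 * max(first, 1e-9) else ABSTAIN


def lf_periodic(w, lag):
    # Hypothetical rule: vote PERIODIC when the autocorrelation at the
    # given lag exceeds 0.5.
    m = mean(w)
    var = sum((x - m) ** 2 for x in w)
    if var == 0:
        return ABSTAIN
    acf = sum((w[i] - m) * (w[i + lag] - m) for i in range(len(w) - lag)) / var
    return "PERIODIC" if acf > 0.5 else ABSTAIN


def lf_stationary(w):
    # Hypothetical rule: vote STATIONARY_NOISY when the coefficient of
    # variation (std / mean) is small.
    m = mean(w)
    return "STATIONARY_NOISY" if m > 0 and pstdev(w) / m < 0.3 else ABSTAIN


def weak_label(w, lag=4):
    # Collect the non-abstaining votes and take a simple majority.
    # Ties go to the vote that appears first in the list below.
    votes = [lf_spike(w), lf_ramp(w), lf_periodic(w, lag), lf_stationary(w)]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    print(weak_label([1, 1, 10, 10] * 4))    # repeating pattern with period 4
    print(weak_label(list(range(1, 17))))    # steadily growing load
```

In a typical weak-supervision pipeline, labels produced this way would serve as noisy training targets for a downstream classifier rather than as final predictions. Whether AAPA aggregates votes by majority or with a learned label model is not stated in the abstract.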