A Hybrid Reactive-Proactive Auto-scaling Algorithm for SLA-Constrained Edge Computing

📅 2025-12-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address delayed scaling response, inaccurate prediction, and complex configuration when auto-scaling microservices under stringent SLAs (e.g., low latency, high reliability) in edge computing, this paper proposes a hybrid elasticity mechanism that integrates time-series forecasting with real-time feedback. The novel dual-mode framework combines reactive and proactive scaling and embeds a lightweight machine-learning demand prediction model in the native Kubernetes control loop, enabling dynamic threshold adjustment and SLA-driven adaptive scaling. Evaluation on a real-world edge testbed shows that the approach reduces the SLA violation rate to 6%, compared with up to 23% for existing solutions (a relative reduction of more than 70%); it also improves end-to-end latency guarantees and system availability across multiple IoT applications.

📝 Abstract

Edge computing decentralizes computing resources, allowing for novel applications in domains such as the Internet of Things (IoT) in healthcare and agriculture by reducing latency and improving performance. This decentralization is achieved through the implementation of microservice architectures, which require low latencies to meet stringent service level agreements (SLAs) covering performance, reliability, and availability metrics. While cloud computing offers the large data storage and computation resources necessary to handle peak demands, a hybrid cloud and edge environment is required to ensure SLA compliance. This is achieved with orchestration platforms such as Kubernetes, which help facilitate resource management. Orchestration alone does not guarantee SLA adherence, because scaling resources takes time. Existing auto-scaling algorithms have been proposed to address these challenges, but they suffer from performance issues and configuration complexity. In this paper, a novel auto-scaling algorithm is proposed for SLA-constrained edge computing applications. This approach combines a Machine Learning (ML) based proactive auto-scaling algorithm, which predicts incoming resource requests to forecast demand, with a reactive auto-scaler that considers current resource utilization and SLA constraints for immediate adjustments. The algorithm is integrated into Kubernetes as an extension, and its performance is evaluated through extensive experiments in an edge environment with real applications. The results demonstrate that existing solutions have an SLA violation rate of up to 23%, whereas the proposed hybrid solution outperforms the baselines with an SLA violation rate of only 6%, ensuring stable SLA compliance across various applications.
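
As a quick check on how the two reported violation rates relate, the relative reduction can be worked out as follows. This assumes the "improvement" is measured against the worst reported baseline rate of 23%; the page does not state which baseline the comparison uses.

```latex
\[
\text{relative reduction}
  = \frac{r_{\text{baseline}} - r_{\text{hybrid}}}{r_{\text{baseline}}}
  = \frac{0.23 - 0.06}{0.23}
  = \frac{17}{23}
  \approx 0.739
\]
```

Under that assumption, the drop from 23% to 6% is roughly a 74% relative reduction (17 percentage points in absolute terms), which is consistent with the ">70%" figure in the summary.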

Problem

Research questions and friction points this paper is trying to address.

Develops a hybrid auto-scaling algorithm for SLA-constrained edge computing

Combines proactive ML prediction with reactive adjustments to reduce SLA violations

Integrates into Kubernetes to manage resources in hybrid cloud-edge environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid reactive-proactive auto-scaling algorithm for SLA compliance

An ML model predicts demand, while a reactive component adjusts resources

Integrated into Kubernetes; reduces the SLA violation rate to 6%
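
The points above can be illustrated with a minimal sketch of how a proactive forecast and a reactive check might be merged into one replica decision. This is not the paper's algorithm: the moving-average forecaster stands in for the ML model, taking the maximum of the two recommendations is an assumed combination rule, and every name and constant (`ScalerConfig`, `capacity_per_replica`, `sla_latency_ms`, and so on) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalerConfig:
    # All values are illustrative assumptions, not taken from the paper.
    capacity_per_replica: float = 100.0  # requests/s one replica is assumed to handle
    sla_latency_ms: float = 200.0        # assumed SLA latency bound
    min_replicas: int = 1
    max_replicas: int = 20


def forecast_demand(history: list[float], window: int = 3) -> float:
    """Placeholder forecaster: moving average plus the recent trend.

    The paper uses an ML model here; this stand-in only keeps the sketch runnable.
    """
    if not history:
        return 0.0
    recent = history[-window:]
    average = sum(recent) / len(recent)
    trend = recent[-1] - recent[0] if len(recent) > 1 else 0.0
    return max(0.0, average + trend)


def proactive_replicas(history: list[float], cfg: ScalerConfig) -> int:
    """Replicas needed to serve the forecast request rate."""
    return math.ceil(forecast_demand(history) / cfg.capacity_per_replica)


def reactive_replicas(current_replicas: int, current_rps: float,
                      observed_latency_ms: float, cfg: ScalerConfig) -> int:
    """Replicas needed for the current load, plus one if the SLA is being violated."""
    needed = math.ceil(current_rps / cfg.capacity_per_replica)
    if observed_latency_ms > cfg.sla_latency_ms:
        needed = max(needed, current_replicas + 1)
    return needed


def hybrid_replicas(history: list[float], current_replicas: int,
                    observed_latency_ms: float, cfg: ScalerConfig) -> int:
    """Take the larger recommendation (an assumed rule), clamped to configured bounds."""
    current_rps = history[-1] if history else 0.0
    desired = max(
        proactive_replicas(history, cfg),
        reactive_replicas(current_replicas, current_rps, observed_latency_ms, cfg),
    )
    return min(cfg.max_replicas, max(cfg.min_replicas, desired))
```

With rising load such as `hybrid_replicas([100, 200, 300], 3, 150.0, ScalerConfig())`, the forecast term drives the decision; with flat load but latency above the assumed bound, the reactive term adds a replica instead. In a Kubernetes setting, a function like this would sit inside a custom controller or scaler extension, which is outside the scope of this sketch.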

🔎 Similar Papers

Online SLA Decomposition: Enabling Real-Time Adaptation to Evolving Network Systems