🤖 AI Summary
Cloud-native microservice systems face frequent inter-service communication, complex management of heterogeneous resources, and prohibitively high costs for large-scale validation on real systems, all of which hinder algorithm evaluation and system optimization research. To address these challenges, we propose CloudNativeSim, a lightweight simulation framework that jointly models service topology, resource heterogeneity, and elastic scaling policies. It supports dynamic microservice modeling, pluggable scheduling policy interfaces, and QoS-aware closed-loop simulation. Built upon an event-driven engine, it integrates containerized resource abstraction and multi-dimensional QoS monitoring, and it handles a high volume of concurrent requests on commodity hardware while achieving more than 94.5% accuracy in response time in a case study. The case study also shows how the framework quantifies the performance impact of different scaling policies, lowering the barrier to developing and evaluating cloud-native systems.
📝 Abstract
Cloud-native applications are increasingly popular in modern software design. Adopting a microservice-based architecture in these applications is a prevalent strategy that enhances system availability and flexibility. However, cloud-native applications also introduce new challenges, such as frequent inter-service communication and the difficulty of managing heterogeneous codebases and hardware, which make system behavior complex, dynamic, and hard to predict. Furthermore, as applications scale, only a limited number of research teams and enterprises have the resources for large-scale deployment and testing, which impedes progress in the cloud-native domain. To address these challenges, we propose CloudNativeSim, a simulator for cloud-native applications with a microservice-based architecture. CloudNativeSim offers several key benefits: (i) comprehensive and dynamic modeling of cloud-native applications, (ii) an extended simulation framework with new policy interfaces for scheduling cloud-native applications, and (iii) support for customized application scenarios and user feedback based on Quality of Service (QoS) metrics. CloudNativeSim can be easily deployed on standard computers and can handle a high volume of requests and services. Its performance was validated through a case study, which demonstrated more than 94.5% accuracy in terms of response time. The case study further highlights the feasibility of CloudNativeSim by illustrating the effects of various scaling policies.
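To make the idea of studying scaling effects in simulation concrete, the following is a minimal sketch in Python. It is not CloudNativeSim's API or implementation: the function name `mean_response_time`, the first-come-first-served dispatch rule, the fixed service time, and the fixed replica count are all assumptions chosen for illustration. It only shows how a simulator can report a QoS metric (mean response time) for different numbers of service replicas under the same request trace.

```python
import heapq


def mean_response_time(arrivals, service_time, replicas):
    """Toy model (hypothetical, not CloudNativeSim): requests are served
    first-come-first-served by `replicas` identical service instances,
    each taking `service_time` per request. Returns the mean response time."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not arrivals:
        raise ValueError("arrivals must not be empty")

    # Min-heap holding the time at which each replica next becomes free.
    free_at = [0.0] * replicas
    heapq.heapify(free_at)

    total = 0.0
    for arrival in sorted(arrivals):
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait if every replica is busy
        finish = start + service_time
        heapq.heappush(free_at, finish)
        total += finish - arrival            # queueing delay + service time
    return total / len(arrivals)


if __name__ == "__main__":
    burst = [0.0, 0.0, 0.0, 0.0]  # four requests arriving at the same instant
    for n in (1, 2, 4):
        print(n, mean_response_time(burst, 1.0, n))
```

For a burst of four simultaneous requests with a service time of 1.0, the mean response time falls from 2.5 with one replica to 1.5 with two and 1.0 with four. A real scaling policy would change the replica count during the run in response to monitored QoS metrics, which this sketch leaves out.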