Scalability Assurance in SFC provisioning via Distributed Design for Deep Reinforcement Learning

📅 2024-12-28

🤖 AI Summary

Problem: Centralized deep reinforcement learning (DRL) controllers face scalability bottlenecks in service function chain (SFC) orchestration for large-scale networks, struggling to simultaneously achieve high throughput, low latency, and dynamic scalability.

Method: We propose a hierarchical distributed DRL framework that partitions the network into dynamically adjustable clusters; each cluster hosts lightweight local DRL agents (based on PPO or DQN) for autonomous scheduling, while a global agent coordinates cross-cluster requests. The framework introduces a novel heterogeneous cluster adaptation mechanism supporting model migration, integrating NFV-aware resource allocation with end-to-end latency-aware scheduling.

Contribution/Results: Experiments demonstrate up to a 60% improvement in request acceptance rate over centralized approaches, significant reduction in average end-to-end latency for accepted requests, and superior scalability, robustness, and deployment flexibility.

📝 Abstract

High-quality Service Function Chaining (SFC) provisioning depends on the timely execution of Virtual Network Functions (VNFs) in a defined sequence. Many studies use advanced Deep Reinforcement Learning (DRL) solutions for fast and reliable autonomous SFC provisioning. However, in a large-scale network environment, centralized solutions may struggle to produce efficient outcomes when handling massive demand under stringent End-to-End (E2E) delay constraints. Therefore, this paper proposes a novel distributed SFC provisioning framework in which the network is divided into several clusters. Each cluster has a dedicated local agent with a DRL module that handles SFC provisioning for the demands in that cluster. In addition, a general agent communicates with the local agents to handle requests beyond their capacity. The DRL module of the local agents can be applied to different cluster configurations, regardless of the number of data centers and logical links in each cluster. Simulation results demonstrate that the proposed distributed framework improves the acceptance ratio of service requests by up to 60% compared with the centralized approach while minimizing the E2E delay of accepted requests.
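The two-tier dispatch described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the DRL policy is replaced by a simple greedy feasibility check, and every name and parameter here (`Request`, `LocalAgent`, `GeneralAgent`, `cpu_per_vnf`, `hop_delay_ms`, `inter_cluster_delay_ms`) is a hypothetical stand-in. The escalation rule, in which the general agent offers a rejected request to other clusters at an added delay cost, is likewise an assumption about how "requests beyond their capacity" might be handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    cluster_id: int        # cluster where the request originates
    vnf_chain: List[str]   # VNFs to execute, in order
    cpu_per_vnf: float     # assumed uniform CPU demand per VNF
    max_delay_ms: float    # E2E delay bound


@dataclass
class LocalAgent:
    cluster_id: int
    free_cpu: float
    hop_delay_ms: float    # assumed per-VNF delay inside the cluster

    def try_provision(self, req: Request, delay_budget_ms: float) -> Optional[float]:
        """Greedy stand-in for the DRL policy: accept if CPU and delay fit."""
        need = req.cpu_per_vnf * len(req.vnf_chain)
        delay = self.hop_delay_ms * len(req.vnf_chain)
        if need > self.free_cpu or delay > delay_budget_ms:
            return None
        self.free_cpu -= need
        return delay


@dataclass
class GeneralAgent:
    local_agents: Dict[int, LocalAgent]
    inter_cluster_delay_ms: float  # assumed fixed cost of leaving the home cluster

    def handle(self, req: Request) -> Optional[float]:
        """Return the E2E delay of an accepted request, or None if rejected."""
        # First let the home cluster's local agent try on its own.
        delay = self.local_agents[req.cluster_id].try_provision(req, req.max_delay_ms)
        if delay is not None:
            return delay
        # Escalation (assumed rule): offer the request to other clusters,
        # charging the inter-cluster delay against the budget.
        budget = req.max_delay_ms - self.inter_cluster_delay_ms
        for cid, agent in self.local_agents.items():
            if cid == req.cluster_id or budget <= 0:
                continue
            delay = agent.try_provision(req, budget)
            if delay is not None:
                return delay + self.inter_cluster_delay_ms
        return None


def acceptance_ratio(outcomes: List[Optional[float]]) -> float:
    """Fraction of requests that were accepted (outcome is not None)."""
    return sum(o is not None for o in outcomes) / len(outcomes)
```

In this sketch a request is rejected only when neither its home cluster nor any other cluster can meet both the CPU demand and the delay bound. The acceptance ratio and the delays of the accepted requests are the two quantities the abstract reports.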

Problem

Research questions and friction points this paper is trying to address.

Large-scale Network Services

Deep Reinforcement Learning (DRL)

Task Processing Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed Deep Reinforcement Learning

Task Decomposition

Performance Enhancement
