🤖 AI Summary
To address delayed network state awareness, high policy correction costs, and the lack of cross-domain semantic understanding in deep reinforcement learning (DRL)-driven service function chain (SFC) orchestration, this paper proposes a synergistic framework integrating DRL with a lightweight language model. Specifically, VNF deployment decisions generated by PPO or SAC are fed into a LoRA-fine-tuned BERT or DistilBERT model to enable real-time natural language reasoning over resource utilization, bottleneck identification, and demand forecasting. This work unifies structured DRL outputs with domain-adapted linguistic understanding, going beyond conventional monitoring and enabling multi-granularity, interpretable, interactive network state interpretation. Experimental results show that fine-tuned BERT achieves a test loss of 0.28 (DistilBERT: 0.36), an inference confidence of 0.83, millisecond-scale QA response latency, and over 92% accuracy in resource bottleneck identification.
📝 Abstract
Efficient Service Function Chain (SFC) provisioning and Virtual Network Function (VNF) placement are critical for enhancing network performance in modern architectures such as Software-Defined Networking (SDN) and Network Function Virtualization (NFV). While Deep Reinforcement Learning (DRL) aids decision-making in dynamic network environments, its reliance on structured inputs and predefined rules limits adaptability in unforeseen scenarios. Additionally, incorrect actions by a DRL agent may require numerous training iterations to correct, potentially reinforcing suboptimal policies and degrading performance. This paper integrates DRL with Language Models (LMs), specifically Bidirectional Encoder Representations from Transformers (BERT) and DistilBERT, to enhance network management. By feeding the final VNF allocations from DRL into the LM, the system can process and respond to queries about SFCs, Data Centers (DCs), and VNFs, enabling real-time insights into resource utilization, bottleneck detection, and future demand planning. The LMs are fine-tuned on our domain-specific dataset using Low-Rank Adaptation (LoRA). Results show that BERT outperforms DistilBERT with a lower test loss (0.28 versus 0.36) and higher confidence (0.83 versus 0.74), though BERT requires approximately 46% more processing time.
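The interface between the two components hinges on turning the DRL agent's structured VNF allocations into text the language model can consume as QA context. A minimal sketch of that serialization step is below; the record schema (`sfc_id`, `vnf`, `dc`, `cpu`) is an illustrative assumption, not the paper's actual data format.

```python
# Sketch: render a DRL agent's final VNF placement decisions as natural-language
# context for a fine-tuned BERT/DistilBERT QA model.
# NOTE: field names below (sfc_id, vnf, dc, cpu) are hypothetical; the paper
# does not specify its exact allocation schema.

def allocation_to_text(allocations):
    """Convert structured VNF placement records into sentences that a
    LoRA-fine-tuned LM can use as context for queries about SFCs,
    DCs, and VNFs (e.g. utilization or bottleneck questions)."""
    sentences = []
    for record in allocations:
        sentences.append(
            f"VNF {record['vnf']} of SFC {record['sfc_id']} is placed on "
            f"{record['dc']} and consumes {record['cpu']} CPU units."
        )
    return " ".join(sentences)

# Example final allocation emitted by the DRL policy (PPO/SAC) after an episode.
demo_allocations = [
    {"sfc_id": 1, "vnf": "firewall", "dc": "DC-2", "cpu": 4},
    {"sfc_id": 1, "vnf": "nat", "dc": "DC-3", "cpu": 2},
]

context = allocation_to_text(demo_allocations)
print(context)
```

A downstream QA call would then pair this `context` string with a user question (e.g. "Which DC hosts the firewall?") as input to the fine-tuned model.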