🤖 AI Summary
For the Virtual Network Embedding with Alternative Topologies (VNEAP) problem—where each virtual network request can be instantiated with one of several functionally equivalent topologies in a time-varying environment—this paper proposes HRL-VNEAP, a hierarchical reinforcement learning framework. The high-level policy selects the most suitable topology or rejects the request, while the low-level policy performs resource mapping onto the substrate network. This work is the first to introduce hierarchical RL to VNEAP, decoupling topology selection from resource allocation to improve decision efficiency and long-term cumulative reward. The framework integrates state encoding, attention mechanisms, and multi-step reward shaping to support real-time embedding. Extensive experiments on realistic topologies show that HRL-VNEAP outperforms the strongest baseline by up to 20.7% in request acceptance ratio, 36.2% in total revenue, and 22.1% in revenue-to-cost ratio, closely approaching the MILP-optimal solution on tractable instances.
📝 Abstract
Virtual Network Embedding (VNE) is a key enabler of network slicing, yet most formulations assume that each Virtual Network Request (VNR) has a fixed topology. Recently, VNE with Alternative Topologies (VNEAP) was introduced to capture malleable VNRs, where each request can be instantiated using one of several functionally equivalent topologies that trade resources differently. While this flexibility enlarges the feasible space, it also introduces an additional decision layer, making dynamic embedding more challenging. This paper proposes HRL-VNEAP, a hierarchical reinforcement learning approach for VNEAP under dynamic arrivals. A high-level policy selects the most suitable alternative topology (or rejects the request), and a low-level policy embeds the chosen topology onto the substrate network. Experiments on realistic substrate topologies under multiple traffic loads show that naive exploitation strategies provide only modest gains, whereas HRL-VNEAP consistently achieves the best performance across all metrics. Compared to the strongest tested baselines, HRL-VNEAP improves acceptance ratio by up to **20.7%**, total revenue by up to **36.2%**, and revenue-over-cost by up to **22.1%**. Finally, we benchmark against an MILP formulation on tractable instances to quantify the remaining gap to optimality and motivate future work on learning- and optimization-based VNEAP solutions.
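To make the two-level decision structure concrete, here is a minimal sketch of the hierarchical flow the abstract describes: a high-level step that picks an alternative topology (or rejects), and a low-level step that maps its virtual nodes onto substrate nodes. This is an illustrative toy, not the paper's method—the heuristic scores (smallest total CPU demand, greedy best-fit node mapping) and all names (`high_level_select`, `low_level_embed`, the `cpu` field) are assumptions for the example; the paper instead learns both policies with reinforcement learning.

```python
def high_level_select(topologies, substrate_cpu):
    """High-level decision (toy stand-in for the learned policy):
    choose the alternative topology with the smallest total CPU demand
    that fits the remaining substrate capacity; None means reject."""
    total_free = sum(substrate_cpu.values())
    feasible = [t for t in topologies if sum(t["cpu"]) <= total_free]
    if not feasible:
        return None  # reject the VNR
    return min(feasible, key=lambda t: sum(t["cpu"]))


def low_level_embed(topology, substrate_cpu):
    """Low-level decision (toy stand-in for the learned policy):
    greedily map each virtual node to the distinct substrate node with
    the most spare CPU; returns the node mapping, or None on failure."""
    free = dict(substrate_cpu)
    mapping = {}
    for v, demand in enumerate(topology["cpu"]):
        # Candidates: substrate nodes not yet used by this request.
        candidates = {n: c for n, c in free.items() if n not in mapping.values()}
        if not candidates:
            return None
        host = max(candidates, key=candidates.get)
        if free[host] < demand:
            return None
        free[host] -= demand
        mapping[v] = host
    return mapping


# Tiny example: two alternative topologies for one request.
substrate = {"A": 10, "B": 8}
alternatives = [
    {"name": "star", "cpu": [4, 4, 4]},   # 3 virtual nodes
    {"name": "chain", "cpu": [3, 3]},     # 2 virtual nodes
]

chosen = high_level_select(alternatives, substrate)
embedding = low_level_embed(chosen, substrate) if chosen else None
print(chosen["name"], embedding)  # the cheaper alternative is embedded
```

In HRL-VNEAP both functions would be neural policies trained for long-term reward; the sketch only shows how topology selection and resource mapping are decoupled into two sequential decisions.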