🤖 AI Summary
Traditional resource allocation methods in heterogeneous wireless networks (HetNets) suffer from poor adaptability to dynamic user load and time-varying channel conditions. To address this, this paper proposes a deep reinforcement learning (DRL) framework that jointly optimizes transmit power, bandwidth allocation, and user scheduling. A multi-objective reward function is designed to simultaneously maximize throughput, energy efficiency, and fairness. The framework comparatively evaluates Proximal Policy Optimization (PPO) and Twin Delayed Deep Deterministic Policy Gradient (TD3) under realistic base station deployments and benchmarks them against three classical heuristic approaches. Experimental results demonstrate that the proposed DRL framework significantly improves resource utilization and overall network performance across diverse dynamic scenarios. It further reveals critical trade-offs among algorithm choice, reward-function design, and generalization across environments. This work provides a scalable, end-to-end solution for intelligent resource management in HetNets.
📝 Abstract
Dynamic resource allocation in heterogeneous wireless networks (HetNets) is challenging for traditional methods under varying user loads and channel conditions. We propose a deep reinforcement learning (DRL) framework that jointly optimizes transmit power, bandwidth, and scheduling via a multi-objective reward balancing throughput, energy efficiency, and fairness. Using real base station coordinates, we compare Proximal Policy Optimization (PPO) and Twin Delayed Deep Deterministic Policy Gradient (TD3) against three heuristic algorithms in multiple network scenarios. Our results show that DRL frameworks outperform heuristic algorithms in optimizing resource allocation in dynamic networks. These findings highlight key trade-offs in DRL design for future HetNets.
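To make the multi-objective reward concrete, the sketch below shows one common way to combine the three objectives named in the abstract: a weighted sum of aggregate throughput, energy efficiency (bits per Joule), and Jain's fairness index. The weights `w_tp`, `w_ee`, `w_f` and the use of Jain's index are illustrative assumptions, not the paper's actual reward, whose exact form is not given here.

```python
import numpy as np

def jains_fairness(rates):
    """Jain's fairness index over per-user rates: 1/N <= J <= 1 (J = 1 when all equal)."""
    rates = np.asarray(rates, dtype=float)
    return rates.sum() ** 2 / (len(rates) * (rates ** 2).sum())

def multi_objective_reward(rates_bps, power_w, w_tp=1.0, w_ee=1.0, w_f=1.0):
    """Illustrative weighted-sum reward: throughput + energy efficiency + fairness.

    rates_bps : per-user achieved rates in bits/s
    power_w   : total transmit power in watts
    The weights are hypothetical tuning parameters; in practice each term is
    typically normalized so no single objective dominates.
    """
    rates = np.asarray(rates_bps, dtype=float)
    throughput = rates.sum()                        # bits/s
    energy_eff = throughput / max(power_w, 1e-9)    # bits/Joule
    fairness = jains_fairness(rates)                # dimensionless, in (0, 1]
    return w_tp * throughput + w_ee * energy_eff + w_f * fairness
```

In a DRL loop, this scalar would be returned at each environment step; how the three terms are scaled relative to one another is itself one of the reward-design trade-offs the paper examines.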