Intelligent Optimization of Wireless Access Point Deployment for Communication-Based Train Control Systems Using Deep Reinforcement Learning

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of high measurement costs, low modeling accuracy, and poor optimization efficiency in tunnel access point (AP) deployment for urban rail transit’s Communications-Based Train Control (CBTC) systems, this paper proposes an intelligent optimization framework integrating physics-informed modeling with deep reinforcement learning. Specifically, it synergistically combines parabolic wave equation (PWE)-based channel modeling with conditional generative adversarial network (cGAN)-enabled data augmentation to construct a high-fidelity simulation environment. Building upon this, a Dueling Deep Q-Network (Dueling DQN)-based deployment policy learning algorithm is designed to balance physical interpretability and search efficiency. Experimental results demonstrate that the proposed method significantly outperforms conventional empirical optimization and purely data-driven approaches in terms of average received signal power, worst-case coverage performance, and computational overhead—yielding a highly reliable, cost-effective, and scalable AP deployment solution.

📝 Abstract
Urban railway systems increasingly rely on communications-based train control (CBTC), where optimal deployment of access points (APs) in tunnels is critical for robust wireless coverage. Traditional methods, such as empirical model-based optimization algorithms, are hindered by excessive measurement requirements and suboptimal solutions, while machine learning (ML) approaches often struggle with complex tunnel environments. This paper proposes a deep reinforcement learning (DRL)-driven framework that integrates parabolic wave equation (PWE) channel modeling, conditional generative adversarial network (cGAN)-based data augmentation, and a dueling deep Q-network (Dueling DQN) for AP placement optimization. The PWE method generates high-fidelity path-loss distributions for a subset of AP positions, which the cGAN then expands into high-resolution path-loss maps for all candidate positions, significantly reducing simulation cost while maintaining physical accuracy. In the DRL framework, the state space captures AP positions and coverage, the action space defines AP adjustments, and the reward function encourages signal improvement while penalizing deployment cost. The Dueling DQN accelerates convergence and improves the exploration-exploitation balance, increasing the likelihood of reaching optimal configurations. Comparative experiments show that the proposed method outperforms a conventional Hooke-Jeeves optimizer and a standard DQN, delivering AP configurations with higher average received power, better worst-case coverage, and improved computational efficiency. This work integrates high-fidelity electromagnetic simulation, generative modeling, and AI-driven optimization, offering a scalable and data-efficient solution for next-generation CBTC systems in complex tunnel environments.
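The dueling architecture mentioned in the abstract splits the Q-function into a state-value stream V(s) and an advantage stream A(s, a), recombined as Q(s, a) = V(s) + (A(s, a) − mean over a' of A(s, a')). A minimal NumPy sketch of that aggregation step follows; the layer shapes, weights, and action count are illustrative assumptions, not values from the paper:

```python
import numpy as np

def dueling_q_values(features, w_v, w_a):
    """Combine value and advantage streams into Q-values.

    Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a'))
    Subtracting the mean advantage keeps the V/A decomposition identifiable.
    """
    v = features @ w_v          # scalar state value V(s), shape (1,)
    a = features @ w_a          # per-action advantages A(s, a), shape (n_actions,)
    return v + (a - a.mean())   # broadcast V(s) over the action dimension

# Toy example: 4 state features, 3 candidate AP-adjustment actions
rng = np.random.default_rng(0)
features = rng.standard_normal(4)
w_v = rng.standard_normal((4, 1))
w_a = rng.standard_normal((4, 3))
q = dueling_q_values(features, w_v, w_a)
```

A side effect of the mean-subtraction is that the average of the Q-values always equals V(s), which is one way the two streams stay disentangled during training.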
Problem

Research questions and friction points this paper is trying to address.

Optimizing wireless AP deployment in CBTC tunnel systems
Overcoming limitations of traditional AP placement methods
Enhancing coverage and efficiency using deep reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning optimizes wireless access point placement
Conditional GAN generates high-resolution path loss maps efficiently
Dueling DQN enhances convergence speed and solution quality
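The reward design described above (encourage signal improvement, penalize deployment cost) can be illustrated with a toy sketch. The weighting coefficients, the per-adjustment cost model, and the dBm sample values below are illustrative assumptions, not quantities reported in the paper:

```python
def deployment_reward(old_coverage, new_coverage, moved_aps,
                      gain_weight=1.0, cost_weight=0.1):
    """Toy reward: reward the improvement in average received power
    across sample points, charge a per-adjustment cost for moving APs."""
    avg_gain = (sum(new_coverage) / len(new_coverage)
                - sum(old_coverage) / len(old_coverage))
    return gain_weight * avg_gain - cost_weight * moved_aps

# Example: average received power (dBm) at three sample points
# before and after relocating one AP
r = deployment_reward([-80.0, -75.0, -90.0],
                      [-78.0, -74.0, -85.0],
                      moved_aps=1)
```

Shaping the reward this way lets the agent trade coverage gains against the cost of reconfiguring APs rather than maximizing signal strength unconditionally.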
Kunyu Wu
School of Electronics and Information Engineering, Sichuan University, Chengdu, 610017, China
Qiushi Zhao
School of Electronics and Information Engineering, Sichuan University, Chengdu, 610017, China
Zihan Feng
School of Electronics and Information Engineering, Sichuan University, Chengdu, 610017, China
Yunxi Mu
College of Engineering, Peking University, Beijing, China
Hao Qin
University of Arizona
Xinyu Zhang
School of Electrical and Electronic Engineering, University College Dublin, Ireland
Xingqi Zhang
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 1H9, Canada, and also with the School of Electrical and Electronic Engineering, University College Dublin, Ireland