DRL-Enabled Trajectory Planning for UAV-Assisted VLC: Optimal Altitude and Reward Design

📅 2026-01-30
🤖 AI Summary
This work addresses three-dimensional trajectory planning in unmanned aerial vehicle (UAV)-assisted visible light communication (VLC) systems, where a UAV collects data from ground users. The task is formulated as a mixed-integer non-convex optimization problem and solved in two stages: a closed-form optimal flight altitude is derived for a prescribed VLC channel gain threshold, and the horizontal trajectory is then optimized with the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, guided by a pheromone-driven reward mechanism. In simulations, the derived optimal altitude reduces flight distance by up to 35% compared to baseline methods, and the proposed reward mechanism shortens the convergence steps by approximately 50%.

📝 Abstract
Recently, the integration of unmanned aerial vehicle (UAV) and visible light communication (VLC) technologies has emerged as a promising solution for flexible communication and efficient lighting. This letter investigates three-dimensional trajectory planning in a UAV-assisted VLC system, where a UAV is dispatched to collect data from ground users (GUs). The core objective is to develop a trajectory planning framework that minimizes the UAV flight distance, which is equivalent to maximizing the data collection efficiency. This problem is formulated as a challenging mixed-integer non-convex optimization problem. To tackle it, we first derive a closed-form optimal flight altitude under a specific VLC channel gain threshold. Subsequently, we optimize the UAV horizontal trajectory by integrating a novel pheromone-driven reward mechanism with the twin delayed deep deterministic policy gradient (TD3) algorithm, which enables an adaptive UAV motion strategy in complex environments. Simulation results show that the derived optimal altitude reduces the flight distance by up to 35% compared to baseline methods. Additionally, the proposed reward mechanism shortens the convergence steps by approximately 50%, demonstrating notable efficiency gains in UAV-assisted VLC data collection.
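
The abstract does not state the altitude expression itself. As an illustration only, the sketch below assumes the standard line-of-sight Lambertian VLC model with a downward-facing LED and an upward-facing receiver, so that the gain at altitude H and horizontal offset r is h = C * H^(m+1) / (H^2 + r^2)^((m+3)/2), and it ignores the receiver field-of-view limit. The lumped constant C, the Lambertian order m, the threshold h_th, and both function names are assumptions introduced here, not taken from the letter. Under these assumptions, the altitude that maximizes the ground radius over which h >= h_th has a closed form, H* = ((m+1)/(m+3))^((m+3)/4) * sqrt(C/h_th), which the code compares against a grid search.

```python
import math


def coverage_radius(H, m, C, h_th):
    """Largest horizontal offset r with gain >= h_th at altitude H (0 if none).

    Assumed model (not from the letter): h = C * H**(m+1) / (H**2 + r**2)**((m+3)/2).
    """
    # Squared slant range at which the gain equals the threshold.
    d2 = (C * H ** (m + 1) / h_th) ** (2.0 / (m + 3))
    return math.sqrt(max(d2 - H * H, 0.0))


def optimal_altitude(m, C, h_th):
    """Altitude maximizing coverage_radius under the assumed model.

    Obtained by setting d(r^2)/dH = 0 in the expression above.
    """
    return ((m + 1) / (m + 3)) ** ((m + 3) / 4.0) * math.sqrt(C / h_th)


# Hypothetical parameters: m = 1 corresponds to a 60-degree LED half-power angle.
m, C, h_th = 1, 1e-4, 1e-6
H_star = optimal_altitude(m, C, h_th)

# Numerical sanity check: brute-force search over candidate altitudes.
grid = [0.01 * k for k in range(1, 1001)]
H_grid = max(grid, key=lambda H: coverage_radius(H, m, C, h_th))

print(H_star, H_grid, coverage_radius(H_star, m, C, h_th))
```

A larger coverage radius lets the UAV serve each GU from farther away horizontally, which is one plausible reason a well-chosen altitude shortens the flight path. The letter's actual derivation may use a different channel model or constraint set.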
Problem

Research questions and friction points this paper is trying to address.

UAV trajectory planning
visible light communication
data collection efficiency
flight distance minimization
3D path optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

optimal altitude
pheromone-driven reward
trajectory planning
UAV-assisted VLC
deep reinforcement learning
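
The page does not describe how the pheromone-driven reward is computed. The following is a minimal sketch of one possible pheromone-inspired shaping term, not the authors' design: the UAV deposits pheromone on each grid cell it visits, the pheromone evaporates every step, and re-entering a marked cell is penalized. The class name, grid cell size, evaporation rate, deposit amount, penalty weight, and collection bonus are all hypothetical. The returned scalar could serve as the per-step reward for an off-policy agent such as TD3.

```python
import math


class PheromoneReward:
    """Hypothetical reward shaper: visited cells hold pheromone that repels revisits."""

    def __init__(self, cell=1.0, evaporation=0.05, deposit=1.0,
                 weight=0.5, collect_bonus=10.0):
        self.cell = cell                    # grid cell edge length (assumed)
        self.evaporation = evaporation      # fraction lost per step (assumed)
        self.deposit = deposit              # pheromone added per visit (assumed)
        self.weight = weight                # penalty per unit pheromone (assumed)
        self.collect_bonus = collect_bonus  # reward per newly served GU (assumed)
        self.tau = {}                       # (i, j) -> pheromone level

    def step(self, prev_xy, new_xy, newly_collected=0):
        # Evaporate pheromone everywhere before scoring the move.
        self.tau = {k: v * (1.0 - self.evaporation) for k, v in self.tau.items()}
        key = (math.floor(new_xy[0] / self.cell), math.floor(new_xy[1] / self.cell))
        # Penalize entering a cell that still carries pheromone.
        penalty = self.weight * self.tau.get(key, 0.0)
        # Mark the cell as visited.
        self.tau[key] = self.tau.get(key, 0.0) + self.deposit
        # Flight distance is charged directly, matching the distance-minimizing goal.
        dist = math.dist(prev_xy, new_xy)
        return self.collect_bonus * newly_collected - dist - penalty


shaper = PheromoneReward()
r_first = shaper.step((0.0, 0.0), (1.0, 0.0))    # fresh cell: only the distance cost
r_hover = shaper.step((1.0, 0.0), (1.0, 0.0))    # same cell: pheromone penalty applies
r_collect = shaper.step((1.0, 0.0), (5.0, 0.0), newly_collected=1)
print(r_first, r_hover, r_collect)
```

A dense signal of this kind gives the agent feedback on every step rather than only when a GU is served, which is the usual motivation for reward shaping; whether this matches the mechanism behind the reported convergence speed-up is not something the page confirms.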
Tian Lin
Google DeepMind
Machine Learning, Deep Learning, Online Learning, Combinatorial Optimization, Social Networks
Yi Liu
Department of Computer Science, City University of Hong Kong
Security and Privacy, Federated Learning, AI Security
Xiao-Wei Tang
Department of Information and Communication Engineering, Tongji University, Shanghai, China
Yunmei Shi
Department of Information and Communication Engineering, Tongji University, Shanghai, China
Yi Huang
Department of Information and Communication Engineering, Tongji University, Shanghai, China
Zhongxiang Wei
Department of Information and Communication Engineering, Tongji University, Shanghai, China
Qingqing Wu
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Yuhan Dong
Associate Professor, Tsinghua Shenzhen International Graduate School
Optical wireless communications, Machine learning and optimization