AI Summary
This paper addresses the joint optimization of passenger matching, vehicle rebalancing, and charging allocation for large-scale electric autonomous taxi (robo-taxi) fleets operating under stochastic demand. Formulated as an average-reward infinite-horizon Markov decision process (MDP), the problem suffers from exponential growth of the state and action spaces with fleet size. To tackle this, we propose an atomic action decomposition mechanism that drastically reduces the policy search space. Furthermore, we adapt the Proximal Policy Optimization (PPO) algorithm, applying it for the first time to coordinated fleet-wide decision-making and dynamic charging resource allocation. Extensive simulations on real-world New York City ride-hailing data demonstrate that our method achieves a long-term average revenue significantly closer to the fluid upper bound than the baselines. Quantitative analysis reveals critical trade-offs: vehicle battery range and charger power capacity exert substantial influence on system throughput and deadheading rate.
Abstract
Pioneering companies such as Waymo have deployed robo-taxi services in several U.S. cities. These robo-taxis are electric vehicles, and their operation requires the joint optimization of ride matching, vehicle repositioning, and charging scheduling in a stochastic environment. We model the operations of a ride-hailing system with robo-taxis as a discrete-time, average-reward Markov Decision Process with infinite horizon. As the fleet size grows, dispatching becomes challenging because the system state space and the fleet dispatching action space grow exponentially with the number of vehicles. To address this, we introduce a scalable deep reinforcement learning algorithm, called Atomic Proximal Policy Optimization (Atomic-PPO), that reduces the action space using atomic action decomposition. We evaluate our algorithm on real-world NYC for-hire vehicle data, measuring performance as the long-run average reward achieved by the dispatching policy relative to a fluid-based reward upper bound. Our experiments demonstrate the superior performance of Atomic-PPO compared to benchmarks. Furthermore, we conduct extensive numerical experiments to analyze the efficient allocation of charging facilities and to assess the impact of vehicle range and charger speed on fleet performance.
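To make the scalability argument concrete, the sketch below contrasts the size of a joint fleet action space with the per-step output space under an atomic decomposition, where the policy assigns one vehicle at a time. All numbers and names here are illustrative assumptions, not values from the paper.

```python
# Illustrative back-of-the-envelope comparison (assumed numbers, not from
# the paper): a joint policy must rank every combination of per-vehicle
# actions, while an atomically decomposed policy emits one small action
# distribution per vehicle assignment step.

n_vehicles = 100   # assumed fleet size
n_atomic = 12      # assumed per-vehicle choices: match a trip, reposition,
                   # or head to a charger

# Joint formulation: the action set is the Cartesian product over vehicles.
joint_actions = n_atomic ** n_vehicles

# Atomic decomposition: each decision step chooses among only n_atomic
# options, so the policy network's output layer stays fixed-size even as
# the fleet grows (it is invoked once per vehicle per time step).
atomic_actions = n_atomic

print(f"joint action space size:  {joint_actions:.3e}")
print(f"atomic action space size: {atomic_actions}")
print(f"decision steps per epoch: {n_vehicles}")
```

Under these assumed numbers, the joint space is on the order of 10^107 while the atomic policy only ever scores 12 options at a time, trading one exponentially large decision for a linear number of small ones.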