🤖 AI Summary
Traditional day-to-day dynamic pricing in Tradable Credit Schemes (TCS) struggles to adapt to supply-demand fluctuations and heterogeneous traveler behavior. Method: This work integrates Deep Reinforcement Learning (DRL) into day-to-day road network tolling for TCS, formulating the problem as a discrete-time Markov Decision Process. We propose an adaptive policy framework based on Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), incorporating L2 regularization and action smoothing to suppress policy oscillation, and leveraging transfer learning to enhance cross-scenario generalization. Contribution/Results: Experiments demonstrate that our approach matches Bayesian optimization benchmarks in travel time reduction and social welfare improvement; exhibits strong robustness to variations in network capacity and demand; and significantly reduces training cost and deployment complexity for large-scale networks—thereby advancing TCS from theoretical conception toward practical implementation.
📝 Abstract
Tradable credit schemes (TCS) are an increasingly studied alternative to congestion pricing, given their revenue neutrality and ability to address issues of equity through the initial credit allocation. Modeling TCS to aid future design and implementation poses challenges involving user and market behaviors, demand-supply dynamics, and control mechanisms. In this paper, we focus on the latter and address the day-to-day dynamic tolling problem under TCS, which is formulated as a discrete-time Markov Decision Process and solved using reinforcement learning (RL) algorithms. Our results indicate that RL algorithms achieve travel times and social welfare comparable to the Bayesian optimization benchmark, with generalization across varying capacities and demand levels. We further assess the robustness of RL under different hyperparameters and apply regularization techniques to mitigate action oscillation, yielding practical tolling strategies that are transferable under day-to-day demand and supply variability. Finally, we discuss potential challenges such as scaling to large networks, and show how transfer learning can be leveraged to improve computational efficiency and facilitate the practical deployment of RL-based TCS solutions.
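The action-smoothing and L2 regularization mentioned above can be sketched as a shaped reward for the day-to-day tolling agent. This is a minimal illustrative example, not the paper's actual implementation: the function name, coefficients, and the quadratic form of the penalties are assumptions chosen for clarity.

```python
import numpy as np

def shaped_reward(base_reward, toll_today, toll_yesterday, policy_weights,
                  smooth_coef=0.1, l2_coef=1e-3):
    """Hypothetical reward shaping for day-to-day TCS tolling.

    Penalizes large day-over-day toll changes (action smoothing, to
    suppress policy oscillation) and large policy parameters (L2
    regularization). Coefficients and names are illustrative only.
    """
    smoothing_penalty = smooth_coef * (toll_today - toll_yesterday) ** 2
    l2_penalty = l2_coef * float(np.sum(np.asarray(policy_weights) ** 2))
    return base_reward - smoothing_penalty - l2_penalty
```

In this sketch, an abrupt toll jump between consecutive days lowers the reward, which in a PPO- or SAC-style training loop would discourage oscillating toll trajectories while leaving slow, deliberate adjustments nearly unpenalized.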