Smart Transportation Without Neurons -- Fair Metro Network Expansion with Tabular Reinforcement Learning

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the Metro Network Expansion Problem (MNEP), optimizing for both efficiency and social equity while satisfying travel demand. It proposes the first application of tabular reinforcement learning to this domain, reformulating the task as a Non-Markovian Rewards Decision Process (NMRDP) and building social equity criteria into the reward functions. Because it does not rely on deep neural networks, the approach is easier to interpret and cheaper to train. Evaluated on real-world networks in Xi'an and Amsterdam, the method remains competitive with deep reinforcement learning baselines while reducing total training episodes by a factor of 18 and total carbon emissions by a factor of 12 on average, offering an efficient, low-carbon, and interpretable optimization framework for urban transit planning.
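To make the efficiency-versus-equity trade-off concrete, here is a minimal sketch of two reward variants. It is not the paper's formulation: the cell names, the demand values, the group labels, and the choice of a Rawlsian (worst-off group) criterion and a Gini coefficient are all assumptions made for illustration.

```python
# Hypothetical origin-destination demand between four cells.
DEMAND = {("A", "B"): 10, ("C", "D"): 10, ("A", "C"): 5}
# Hypothetical assignment of cells to social groups.
GROUP_OF = {"A": "north", "B": "north", "C": "south", "D": "south"}

def efficiency_reward(line, demand):
    """Total demand between pairs of cells that are both on the line."""
    stations = set(line)
    return sum(v for (a, b), v in demand.items()
               if a in stations and b in stations)

def group_utilities(line, demand, group_of):
    """Share of each group's demand that the line serves."""
    stations = set(line)
    served, total = {}, {}
    for (a, b), v in demand.items():
        is_served = a in stations and b in stations
        for cell in (a, b):
            g = group_of[cell]
            total[g] = total.get(g, 0) + v
            if is_served:
                served[g] = served.get(g, 0) + v
    return {g: served.get(g, 0) / total[g] for g in total if total[g] > 0}

def rawlsian_reward(line, demand, group_of):
    """Equity variant (an assumption): utility of the worst-off group."""
    return min(group_utilities(line, demand, group_of).values())

def gini(values):
    """Gini coefficient of non-negative values; 0 means perfectly equal."""
    values = list(values)
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    diff = sum(abs(x - y) for x in values for y in values)
    return diff / (2 * n * n * mean)

for line in (["A", "B"], ["A", "C"]):
    utilities = group_utilities(line, DEMAND, GROUP_OF)
    print(line,
          "efficiency:", efficiency_reward(line, DEMAND),
          "worst-off share:", rawlsian_reward(line, DEMAND, GROUP_OF),
          "gini:", gini(utilities.values()))
```

In this toy instance the efficiency reward prefers the line through A and B, while the worst-off-group reward prefers the line through A and C, which serves both groups. Swapping one reward function for another leaves the learning procedure unchanged, which is one way to read the claim that the method is modular.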

📝 Abstract

We tackle the Metro Network Expansion Problem (MNEP), a subset of the Transport Network Design Problem (TNDP), which focuses on expanding metro systems to satisfy travel demand. Traditional methods rely on exact and heuristic approaches that require expert-defined constraints to reduce the search space. Recently, deep reinforcement learning (Deep RL) has emerged due to its effectiveness in complex sequential decision-making processes; however, it remains computationally expensive and environmentally costly, and it requires additional engineering to interpret. We show that MNEP instances are small enough that they do not require Deep RL methods. Reformulating the MNEP as a Non-Markovian Rewards Decision Process (NMRDP), we use tabular RL to achieve similar performance with significantly fewer training episodes and greater interpretability. We also incorporate social equity criteria into the reward functions, focusing on efficiency and fairness, which highlights the versatility of our method. Evaluated in two real-world settings, Xi'an and Amsterdam, our method reduces total episodes by a factor of 18 and total carbon emissions by a factor of 12 on average, while remaining competitive with Deep RL. This approach offers a replicable, modular, interpretable, and resource-efficient solution with potential applications to other combinatorial optimization problems.
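The idea of pairing a tabular method with a non-Markovian reward can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' algorithm: the 3x3 grid, the `POP` activity values, the gravity-style demand (product of two cells' values), the fixed start cell, and all hyperparameters are hypothetical. The reward for adding a station depends on every station already placed, so the sketch keys the Q-table on the full station history, which turns the history itself into the state.

```python
import random
from collections import defaultdict
from itertools import combinations

# Toy instance: every value below is an illustrative assumption.
LINE_LENGTH = 4               # stations per line
START = (0, 0)                # fixed first station
POP = {                       # hypothetical activity level per grid cell
    (0, 0): 1, (0, 1): 2, (0, 2): 1,
    (1, 0): 1, (1, 1): 5, (1, 2): 3,
    (2, 0): 1, (2, 1): 2, (2, 2): 4,
}

def actions(history):
    """Unvisited grid neighbours of the most recently placed station."""
    r, c = history[-1]
    moves = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return [m for m in moves if m in POP and m not in history]

def reward(history, new_cell):
    """Demand newly served by new_cell. It depends on all earlier
    stations, so it is non-Markovian in the current cell alone."""
    return sum(POP[old] * POP[new_cell] for old in history)

def line_return(line):
    """Total demand served by a finished line (sum over station pairs)."""
    return sum(POP[a] * POP[b] for a, b in combinations(line, 2))

def train(episodes=3000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with the station history as the state."""
    rng = random.Random(seed)
    q = defaultdict(float)    # keys: (history tuple, next cell)
    for _ in range(episodes):
        history = (START,)
        while len(history) < LINE_LENGTH:
            acts = actions(history)
            if not acts:
                break
            if rng.random() < epsilon:
                a = rng.choice(acts)                         # explore
            else:
                a = max(acts, key=lambda x: q[(history, x)])  # exploit
            nxt = history + (a,)
            future = 0.0
            if len(nxt) < LINE_LENGTH:
                future = max((q[(nxt, b)] for b in actions(nxt)), default=0.0)
            target = reward(history, a) + gamma * future
            q[(history, a)] += alpha * (target - q[(history, a)])
            history = nxt
    return q

def greedy_line(q):
    """Read a line off the table by always taking the best-valued action."""
    history = (START,)
    while len(history) < LINE_LENGTH:
        acts = actions(history)
        if not acts:
            break
        history += (max(acts, key=lambda x: q[(history, x)]),)
    return history

def all_lines(history=(START,)):
    """Enumerate every feasible line; possible only because the toy is tiny."""
    if len(history) == LINE_LENGTH:
        yield history
        return
    for a in actions(history):
        yield from all_lines(history + (a,))

q_table = train()
line = greedy_line(q_table)
print("learned line:", line, "return:", line_return(line))
print("table entries:", len(q_table))
```

Because the table is an ordinary dictionary, each entry can be inspected directly to see how the learner values a given extension of a given partial line, which is the kind of interpretability a tabular method offers. The `all_lines` helper is included only to compare the learned line against exhaustive search on this toy instance.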

Problem

Research questions and friction points this paper is trying to address.

Metro Network Expansion Problem

Transport Network Design Problem

social equity

fairness

combinatorial optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tabular Reinforcement Learning

Metro Network Expansion

Non-Markovian Rewards Decision Process