🤖 AI Summary
This study addresses three key challenges in traffic signal control (TSC): poor generalization across road networks, limited interpretability, and difficulty coordinating multiple intersections. To this end, we propose a lightweight large language model (LLM) framework that integrates reinforcement learning with human-like reasoning. Methodologically, we train the LLM in simulation through expert-guided self-iteration to obtain a foundation model with zero-shot cross-network transferability, and pair it with a synchronous communication network that enables real-time, coordinated decision-making across intersections. Our key contribution is the first integration of embodied, human-like reasoning into an edge-deployable LLM-based TSC architecture, achieving both high interpretability and real-time responsiveness. In benchmarks, the model outperforms state-of-the-art baselines. Deployed on a real-world urban network serving more than 55,000 drivers daily, it reduces average queue length by 5.2% and cuts operator workload by 50%.
📝 Abstract
Traffic signal control (TSC) is vital for mitigating congestion and sustaining urban mobility. In this paper, we introduce Traffic-R1, a foundation model with human-like reasoning for TSC systems. Our model is developed through self-exploration and iteration of reinforced large language models (LLMs) with expert guidance in a simulated traffic environment. Compared to traditional reinforcement learning (RL) and recent LLM-based methods, Traffic-R1 offers three significant advantages. First, Traffic-R1 delivers zero-shot generalization, transferring unchanged to new road networks and out-of-distribution incidents by drawing on its internal traffic control policies and human-like reasoning. Second, its 3B-parameter architecture is lightweight enough for real-time inference on mobile-class chips, enabling large-scale edge deployment. Third, Traffic-R1 provides an explainable TSC process and facilitates multi-intersection communication through its self-iteration and a new synchronous communication network. Extensive benchmarks demonstrate that Traffic-R1 sets a new state of the art, outperforming strong baselines and training-intensive RL controllers. In practice, the model now manages signals for more than 55,000 drivers daily, shortening average queues by over 5% and halving operator workload. Our checkpoint is available at https://huggingface.co/Season998/Traffic-R1.
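To give a rough sense of why a 3B-parameter model is plausible on edge hardware, the sketch below estimates the memory needed for the weights alone at several numeric precisions. The precisions listed are illustrative assumptions, not settings reported for Traffic-R1, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Rough weight-only memory estimate for a 3B-parameter model.
# The precisions below are illustrative assumptions, not taken from the paper.

PARAMS = 3_000_000_000  # the "3B-parameter" figure from the abstract

# Hypothetical storage formats and their bits per weight.
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions the weights occupy about 6 GB at 16-bit precision and about 1.5 GB at 4-bit precision, which is the range where memory-constrained devices become relevant.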