🤖 AI Summary
To address safety and efficiency challenges at unsignalized intersections, this paper proposes a roadside unit (RSU)-centric cooperative autonomous driving framework. Methodologically, it introduces a novel RSU-centralized two-stage hybrid reinforcement learning architecture: Stage I performs offline pretraining by integrating conservative Q-learning (CQL) with behavior cloning (BC); Stage II fine-tunes the policy via online multi-agent proximal policy optimization (MAPPO) enhanced with self-attention mechanisms to decouple strong inter-vehicle dependencies, leveraging V2I communication and global perception for joint decision-making. Evaluated in CARLA simulations, the framework successfully coordinates complex three-vehicle interactions with a task failure rate below 0.03%, substantially outperforming Autoware. Moreover, it demonstrates robustness across varying vehicle counts and generalizes effectively to unseen intersection maps.
📝 Abstract
Unsignalized intersections pose significant safety and efficiency challenges due to complex traffic flows. This paper proposes a novel roadside unit (RSU)-centric cooperative driving system leveraging global perception and vehicle-to-infrastructure (V2I) communication. The core of the system is an RSU-based decision-making module using a two-stage hybrid reinforcement learning (RL) framework. First, policies are pre-trained offline on a collected dataset using conservative Q-learning (CQL) combined with behavior cloning (BC). Subsequently, these policies are fine-tuned in simulation using multi-agent proximal policy optimization (MAPPO), combined with a self-attention mechanism to handle inter-agent dependencies effectively. RSUs perform real-time inference with the trained models and control the vehicles via V2I communication. Extensive experiments in the CARLA environment demonstrate the effectiveness of the proposed system, which (i) achieves failure rates below 0.03% in coordinating three connected and autonomous vehicles (CAVs) through complex intersection scenarios, significantly outperforming the traditional Autoware control method, and (ii) exhibits strong robustness across varying numbers of controlled agents and shows promising generalization to other maps.
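To make the role of self-attention in handling inter-agent dependencies more concrete, the following is a minimal sketch of scaled dot-product self-attention applied across per-vehicle feature vectors. It is not the paper's implementation: the function names, the feature values, and the use of identity projections in place of learned query/key/value matrices are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over per-vehicle features.

    Assumption: queries, keys, and values are the raw features
    (identity projections); a trained model would use learned matrices.
    Returns (context vectors, attention weights), one row per vehicle.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    contexts, weights = [], []
    for query in features:
        # Similarity of this vehicle to every vehicle, including itself.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in features
        ]
        row = softmax(scores)
        weights.append(row)
        # Context vector: attention-weighted sum of all vehicles' features.
        contexts.append([
            sum(row[j] * features[j][c] for j in range(len(features)))
            for c in range(dim)
        ])
    return contexts, weights


# Hypothetical features for three vehicles (values are made up).
vehicle_features = [
    [1.0, 0.0, 0.5],
    [0.9, 0.1, 0.4],
    [-1.0, 0.5, 0.0],
]
context, attn = self_attention(vehicle_features)
```

Because the attention weights are computed over however many vehicles are present, the same function accepts two, three, or more agents without modification, which is one plausible reason such a mechanism suits a setting with varying numbers of controlled agents.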