🤖 AI Summary
To address degraded traffic signal control performance caused by unreliable and imperfect sensor detection in real-world scenarios, this paper proposes a vision-perception-driven multi-agent reinforcement learning (MARL) co-simulation framework. The framework integrates heterogeneous simulators CARLA and SUMO, leveraging roadside camera feeds and real-time vehicle detection via YOLOv5/v8 to establish a closed-loop, vision-feedback-based signal control architecture. It introduces a novel paradigm that directly couples noise-robust MARL with sparse visual perception, and systematically evaluates the practical efficacy differences among YOLO variants in closed-loop control. Experiments demonstrate an average 23.7% reduction in vehicle delay and a 19.2% improvement in throughput efficiency. Notably, the framework sustains ≥14.5% performance gain even when detection accuracy drops to 72%, significantly outperforming conventional fixed-time and fixed-detector-based approaches.
📝 Abstract
Traffic simulations are commonly used to optimize traffic flow, with reinforcement learning (RL) showing promising potential for automated traffic signal control. Multi-agent reinforcement learning (MARL) is particularly effective for learning control strategies for traffic lights in a network using iterative simulations. However, existing methods often assume perfect vehicle detection, which overlooks real-world limitations related to infrastructure availability and sensor reliability. This study proposes a co-simulation framework integrating CARLA and SUMO, which combines high-fidelity 3D modeling with large-scale traffic flow simulation. Cameras mounted on traffic light poles within the CARLA environment use a YOLO-based computer vision system to detect and count vehicles, providing real-time traffic data as input for adaptive signal control in SUMO. MARL agents, trained with four different reward structures, leverage this visual feedback to optimize signal timings and improve network-wide traffic flow. Experiments in the test-bed demonstrate the effectiveness of the proposed MARL approach in enhancing traffic conditions using real-time camera-based detection. The framework also evaluates the robustness of MARL under faulty or sparse sensing and compares the performance of YOLOv5 and YOLOv8 for vehicle detection. Results show that while better accuracy improves performance, MARL agents can still achieve significant improvements with imperfect detection, demonstrating adaptability for real-world scenarios.