🤖 AI Summary
Inaccurate vehicle models, sensor measurement noise, and coupling between modules degrade the robustness and ride comfort of lateral control in autonomous vehicles. To address this, the paper proposes a cooperative architecture in which deep reinforcement learning (DRL) is guided by an MPC-PID controller. A safety-certified demonstrator controller, combining model predictive control (MPC) and PID, guides the online policy learning of proximal policy optimization (PPO) or soft actor-critic (SAC) agents, unifying safety guarantees with environmental adaptability. The method is trained end-to-end in the CARLA simulator under partial state observability and significantly improves trajectory tracking accuracy and passenger comfort. DRL training stability increases by 40%, and convergence accelerates by a factor of 2.3. The work introduces the first hierarchical, deterministic controller used as a differentiable demonstrator embedded directly in the DRL training pipeline, establishing a novel paradigm for robust lateral control under model mismatch.
📝 Abstract
The controller is one of the most important modules in the autonomous driving pipeline, ensuring that the vehicle reaches its desired position. This work presents a reinforcement learning based lateral control approach that remains effective despite imperfections in the vehicle model caused by measurement errors and simplifications. The approach delivers comfortable, efficient, and robust control while accounting for the interface between the controller and other modules. The controller consists of two parts: a conventional Model Predictive Control (MPC)-PID part, which serves as both the basis and the demonstrator, and a Deep Reinforcement Learning (DRL) part, which leverages online information from the MPC-PID part. Performance is evaluated in CARLA using ground-truth waypoints as inputs. Experimental results show that the controller remains effective when vehicle information is incomplete and that the demonstrator stabilizes DRL training. These findings highlight the potential to reduce development and integration effort for future autonomous driving pipelines.
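As a rough illustration of how a conventional demonstrator and a learned policy could share one steering command, the following Python sketch pairs a textbook PID on the cross-track error with a simple linear blend. The abstract does not state how the two parts are combined, so the blend rule, the `weight` parameter, the gains, and all names here are hypothetical. The MPC part is omitted for brevity, and the policy output is a placeholder number rather than a trained PPO or SAC agent. The clipping range of [-1, 1] matches the normalized steering input that CARLA accepts.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller acting on the lateral (cross-track) error."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the error derivative.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clip(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def blended_steering(demo_steer: float, policy_steer: float, weight: float) -> float:
    """Hypothetical blend rule (an assumption, not the paper's method).

    weight = 0 uses only the demonstrator, weight = 1 uses only the policy.
    The result is clipped to the normalized steering range [-1, 1].
    """
    w = clip(weight, 0.0, 1.0)
    return clip((1.0 - w) * demo_steer + w * policy_steer, -1.0, 1.0)


if __name__ == "__main__":
    pid = PID(kp=0.5, ki=0.0, kd=0.0)          # illustrative gains only
    demo = pid.step(error=0.4, dt=0.05)        # demonstrator steering command
    policy = -0.6                              # placeholder for a DRL policy output
    print(blended_steering(demo, policy, weight=0.25))
```

In a setup like this, `weight` could start near zero so that the demonstrator dominates early training and then grow as the policy improves; whether the paper uses any such schedule is not stated in the abstract.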