Learning Predictive Control with Deep Koopman Operators for Autonomous Vehicle Motion Planning

📅 2026-06-06

🤖 AI Summary

This work addresses two obstacles to real-time motion planning for autonomous vehicles in dynamic, complex environments: the low computational efficiency of solving nonlinear, nonconvex optimization problems online, and the strong dependence on accurate models. The authors propose a framework that integrates deep Koopman operators with learning-based predictive control. The approach uses a data-driven deep Koopman predictor to lift the nonlinear system into a linear observable space, applies receding-horizon actor-critic reinforcement learning to generate closed-loop feedback policies, and enforces safety constraints through a potential-field-based mechanism. Coupling deep Koopman representations with closed-loop policy learning yields a learning-enabled predictive controller that preserves control-theoretic structure while embedding safety awareness. Simulations and real-world experiments on the HongQi-EHS3 platform show favorable performance relative to baselines such as CBF-MPC and LMPCC in terms of safety, computational efficiency, and ride comfort.
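The idea of embedding a nonlinear system in a linear observable space can be illustrated with a minimal sketch. It is not the paper's method: the toy dynamics `step`, the parameters `A` and `B`, and the hand-picked observables in `lift` are all assumptions chosen so that the lifted system is exactly linear, and the matrix `K` is fitted by ordinary least squares (an EDMD-style fit) rather than by a deep network.

```python
import numpy as np

A, B = 0.9, 0.5  # hypothetical parameters of a toy system (not a vehicle model)

def step(x):
    """One step of a toy nonlinear system with a known finite Koopman lift."""
    x1, x2 = x
    return np.array([A * x1, B * x2 + (A**2 - B) * x1**2])

def lift(x):
    """Hand-picked observables; a deep Koopman model would learn these instead."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

# Collect one-step transition data from random initial states.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Z = np.array([lift(x) for x in X])
Z_next = np.array([lift(step(x)) for x in X])

# Least-squares fit of the lifted linear map: Z @ K.T ~= Z_next.
K = np.linalg.lstsq(Z, Z_next, rcond=None)[0].T

def predict(x0, n_steps):
    """Roll the linear lifted model forward, then read the state back out."""
    z = lift(x0)
    for _ in range(n_steps):
        z = K @ z
    return z[:2]  # the first two observables are the original state

# Compare a 10-step linear prediction with the true nonlinear rollout.
x0 = np.array([0.8, -0.3])
x_true = x0.copy()
for _ in range(10):
    x_true = step(x_true)
x_pred = predict(x0, 10)
err = np.max(np.abs(x_true - x_pred))
```

For this toy system the multi-step prediction is purely linear in the lifted coordinates and still matches the nonlinear rollout to numerical precision. For real vehicle dynamics no such exact finite lift is generally known, which is the motivation for learning the observables from data.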

📝 Abstract

Model Predictive Control (MPC) is widely used for autonomous-vehicle (AV) motion planning, but its real-time applicability is often limited by the need for accurate models and online solution of nonlinear, nonconvex optimization problems in dynamic road environments. Actor-critic reinforcement learning offers a promising alternative for online policy generation, yet its policy-learning process often lacks explicit control-theoretic structure. This article proposes a learning predictive control (LPC) framework with deep Koopman operators for efficient real-time motion planning under nonconvex constraints. To address nonlinear and uncertain vehicle dynamics, a deep-Koopman-based predictor is used to lift the system into an interpretable linear observable space in a data-driven manner. Unlike traditional MPC, which computes open-loop control sequences, the proposed LPC framework yields a closed-loop state-feedback policy within each prediction interval through receding-horizon actor-critic learning. To ensure safety under nonconvex environmental constraints, LPC constructs convex local surrogate representations of obstacles and defines corresponding potential-field functions. These functions and their gradients are directly embedded into the actor-critic structure, enabling efficient, safety-aware policy learning. Extensive simulations and real-world experiments on the HongQi-EHS3 platform demonstrate favorable performance in diverse obstacle-avoidance scenarios in terms of safety, computational efficiency, and driving comfort, compared with benchmark methods such as CBF-MPC and LMPCC.
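The potential-field functions and their gradients mentioned in the abstract can be sketched as follows. This is an assumption-laden illustration, not the paper's formulation: the obstacle surrogate is taken to be a disc, the potential is the classic repulsive form that is zero beyond an influence distance, and the function name `repulsive_potential` and the parameters `d0` and `eta` are hypothetical.

```python
import numpy as np

def repulsive_potential(p, center, radius, d0=2.0, eta=1.0):
    """Return (U, dU/dp) for a disc-shaped obstacle surrogate.

    U = 0.5 * eta * (1/d - 1/d0)^2 for clearance d < d0, else 0,
    where d is the distance from p to the disc boundary.
    """
    diff = np.asarray(p, dtype=float) - np.asarray(center, dtype=float)
    dist = max(np.linalg.norm(diff), 1e-9)  # guard against p == center
    d = dist - radius                        # clearance to the boundary
    if d >= d0:
        return 0.0, np.zeros_like(diff)      # outside the influence region
    d = max(d, 1e-6)                         # clamp at or inside the boundary
    u = 0.5 * eta * (1.0 / d - 1.0 / d0) ** 2
    # Chain rule: dU/dd = -eta * (1/d - 1/d0) / d^2 and dd/dp = diff / dist.
    grad = -eta * (1.0 / d - 1.0 / d0) / d**2 * diff / dist
    return u, grad

# Query a point 0.5 m from the boundary of a unit disc at the origin.
u, grad = repulsive_potential([1.5, 0.0], [0.0, 0.0], 1.0)
u_far, grad_far = repulsive_potential([5.0, 0.0], [0.0, 0.0], 1.0)
```

Here the negative gradient points away from the obstacle, so a cost term built from `u` penalizes proximity and its gradient gives a descent direction that increases clearance. How the paper embeds such terms in its actor and critic is not specified in the abstract.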

Problem

Research questions and friction points this paper is trying to address.

Model Predictive Control

Autonomous Vehicle Motion Planning

Nonconvex Constraints

Real-time Control

Safety-aware Policy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Koopman Operators

Learning Predictive Control

Actor-Critic Reinforcement Learning