🤖 AI Summary
Existing reinforcement learning (RL) methods lack formal safety guarantees for nonlinear engineering systems such as autonomous vehicles and soft robotics, while model predictive control (MPC) suffers from a trade-off between model fidelity and real-time feasibility. This paper proposes an MPC-RL co-design framework: during training, an MPC-based online safety envelope guides RL policy learning; during deployment, a lightweight safety filter, grounded in Lipschitz continuity analysis, strictly enforces dynamic constraints without online optimization, ensuring real-time compliance. The approach integrates model predictive control, deep RL, and Lipschitz robustness analysis. Evaluated on a nonlinear aeroelastic wing testbed, the method achieves a 32% improvement in disturbance rejection, a 27% reduction in actuator energy consumption, and zero constraint violations with stable trajectory tracking under severe turbulence.
📝 Abstract
Modern engineering systems, such as autonomous vehicles, flexible robotics, and intelligent aerospace platforms, require controllers that are robust to uncertainties, adaptive to environmental changes, and safety-aware under real-time constraints. RL offers powerful data-driven adaptability for systems with nonlinear dynamics that interact with uncertain environments; however, it lacks built-in mechanisms for satisfying dynamic constraints during exploration. MPC offers structured constraint handling and robustness, but its reliance on accurate models and computationally demanding online optimization poses significant challenges for real-time deployment. This paper proposes an integrated MPC-RL framework that combines the stability and safety guarantees of MPC with the adaptability of RL. During training, MPC defines safe control bounds that guide the RL component and enable constraint-aware policy learning. At deployment, the learned policy operates in real time with a lightweight safety filter based on Lipschitz continuity, ensuring constraint satisfaction without heavy online optimization. The approach, which is validated on a nonlinear aeroelastic wing system, demonstrates improved disturbance rejection, reduced actuator effort, and robust performance under turbulence. The architecture generalizes to other domains with structured nonlinearities and bounded disturbances, offering a scalable solution for safe artificial-intelligence-driven control in engineering applications.
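To make the deployment-time mechanism concrete, here is a minimal sketch of a Lipschitz-based safety filter. It is a hypothetical illustration, not the paper's exact construction: it assumes a scalar control, a constraint function `h(x) >= 0` defining the safe set, a known Lipschitz constant `L_h` of `h`, a bound `L_f` on the one-step dynamics' sensitivity to the control, and a backup action `u_safe` (e.g. from the MPC envelope) under which `h` does not decrease. Under these assumptions, the admissible deviation from `u_safe` has a closed form, so the filter reduces to a clip with no online optimization.

```python
import numpy as np

def lipschitz_safety_filter(u_rl, u_safe, x, h, L_h, L_f, dt, margin=0.0):
    """Hypothetical Lipschitz safety filter (illustrative sketch).

    Assumptions (all labeled, not from the paper):
      - h(x) >= 0 defines the safe set, with Lipschitz constant L_h;
      - the one-step state change satisfies
            |x_next(u) - x_next(u_safe)| <= L_f * dt * |u - u_safe|;
      - the backup action u_safe keeps h non-decreasing, h(x_next(u_safe)) >= h(x).
    Then h(x_next(u)) >= h(x) - L_h * L_f * dt * |u - u_safe|, so any action with
        |u - u_safe| <= (h(x) - margin) / (L_h * L_f * dt)
    provably keeps h(x_next) >= margin. Enforcing that bound is just a clip.
    """
    slack = h(x) - margin
    if slack <= 0.0:
        # At or outside the certified boundary: fall back to the backup action.
        return u_safe
    r = slack / (L_h * L_f * dt)
    return np.clip(u_rl, u_safe - r, u_safe + r)
```

For example, with single-integrator dynamics `x_next = x + dt * u` (so `L_f = 1`) and `h(x) = 1 - |x|` (so `L_h = 1`), at `x = 0.5` with `dt = 0.1` the filter admits any `|u| <= 5`, clipping larger RL actions toward the backup. Because the projection is closed-form, the deployed policy never solves an optimization problem online, which is the real-time property the abstract emphasizes.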