Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids

📅 2024-09-17
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Online solution of the mixed-integer linear programs (MILPs) that arise in mixed-logical dynamical (MLD) systems such as microgrids, which involve both discrete and continuous decision variables, suffers from the curse of dimensionality and combinatorial explosion. To address this, the authors propose a tightly integrated reinforcement learning and model predictive control (RL-MPC) framework. Its core contribution is a decoupled Q-function design that reformulates the online MILP into a lower-dimensional linear or quadratic program (LP/QP) over only the continuous variables, drastically reducing computational complexity. The method combines deep reinforcement learning, mixed-logical dynamical modeling, and real-time optimization to implicitly learn the discrete decisions while explicitly optimizing the continuous control actions. Evaluated in microgrid simulations based on real-world data, the approach achieves substantial reductions in online computation time while maintaining an optimality gap below 2% and constraint feasibility above 99%.
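To make the reduction concrete, here is a minimal sketch (not the authors' code) of the pattern described above, on a toy microgrid dispatch problem: a stand-in heuristic plays the role of the learned decoupled Q-function and fixes the generator on/off decisions, after which the remaining horizon problem is an ordinary LP in the continuous power variables, solved here with scipy.optimize.linprog. The demand profile, costs, and heuristic policy are illustrative assumptions.

```python
# Sketch: once a learned policy/Q-function fixes the binary decisions,
# the remaining MPC problem over the horizon is a plain LP in the
# continuous variables, solvable with an off-the-shelf LP solver.
import numpy as np
from scipy.optimize import linprog

N = 4                                       # prediction horizon
demand = np.array([3.0, 6.0, 8.0, 5.0])     # toy load forecast [kW]
P_MAX, C_GRID, C_GEN = 5.0, 0.30, 0.12      # generator limit and unit costs

def discrete_policy(demand):
    # Placeholder for the learned (decoupled) Q-function of the paper:
    # a trivial heuristic stands in for the trained network here.
    return (demand > 4.0).astype(int)       # generator on/off per step

def solve_continuous_lp(u, demand):
    # Decision vector x = [g_0..g_{N-1}, p_0..p_{N-1}]
    # (grid import and generator output at each step).
    n = len(demand)
    c = np.concatenate([C_GRID * np.ones(n), C_GEN * np.ones(n)])
    # Power balance g_k + p_k = d_k at every step.
    A_eq = np.hstack([np.eye(n), np.eye(n)])
    b_eq = demand
    # Generator output is forced to zero whenever the binary decision is "off".
    bounds = [(0, None)] * n + [(0, P_MAX * u_k) for u_k in u]
    return linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")

u = discrete_policy(demand)
res = solve_continuous_lp(u, demand)
print("on/off plan:", u, " optimal cost:", round(res.fun, 3))
```

Solving the full MILP online would instead require branching over all 2^N on/off combinations; fixing them up front with the learned component is what removes the combinatorial part from the online optimization.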

📝 Abstract
This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to efficiently solve finite-horizon optimal control problems in mixed-logical dynamical systems. Optimization-based control of such systems with discrete and continuous decision variables entails the online solution of mixed-integer quadratic or linear programs, which suffer from the curse of dimensionality. Our approach aims at mitigating this issue by effectively decoupling the decision on the discrete variables and the decision on the continuous variables. Moreover, to mitigate the combinatorial growth in the number of possible actions due to the prediction horizon, we conceive the definition of decoupled Q-functions to make the learning problem more tractable. The use of reinforcement learning reduces the online optimization problem of the MPC controller from a mixed-integer linear (quadratic) program to a linear (quadratic) program, greatly reducing the computational time. Simulation experiments for a microgrid, based on real-world data, demonstrate that the proposed method significantly reduces the online computation time of the MPC approach and that it generates policies with small optimality gaps and high feasibility rates.
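The "decoupled Q-functions" mentioned in the abstract address the combinatorial growth of discrete action sequences with the prediction horizon. Their exact definition is given in the paper; the sketch below only illustrates the scaling argument, with random numbers standing in for learned per-step Q-values, and is an assumption-laden illustration rather than the paper's method.

```python
# Sketch: "decoupled" Q-functions assign a value to each discrete choice at
# each step of the horizon separately, so the number of Q-values to compare
# grows as N * |A| instead of |A|**N.
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
N, num_actions = 6, 3                       # horizon length, discrete choices per step

# Stand-in for N learned Q-functions evaluated at the current state:
# q[k, a] ~ predicted value of taking discrete action a at step k.
q = rng.normal(size=(N, num_actions))

# Coupled approach: enumerate all |A|^N sequences (combinatorial).
coupled_best = max(product(range(num_actions), repeat=N),
                   key=lambda seq: sum(q[k, a] for k, a in enumerate(seq)))

# Decoupled approach: pick the best action per step independently (linear in N).
decoupled_best = tuple(int(a) for a in q.argmax(axis=1))

print("sequences enumerated (coupled):", num_actions ** N)
print("Q-values compared (decoupled):", N * num_actions)
print("same plan under an additive value model:", coupled_best == decoupled_best)
```

Under an additive value model the per-step argmax coincides with the joint argmax, which is why comparing N·|A| values instead of |A|^N sequences can suffice and keeps the learning problem tractable.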
Problem

Research questions and friction points this paper is trying to address.

Efficiently solving finite-horizon optimal control in mixed-logical dynamical systems
Mitigating curse of dimensionality in mixed-integer linear programs
Reducing MPC computation time while maintaining feasibility and low suboptimality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines reinforcement learning with model predictive control
Decouples discrete and continuous variables for efficiency
Uses recurrent neural networks for Q-function approximation (see the sketch below)
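The last item mentions recurrent Q-function approximation. The paper's exact architecture is not reproduced here; the following is a minimal PyTorch sketch of the general pattern, with an LSTM reading horizon-length features (e.g. state plus forecasts) and emitting per-step Q-values for the discrete decisions. All layer sizes and input dimensions are assumptions.

```python
# Sketch of a recurrent Q-network emitting per-step Q-values over the horizon.
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, feature_dim: int, hidden_dim: int, num_actions: int):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_actions)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, horizon, feature_dim), e.g. state plus load/price forecasts.
        h, _ = self.lstm(x)                  # (batch, horizon, hidden_dim)
        return self.head(h)                  # (batch, horizon, num_actions) Q-values

# Example: Q-values for 3 discrete actions at each of 24 steps.
net = RecurrentQNet(feature_dim=8, hidden_dim=64, num_actions=3)
q_values = net(torch.randn(1, 24, 8))
discrete_plan = q_values.argmax(dim=-1)      # greedy discrete decision per step
print(q_values.shape, discrete_plan.shape)   # (1, 24, 3) and (1, 24)
```

The per-step outputs tie in with the decoupled Q-function idea: each time step gets its own set of Q-values, so the discrete plan can be read off greedily without enumerating whole action sequences.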
Caio Fabio Oliveira da Silva
Delft University of Technology, Melkweg 2, 2628 CD, Delft, The Netherlands
Azita Dabiri
Delft University of Technology, Melkweg 2, 2628 CD, Delft, The Netherlands
Bart De Schutter
Delft University of Technology, Melkweg 2, 2628 CD, Delft, The Netherlands