A Method to Improve the Performance of Reinforcement Learning Based on the Y Operator for a Class of Stochastic Differential Equation-Based Child-Mother Systems

📅 2023-11-07
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Actor-Critic (AC) reinforcement learning methods exhibit limited control performance for stochastic dynamical systems modeled by stochastic differential equations (SDEs), particularly because the inherent stochasticity of the system is not adequately incorporated into the optimization objective.

Method: This paper proposes the Y-operator-driven YORL framework, which explicitly embeds the drift and diffusion terms of the SDE into the Critic's loss function, marking the first approach to internalize structural stochasticity directly into the optimization goal. Furthermore, it reformulates the solution of the Hamilton-Jacobi-Bellman (HJB) partial differential equation governing the state-value function as parallel, data-driven learning of SDE coefficients, thereby shifting the paradigm from PDE solving to SDE parameter estimation.

Contribution/Results: Rigorous theoretical analysis guarantees convergence and optimality. Extensive numerical experiments on linear and nonlinear systems demonstrate that YORL consistently outperforms state-of-the-art RL methods in both model-based and model-free settings, achieving superior convergence speed, control accuracy, and robustness.
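For context, here is a minimal sketch of the standard SDE control setting the summary refers to; the notation below is assumed for illustration, not taken from the paper. The system evolves under a controlled Itô SDE, and the value function satisfies an HJB-type PDE whose second-order trace term is exactly where the diffusion, i.e. the system's stochasticity, enters:

```latex
% Controlled Ito SDE with drift f and diffusion g:
%   dx_t = f(x_t, u_t) dt + g(x_t, u_t) dW_t
% HJB equation for the value function V, with running cost r and
% discount rate beta; the trace term carries the diffusion:
\[
  \beta V(x) = \min_{u}\Big[\, r(x,u) + \nabla V(x)^{\top} f(x,u)
    + \tfrac{1}{2}\,\mathrm{tr}\!\big(g(x,u)\,g(x,u)^{\top}\,\nabla^{2} V(x)\big) \Big]
\]
```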
📝 Abstract
This paper introduces a novel operator, termed the Y operator, to elevate control performance in Actor-Critic (AC) based reinforcement learning for systems governed by stochastic differential equations (SDEs). The Y operator integrates the stochasticity of a class of child-mother systems into the Critic network's loss function, yielding substantial advancements in the control performance of RL algorithms. Additionally, the Y operator reformulates the challenge of solving partial differential equations for the state-value function into a parallel problem for the drift and diffusion functions within the system's SDEs. A rigorous mathematical proof confirms the operator's validity. This transformation enables the Y Operator-based Reinforcement Learning (YORL) framework to efficiently tackle optimal control problems in both model-based and data-driven systems. The superiority of YORL is demonstrated through linear and nonlinear numerical examples, which show its enhanced performance over existing methods after convergence.
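The page gives no implementation detail, so the following is only a minimal PyTorch-style sketch of the general idea the abstract describes: a Critic loss whose Bellman residual includes the drift and diffusion terms through the Itô generator. All names (`ito_generator`, `critic_loss`, `beta`, the tensor shapes) are hypothetical, and the actual Y-operator loss in the paper may differ.

```python
import torch

def ito_generator(V, x, f, g):
    """Apply the Ito generator of dx = f dt + g dW to a scalar field V:
    (L V)(x) = grad V . f + 0.5 * tr(g g^T Hess V).
    Shapes: x (B, n), f (B, n), g (B, n, m); f and g are detached samples."""
    x = x.requires_grad_(True)
    v = V(x).sum()                                            # scalar for autograd
    (grad_v,) = torch.autograd.grad(v, x, create_graph=True)  # (B, n)
    drift_term = (grad_v * f).sum(dim=1)                      # grad V . f, (B,)
    # tr(g g^T H) = sum_j g_j^T H g_j via Hessian-vector products,
    # one per diffusion column, so the full Hessian is never formed.
    diff_term = torch.zeros_like(drift_term)
    for j in range(g.shape[-1]):
        g_j = g[..., j]                                       # (B, n)
        (hvp,) = torch.autograd.grad((grad_v * g_j).sum(), x,
                                     create_graph=True)       # H g_j, (B, n)
        diff_term = diff_term + 0.5 * (g_j * hvp).sum(dim=1)
    return drift_term + diff_term

def critic_loss(V, x, r, f, g, beta=0.1):
    """Squared continuous-time Bellman residual: r + L V - beta * V.
    r is the running reward at (x, u); f, g are evaluated at (x, u)."""
    residual = r + ito_generator(V, x, f, g) - beta * V(x).squeeze(-1)
    return (residual ** 2).mean()
```

Minimizing such a residual over sampled states is one way to make the Critic's objective depend explicitly on both drift and diffusion; in a data-driven setting, f and g would themselves be learned from transitions (see the sketch after the Innovation list below).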
Problem

Research questions and friction points this paper is trying to address.

Improving reinforcement learning control for stochastic differential equation systems
Integrating system stochasticity into Critic network loss function
Reformulating the state-value PDE into problems for the drift and diffusion functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Y operator for reinforcement learning control
Integrates system stochasticity into Critic loss function
Reformulates PDE solving into parallel drift-diffusion estimation problems (see the sketch after this list)
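As referenced in the last item above, here is a minimal sketch of one standard way to learn drift and diffusion coefficients from one-step transitions, via Euler-Maruyama moment matching. The names (`drift_net`, `diff_net`, `sde_coefficient_loss`) and the estimator itself are assumptions for illustration; the paper's actual data-driven procedure may differ.

```python
import torch

def sde_coefficient_loss(drift_net, diff_net, x, u, x_next, dt):
    """Fit f and g of dx = f(x, u) dt + g(x, u) dW from transitions.
    Euler-Maruyama moment matching:
        E[dx | x, u]   ~ f(x, u) dt
        Cov[dx | x, u] ~ g g^T  dt
    Shapes: x, x_next (B, n); u (B, m); diff_net output (B, n, k)."""
    dx = x_next - x                                        # (B, n)
    f = drift_net(x, u)                                    # (B, n)
    g = diff_net(x, u)                                     # (B, n, k)
    # Drift: match the conditional mean of the increment.
    drift_loss = ((dx - f * dt) ** 2).sum(dim=1).mean()
    # Diffusion: match the second moment of the mean-free increment
    # (gradient stopped through f so the two fits stay decoupled).
    resid = (dx - f * dt).detach()                         # (B, n)
    emp_cov = resid.unsqueeze(2) * resid.unsqueeze(1)      # (B, n, n)
    model_cov = g @ g.transpose(1, 2) * dt                 # (B, n, n)
    diff_loss = ((emp_cov - model_cov) ** 2).sum(dim=(1, 2)).mean()
    return drift_loss + diff_loss
```

The two terms can be minimized in parallel with the Critic update, which matches the "parallel drift-diffusion problems" framing above.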
👥 Authors
Cheng Yin
The School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China; State Key Laboratory of Intelligent Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Yi Chen
The China-EU Institute for Clean and Renewable Energy, Huazhong University of Science and Technology, Wuhan, Hubei, China