End-to-end Training of High-Dimensional Optimal Control with Implicit Hamiltonians via Jacobian-Free Backpropagation

📅 2025-09-30
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Existing value-function methods for optimal control struggle with high-dimensional implicit systems, such as spacecraft re-entry and bicycle dynamics, where the Hamiltonian admits no explicit formula. Method: This paper proposes an end-to-end implicit deep learning framework that directly parameterizes the value function. It is the first to embed into the network architecture the theoretical relationship between the implicit Hamiltonian system and the gradient of the value function. This relationship follows from combining Pontryagin's Maximum Principle with dynamic programming, and Jacobian-Free Backpropagation (JFB) avoids the explicit Jacobian computations that the resulting implicit layers would otherwise require. Results: The method trains stable, convergent feedback controllers on multiple high-dimensional implicit Hamiltonian systems, extending value-function-based optimal control to complex physical systems governed by implicit dynamics and overcoming longstanding limitations in scalability and expressivity for such problems.

📝 Abstract
Neural network approaches that parameterize value functions have succeeded in approximating high-dimensional optimal feedback controllers when the Hamiltonian admits explicit formulas. However, many practical problems, such as the space shuttle reentry problem and bicycle dynamics, among others, may involve implicit Hamiltonians that do not admit explicit formulas, limiting the applicability of existing methods. Rather than directly parameterizing controls, which does not leverage the Hamiltonian's underlying structure, we propose an end-to-end implicit deep learning approach that directly parameterizes the value function to learn optimal control laws. Our method enforces physical principles by ensuring trained networks adhere to the control laws by exploiting the fundamental relationship between the optimal control and the value function's gradient; this is a direct consequence of the connection between Pontryagin's Maximum Principle and dynamic programming. Using Jacobian-Free Backpropagation (JFB), we achieve efficient training despite temporal coupling in trajectory optimization. We show that JFB produces descent directions for the optimal control objective and experimentally demonstrate that our approach effectively learns high-dimensional feedback controllers across multiple scenarios involving implicit Hamiltonians, which existing methods cannot address.
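To make the "implicit Hamiltonian" difficulty concrete: with costate p = ∇V(x), Pontryagin's Maximum Principle characterizes the optimal control as a stationary point of the Hamiltonian, ∂H/∂u = 0. When this condition has no closed-form solution for u, the control law is defined only implicitly. A minimal hypothetical illustration (not one of the paper's benchmark systems): dynamics ẋ = sin(u) with running cost u²/2 give H(x, p, u) = u²/2 + p·sin(u), so the stationarity condition u + p·cos(u) = 0 must be solved numerically:

```python
import math

def hamiltonian_du(u, p):
    # dH/du for H(x, p, u) = u**2 / 2 + p * sin(u)
    return u + p * math.cos(u)

def solve_control(p, u0=0.0, tol=1e-10, max_iter=50):
    """Newton iteration on the implicit stationarity condition dH/du = 0."""
    u = u0
    for _ in range(max_iter):
        g = hamiltonian_du(u, p)
        if abs(g) < tol:
            break
        # d/du of (u + p*cos(u)) = 1 - p*sin(u)
        u -= g / (1.0 - p * math.sin(u))
    return u

p = 0.8  # costate value, standing in for dV/dx at some state
u_star = solve_control(p)
print(u_star, hamiltonian_du(u_star, p))
```

Methods that require an explicit formula u*(x, ∇V) cannot be applied here; the paper's contribution is to treat this inner solve as an implicit layer and still train the value-function network end to end.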
Problem

Research questions and friction points this paper is trying to address.

Solving optimal control with implicit Hamiltonians lacking explicit formulas
Learning high-dimensional feedback controllers via value function parameterization
Enforcing physical principles through Pontryagin's Maximum Principle connection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit deep learning for value function parameterization
Enforcing physical principles via optimal control laws
Jacobian-Free Backpropagation enables efficient trajectory optimization
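A hedged sketch of the JFB idea in isolation (the general technique, not the paper's training loop): when a network output z* satisfies a fixed-point condition z* = T(z*, w), exact backpropagation through the implicit layer requires the term (1 − ∂T/∂z)⁻¹ from the implicit function theorem; JFB drops that term and differentiates a single application of T at the fixed point. In this scalar toy, with hand-computed derivatives, the JFB gradient keeps the sign of the exact gradient, i.e. it remains a descent direction:

```python
import math

def T(z, w, x=0.5):
    """Fixed-point map z = tanh(w*z + x); w plays the trainable parameter."""
    return math.tanh(w * z + x)

def solve_fixed_point(w, z=0.0, n=100):
    # |dT/dz| < 1 here, so plain iteration converges.
    for _ in range(n):
        z = T(z, w)
    return z

w = 0.6
z_star = solve_fixed_point(w)

# Loss L(z) = 0.5 * (z - 1)**2, so dL/dz = z - 1.
dL_dz = z_star - 1.0
s = 1.0 - z_star ** 2      # tanh'(w*z* + x) evaluated at the fixed point
dT_dw = s * z_star         # partial of T w.r.t. w, holding z fixed
dT_dz = s * w              # partial of T w.r.t. z

g_jfb = dL_dz * dT_dw                      # JFB: Jacobian term dropped
g_exact = dL_dz * dT_dw / (1.0 - dT_dz)    # implicit function theorem

print(g_jfb, g_exact)
```

Dropping the Jacobian term avoids solving a linear system (or storing unrolled iterations) in the backward pass, which is what makes training tractable for temporally coupled trajectory optimization; the paper's theoretical contribution is showing the resulting directions still decrease the optimal control objective.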