Accelerating and Scaling MPC-Guided Reinforcement Learning for Humanoid Locomotion and Manipulation

📅 2026-06-04

🤖 AI Summary
This work addresses two obstacles to combining model predictive control (MPC) with reinforcement learning (RL) for humanoid motion control: high training overhead and time-consuming problem construction. The authors propose MPC-RL, a framework that runs a centroidal-dynamics MPC during training to generate guiding trajectories and uses an MPC-informed reward to improve learning efficiency. To make this practical in massively parallel RL, they develop πⁿMPC, a parallel-in-horizon, construction-free batched GPU MPC solver that operates directly on time-varying dynamics, avoiding high memory usage and pre-compilation. In comparative studies and hardware validation, the approach outperforms existing methods across a range of locomotion and manipulation tasks.
📝 Abstract
In humanoid motion control, model predictive control (MPC) offers physically grounded prediction and constraint handling, while reinforcement learning (RL) enables robust whole-body skills through large-scale simulation. However, using MPC inside RL often requires time-consuming problem construction or excessive training overhead, making such frameworks difficult to justify in practice. This work studies efficient training-time MPC guidance for humanoid locomotion and manipulation, termed MPC-RL. We introduce a centroidal-dynamics MPC reward formulation that leverages guidance from MPC trajectories at training time. To make this practical in massively parallel RL, we develop $\pi^n$MPC, a parallel-in-horizon and construction-free batched GPU MPC solver that operates directly on time-varying dynamics to avoid high memory usage and pre-compilation. Through a variety of comparative studies and hardware validations, we find that MPC-RL achieves superior performance in locomotion and manipulation skills. The codebase is available at https://github.com/junhengl/mpc-rl.
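To make the idea of an MPC-informed reward concrete, here is a minimal sketch of one common way to reward a policy for staying close to a reference trajectory across a batch of parallel environments. It is an illustration only and is not taken from the paper or its codebase: the function name `mpc_tracking_reward`, the exponential tracking form, the 6-D state, and the `sigma` value are all assumptions.

```python
import numpy as np

def mpc_tracking_reward(com_state, mpc_ref, sigma=0.25):
    """Hypothetical MPC-informed reward (an assumption, not the paper's formula).

    Returns exp(-||x - x_ref||^2 / sigma) per environment.
    com_state, mpc_ref: arrays of shape (num_envs, state_dim).
    """
    # Squared tracking error between the simulated state and the MPC reference
    err = np.sum((com_state - mpc_ref) ** 2, axis=-1)
    # Reward is 1 for perfect tracking and decays smoothly with the error
    return np.exp(-err / sigma)

# Toy batch: 3 environments, each with a 6-D centroidal state (assumed size)
rng = np.random.default_rng(0)
ref = rng.normal(size=(3, 6))
state = ref.copy()
state[1] += 0.1   # small deviation from the MPC reference
state[2] += 0.5   # larger deviation from the MPC reference
rewards = mpc_tracking_reward(state, ref)
```

Because the function is vectorized over the leading axis, the same call would apply to thousands of environments at once, which is the setting where a batched MPC solver supplies one reference per environment.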
Problem

Research questions and friction points this paper is trying to address.

humanoid locomotion
model predictive control
reinforcement learning
training overhead
motion control
Innovation

Methods, ideas, or system contributions that make the work stand out.

MPC-RL
centroidal dynamics
parallel GPU solver
construction-free MPC
humanoid locomotion