Accelerating and Scaling MPC-Guided Reinforcement Learning for Humanoid Locomotion and Manipulation

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high training overhead and time-consuming problem construction that arise when combining model predictive control (MPC) with reinforcement learning (RL) for humanoid motion control. The authors propose MPC-RL, a framework that uses a centroidal-dynamics MPC during training to generate guiding trajectories and an MPC-informed reward formulation built on that guidance. To make this practical in massively parallel RL, they develop πⁿMPC, a parallel-in-horizon, construction-free batched GPU MPC solver that operates directly on time-varying dynamics, avoiding high memory usage and pre-compilation. Comparative studies and hardware validations indicate that MPC-RL achieves superior performance on locomotion and manipulation skills.

📝 Abstract

In humanoid motion control, model predictive control (MPC) offers physically grounded prediction and constraint handling, while reinforcement learning (RL) enables robust whole-body skills through large-scale simulation. However, using MPC inside RL often requires time-consuming problem construction or incurs excessive training overhead, making such frameworks difficult to justify in practice. This work studies efficient training-time MPC guidance for humanoid locomotion and manipulation, termed MPC-RL. We introduce a centroidal-dynamics MPC reward formulation that leverages guidance from MPC trajectories at training time. To make this practical in massively parallel RL, we develop $π^n$MPC, a parallel-in-horizon and construction-free batched GPU MPC solver that operates directly on time-varying dynamics to avoid high memory usage and pre-compilation. Through a variety of comparative studies and hardware validations, we find that MPC-RL achieves superior performance in locomotion and manipulation skills. The code base is available at https://github.com/junhengl/mpc-rl.
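
The abstract does not spell out the reward terms, so the following is only an illustrative sketch of what an MPC-guided tracking reward could look like, not the authors' formulation. It assumes a common exponential tracking form: each simulated environment is rewarded for staying close to a reference state taken from an MPC trajectory. The function names, the `sigma` parameter, and the state layout are all hypothetical.

```python
import math
from typing import List, Sequence


def mpc_tracking_reward(state: Sequence[float],
                        mpc_reference: Sequence[float],
                        sigma: float = 0.5) -> float:
    """Hypothetical MPC-guided reward (an assumption, not the paper's definition).

    Returns 1.0 when the state matches the MPC reference exactly and
    decays toward 0 as the squared tracking error grows.
    """
    if len(state) != len(mpc_reference):
        raise ValueError("state and mpc_reference must have the same length")
    sq_err = sum((s - r) ** 2 for s, r in zip(state, mpc_reference))
    return math.exp(-sq_err / (sigma ** 2))


def batched_rewards(states: Sequence[Sequence[float]],
                    references: Sequence[Sequence[float]],
                    sigma: float = 0.5) -> List[float]:
    """Apply the reward to each environment in a batch, one reference per environment."""
    return [mpc_tracking_reward(s, r, sigma) for s, r in zip(states, references)]
```

In a massively parallel setting the per-environment loop would be replaced by a vectorized GPU operation; the plain-Python version here is only meant to make the shape of the computation concrete.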

Problem

Research questions and friction points this paper is trying to address.

humanoid locomotion

model predictive control

reinforcement learning

training overhead

motion control

Innovation

Methods, ideas, or system contributions that make the work stand out.

MPC-RL

centroidal dynamics

parallel GPU solver

construction-free MPC

humanoid locomotion

🔎 Similar Papers

Omnigrasp: Grasping Diverse Objects with Simulated Humanoids

2024-07-16 · Neural Information Processing Systems · Citations: 16