🤖 AI Summary
Generative control methods for robotics face two bottlenecks: expert demonstrations are costly to collect, and existing approaches are limited to quasi-static tasks. To address both in fast, dynamic settings, this paper proposes Generative Predictive Control (GPC). GPC uses flow matching to train a generative policy exclusively on simulated data, eliminating reliance on human teleoperation, and establishes for the first time a theoretical connection between sampling-based predictive control and generative modeling. GPC also supports online warm-start optimization, ensuring temporal consistency and millisecond-level feedback. Evaluated on high-speed, non-quasi-static tasks, including agile grasping and bouncing locomotion, GPC demonstrates real-time performance, stability, and strong generalization. This work introduces a scalable, demonstration-free paradigm for general-purpose robotic policy learning.
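To make the training idea above concrete, here is a minimal sketch of a conditional flow-matching loss where the "expert" action sequences come from a simulated sampling-based controller rather than human demonstrations. The linear `model` and its parameters `theta` are hypothetical stand-ins for a learned velocity network; this is an illustrative sketch of the general technique, not the paper's implementation.

```python
import numpy as np

def model(theta, xt, t):
    # Toy linear stand-in for a learned velocity network v_theta(x, t).
    return theta[0] * xt + theta[1] * t + theta[2]

def flow_matching_loss(theta, actions, rng):
    # Conditional flow matching with a linear path x_t = (1 - t) x0 + t x1:
    # the regression target for the velocity field is simply x1 - x0.
    x1 = actions                             # action samples from a simulated
                                             # sampling-based controller, not demos
    x0 = rng.standard_normal(x1.shape)       # noise endpoints
    t = rng.uniform(size=(x1.shape[0], 1))   # random times in [0, 1]
    xt = (1.0 - t) * x0 + t * x1             # point on the interpolation path
    pred = model(theta, xt, t)
    return np.mean((pred - (x1 - x0)) ** 2)

rng = np.random.default_rng(0)
actions = rng.standard_normal((32, 4))  # batch of controller-generated sequences
theta = np.zeros(3)
loss = flow_matching_loss(theta, actions, rng)
```

Minimizing this loss over a dataset of controller rollouts is ordinary supervised regression, which is what makes the approach demonstration-free.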
📝 Abstract
Generative control policies have recently unlocked major progress in robotics. These methods produce action sequences via diffusion or flow matching, with training data provided by demonstrations. But despite enjoying considerable success on difficult manipulation problems, generative policies come with two key limitations. First, behavior cloning requires expert demonstrations, which can be time-consuming and expensive to obtain. Second, existing methods are limited to relatively slow, quasi-static tasks. In this paper, we leverage a tight connection between sampling-based predictive control and generative modeling to address each of these issues. In particular, we introduce generative predictive control, a supervised learning framework for tasks with fast dynamics that are easy to simulate but difficult to demonstrate. We then show how trained flow-matching policies can be warm-started at run-time, maintaining temporal consistency and enabling fast feedback rates. We believe that generative predictive control offers a complementary approach to existing behavior cloning methods, and hope that it paves the way toward generalist policies that extend beyond quasi-static demonstration-oriented tasks.
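The warm-starting idea mentioned in the abstract can be sketched as follows: instead of integrating the flow from pure noise at every control step, shift the previous action plan by one step, partially re-noise it back to an intermediate time t0, and integrate only the remaining short segment of the ODE. Everything here (the toy velocity field, the horizon, the t0 value) is an illustrative assumption, not the paper's actual model.

```python
import numpy as np

HORIZON, ACT_DIM = 8, 2

def velocity_field(x, t, target):
    # Toy stand-in for a learned flow-matching network: for a linear
    # (rectified) flow, the velocity points straight from x toward the
    # data endpoint, scaled by the remaining time.
    return (target - x) / max(1.0 - t, 1e-3)

def sample_actions(x, target, t0=0.0, steps=10):
    # Euler integration of dx/dt = v(x, t) from t = t0 (noisy) to t = 1 (actions).
    x = x.copy()
    dt = (1.0 - t0) / steps
    for k in range(steps):
        x = x + dt * velocity_field(x, t0 + k * dt, target)
    return x

def warm_start(prev_plan, t0, rng):
    # The first action of the previous plan was executed: shift the plan by
    # one step, repeat the final action, and blend with noise as if the
    # shifted plan sat at intermediate flow time t0.
    shifted = np.roll(prev_plan, -1, axis=0)
    shifted[-1] = shifted[-2]
    return (1.0 - t0) * rng.standard_normal(shifted.shape) + t0 * shifted

rng = np.random.default_rng(0)
target = np.ones((HORIZON, ACT_DIM))  # stand-in for a "good" action plan
plan = sample_actions(rng.standard_normal((HORIZON, ACT_DIM)), target)
# Warm-started re-plan: far fewer integration steps than a cold start.
plan = sample_actions(warm_start(plan, t0=0.7, rng=rng), target, t0=0.7, steps=3)
```

Because the warm-started sample begins close to the previous plan, successive plans stay temporally consistent, and the short integration is what enables fast feedback rates.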