Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

📅 2026-06-02
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing approaches to humanoid motion tracking struggle to generalize zero-shot to unseen motions and tasks because of data scarcity and a trade-off between agility and generalization. This work proposes Humanoid-GPT, a GPT-style Transformer with causal attention, and brings large-scale pretraining to humanoid whole-body control. By combining major motion capture datasets with in-house recordings, the authors build a unified retargeted motion corpus of two billion frames for training. Scaling both data and model capacity lets the model outperform prior methods on highly dynamic, complex motion tracking while showing strong zero-shot transfer.
📝 Abstract
We introduce Humanoid-GPT, a GPT-style Transformer with causal attention trained on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility-generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings. Scaling both data and model capacity yields a single generative Transformer that tracks highly dynamic behaviors while achieving unprecedented zero-shot generalization to unseen motions and control tasks. Extensive experiments and scaling analyses show that our model establishes a new performance frontier, demonstrating robust zero-shot generalization to unseen tasks while simultaneously tracking highly dynamic and complex motions.
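
The abstract names causal attention as the core mechanism, so a minimal sketch of it may help: each frame in a motion sequence attends only to itself and earlier frames. This is a generic single-head NumPy illustration, not the authors' architecture; the function name `causal_self_attention`, the embedding size, the sequence length, and the random weights are all assumptions made for the example.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a sequence of frame embeddings.

    x: (T, d) array, one row per motion frame (hypothetical embedding).
    w_q, w_k, w_v: (d, d) projection matrices.
    Returns the (T, d) outputs and the (T, T) attention weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(x.shape[1])
    # Mask out future frames: position t may only attend to positions <= t.
    future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax along the key axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Toy inputs (assumed sizes): 6 frames with 8-dimensional embeddings.
rng = np.random.default_rng(0)
T, d = 6, 8
frames = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, attn = causal_self_attention(frames, w_q, w_k, w_v)
```

Because of the mask, `attn` is lower triangular, so changing a later frame leaves the outputs for all earlier frames unchanged. That property is what lets a GPT-style model be trained and run autoregressively over time.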
Problem

Research questions and friction points this paper is trying to address.

zero-shot generalization
motion tracking
humanoid control
data scarcity
agility-generalization trade-off
Innovation

Methods, ideas, or system contributions that make the work stand out.

Humanoid-GPT
zero-shot generalization
motion tracking
Transformer
large-scale motion corpus