Robust and Generalized Humanoid Motion Tracking

📅 2026-01-30
🤖 AI Summary
This work addresses drift and failure in humanoid whole-body tracking of noisy or inconsistent reference motions, where closed-loop execution amplifies local defects, particularly in highly dynamic, contact-rich behaviors. The authors propose a dynamics-conditioned command aggregation framework: a causal temporal encoder summarizes recent proprioception, and a multi-head cross-attention command encoder selectively aggregates a context window of motion commands based on the current dynamics. A fall-recovery curriculum with randomized unstable initializations and an annealed upward assistance force further improves robustness and disturbance rejection. The policy requires only about 3.5 hours of motion data, trains end-to-end in a single stage without distillation, and demonstrates zero-shot transfer to unseen motions as well as robust sim-to-real transfer on a physical humanoid robot.

📝 Abstract
Learning a general humanoid whole-body controller is challenging because practical reference motions can exhibit noise and inconsistencies after being transferred to the robot domain, and local defects may be amplified by closed-loop execution, causing drift or failure in highly dynamic and contact-rich behaviors. We propose a dynamics-conditioned command aggregation framework that uses a causal temporal encoder to summarize recent proprioception and a multi-head cross-attention command encoder to selectively aggregate a context window based on the current dynamics. We further integrate a fall recovery curriculum with random unstable initialization and an annealed upward assistance force to improve robustness and disturbance rejection. The resulting policy requires only about 3.5 hours of motion data and supports single-stage end-to-end training without distillation. The proposed method is evaluated under diverse reference inputs and challenging motion regimes, demonstrating zero-shot transfer to unseen motions as well as robust sim-to-real transfer on a physical humanoid robot.
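To make the command-aggregation idea in the abstract concrete, here is a minimal sketch of multi-head scaled dot-product cross-attention in which a dynamics embedding acts as the query and a window of command embeddings supplies the keys and values. This is not the authors' implementation: the function name `aggregate_commands`, the vector sizes, and the omission of learned query/key/value/output projections are all assumptions made to keep the example short.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_commands(dyn_embedding, command_window, num_heads):
    """Fuse a window of command embeddings into a single vector.

    dyn_embedding:  list of d floats, the query (hypothetical summary of
                    recent proprioception from a temporal encoder).
    command_window: list of T vectors of length d (keys and values).
    Learned projections are omitted here (assumption, for brevity).
    Returns (fused vector of length d, per-head attention weights).
    """
    d = len(dyn_embedding)
    if d % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    head_dim = d // num_heads
    fused, all_weights = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q = dyn_embedding[lo:hi]
        # One scaled dot-product score per command in the window.
        scores = [
            sum(qi * ki for qi, ki in zip(q, cmd[lo:hi])) / math.sqrt(head_dim)
            for cmd in command_window
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # Weighted sum of this head's slice of every command vector.
        for j in range(lo, hi):
            fused.append(sum(w * cmd[j] for w, cmd in zip(weights, command_window)))
    return fused, all_weights


# Toy usage: 4-dimensional embeddings, a 3-step command window, 2 heads.
dyn = [1.0, 0.0, 0.0, 1.0]
window = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0]]
fused, weights = aggregate_commands(dyn, window, num_heads=2)
```

Each head attends over the same window independently, so different heads can emphasize different commands depending on the current dynamics embedding. A trained policy would add learned projections and derive the query from the temporal encoder's output.
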
Problem

Research questions and friction points this paper is trying to address.

humanoid motion tracking
robust control
generalization
contact-rich dynamics
reference motion inconsistency
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamics-conditioned command aggregation
causal temporal encoder
multi-head cross-attention
fall recovery curriculum
sim-to-real transfer
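
The fall recovery curriculum listed above combines random unstable initialization with an annealed upward assistance force. The sketch below illustrates one way such a curriculum could be scheduled. The linear annealing schedule, the tilt range, the sampling probability, and the names `assist_force` and `sample_unstable_init` are all assumptions for illustration; the page does not state the actual schedule or parameters.

```python
import random


def assist_force(iteration, anneal_iters, max_force):
    """Upward assistance force in newtons, annealed linearly to zero.

    The linear schedule is an assumption; the source only says the
    force is annealed.
    """
    if anneal_iters <= 0:
        return 0.0
    remaining = max(0.0, 1.0 - iteration / anneal_iters)
    return max_force * remaining


def sample_unstable_init(rng, max_tilt_rad=1.0, fall_prob=0.3):
    """Sample a hypothetical initial base orientation for an episode.

    With probability fall_prob the robot starts tilted away from
    upright; otherwise it starts in the nominal upright pose.
    """
    if rng.random() < fall_prob:
        return {
            "roll": rng.uniform(-max_tilt_rad, max_tilt_rad),
            "pitch": rng.uniform(-max_tilt_rad, max_tilt_rad),
            "unstable": True,
        }
    return {"roll": 0.0, "pitch": 0.0, "unstable": False}


# Toy usage: the assistance fades out while unstable starts keep occurring.
rng = random.Random(0)
for it in (0, 500, 1000):
    state = sample_unstable_init(rng)
    force = assist_force(it, anneal_iters=1000, max_force=200.0)
```

Early in training the assistance force makes recovery from tilted starts easier to discover; once it reaches zero the policy must recover unaided.
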
Authors

Yubiao Ma
Beijing Institute of Technology, Beijing, China.

Han Yu
Unknown affiliation

Jiayin Xie
Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.

Changtai Lv
Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.

Qiang Luo
Principal Investigator, Institute of Science and Technology for Brain-Inspired Intelligence (ISTBI), Fudan University
Computational Psychiatry, NeuroImage, Complex Causal Models

Chi Zhang
Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.

Yunpeng Yin
Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.

Boyang Xing
Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.

Xuemei Ren
Beijing Institute of Technology, Beijing, China.

Dongdong Zheng
Beijing Institute of Technology, Beijing, China; Humanoid Robotics (Shanghai) Co., Ltd., Shanghai 201203, China.