🤖 AI Summary
This work addresses the challenge of achieving both natural gait and stable adaptation in humanoid robots across diverse terrains, which is hindered primarily by gradient interference and by the distribution shift induced by terrain-dependent visual and dynamic variations. The authors propose CoRe-MoE, a two-stage reinforcement learning framework. In the first stage, a base locomotion policy is learned that produces natural walking and running with smooth transitions between them. In the second stage, a terrain-aware Mixture-of-Experts (MoE) branch is added and trained with a contrastive objective so that its gating network captures structured terrain representations; the final action is a weighted fusion of the base gait policy and this branch. The key innovation lies in decoupling gait generation from terrain adaptation and using contrastive learning to shape the MoE gating, thereby promoting expert specialization. In simulation, the method outperforms baselines in success rate, locomotion stability, and multi-terrain adaptability. Zero-shot deployment on a Unitree G1 humanoid robot shows robust walking and running on complex terrains, including stairs, slopes, steps, obstacles, and unstructured outdoor terrain, with accurate foothold placement and dynamic stability under external disturbances.
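The weighted fusion mentioned above can be illustrated with a minimal sketch. The exact fusion rule is not given here, so the code assumes a softmax-gated sum over expert outputs followed by a convex combination with the base action. The names `moe_action` and `fuse_actions` and the scalar weight `alpha` are hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over gating logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_action(gate_logits: Sequence[float],
               expert_actions: Sequence[Sequence[float]]) -> List[float]:
    """Gate-weighted sum of the expert action vectors (one vector per expert)."""
    weights = softmax(gate_logits)
    dim = len(expert_actions[0])
    return [
        sum(w * action[i] for w, action in zip(weights, expert_actions))
        for i in range(dim)
    ]


def fuse_actions(base_action: Sequence[float],
                 terrain_action: Sequence[float],
                 alpha: float) -> List[float]:
    """Assumed fusion rule: convex combination of base and terrain-branch actions.

    alpha = 0 keeps the base gait policy unchanged; alpha = 1 uses only the
    terrain-aware branch. The real method may weight the two differently.
    """
    return [(1.0 - alpha) * b + alpha * t
            for b, t in zip(base_action, terrain_action)]


# Two hypothetical experts with equal gating logits, then fused with the base action.
terrain = moe_action([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
final = fuse_actions([1.0, 1.0], terrain, alpha=0.5)
```

Under this assumption, a small `alpha` keeps the output close to the base gait policy, which is one way to preserve a stable locomotion pattern while the terrain branch adds corrections.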
📝 Abstract
Humans primarily rely on walking and running to traverse complex terrains, without resorting to unnecessarily complex motion patterns. Similarly, humanoid robots should achieve smooth transitions between walking and running while maintaining natural and stable locomotion. However, unifying gait transition and multi-terrain adaptation within a single policy remains challenging due to gradient interference and the distribution shift induced by terrain-dependent visual and dynamic variations. Although Mixture-of-Experts (MoE) architectures can alleviate multi-skill interference, naive joint training often fails to yield clear expert specialization, limiting their effectiveness. To address these challenges, we propose CoRe-MoE, a two-stage reinforcement learning framework that decouples gait generation from terrain adaptation. In the first stage, a stable locomotion policy is learned to produce natural walking and running behaviors with smooth transitions. In the second stage, a terrain-aware MoE branch is introduced and trained with a contrastive objective to shape the gating network, enabling it to capture structured terrain representations and promote expert specialization. The final action is obtained via weighted fusion of the base gait policy and the terrain-aware branch, allowing the policy to preserve stable locomotion patterns while adapting to complex terrains. Extensive simulation results demonstrate that the proposed method outperforms baseline approaches in terms of success rate, locomotion stability, and multi-terrain adaptability. Furthermore, zero-shot deployment on a Unitree G1 humanoid robot validates the effectiveness of our framework, achieving robust walking and running across stairs, slopes, steps, obstacles, and unstructured outdoor terrains, while maintaining accurate foothold placement and dynamic stability under external disturbances.
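To make the contrastive shaping of the gating network more concrete, the following is a minimal sketch of one possible objective. The abstract does not specify the loss, so this assumes a supervised contrastive (InfoNCE-style) loss in which gating embeddings from the same terrain type are treated as positives. The function name, the `terrain_ids` labels, and the `temperature` value are illustrative assumptions, not the authors' formulation.

```python
import math
from typing import List, Sequence


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings: Sequence[Sequence[float]],
                                terrain_ids: Sequence[int],
                                temperature: float = 0.1) -> float:
    """Assumed loss: pull same-terrain gating embeddings together, push others apart."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and terrain_ids[j] == terrain_ids[i]]
        if not positives:
            continue  # an anchor with no same-terrain sample contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        # Average negative log-probability of selecting each positive.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Hypothetical 2-D gating embeddings for two terrain types (0 and 1).
clustered = supervised_contrastive_loss(
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], [0, 0, 1, 1])
mixed = supervised_contrastive_loss(
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]], [0, 1, 0, 1])
```

In this sketch the loss is lower when embeddings of the same terrain type cluster together (`clustered`) than when they do not (`mixed`), which is the property a gating network would need for different experts to specialize on different terrains.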