🤖 AI Summary
Legged robots exhibit limited generalization and struggle to integrate and transfer locomotion skills across diverse, unstructured outdoor terrains. To address this, we propose a two-stage agile locomotion control framework combining multi-expert knowledge distillation with reinforcement learning fine-tuning. Our method uniquely integrates DAgger-based behavioral-cloning distillation with PPO fine-tuning, enabling continuous adaptation to real-world 3D-scanned terrain geometries. It is the first to realize an end-to-end perception–locomotion policy directly mapping depth images to motor commands across heterogeneous terrains. Evaluated on the ANYmal D quadruped platform, our approach significantly improves agility and robustness in complex outdoor environments and markedly enhances cross-terrain skill integration. This work establishes a new state-of-the-art benchmark for quadrupedal locomotion control in unstructured outdoor settings.
📝 Abstract
Legged robots are well-suited for navigating terrains inaccessible to wheeled robots, making them ideal for applications in search and rescue or space exploration. However, current control methods often struggle to generalize across diverse, unstructured environments. This paper introduces a novel framework for agile locomotion of legged robots by combining multi-expert distillation with reinforcement learning (RL) fine-tuning to achieve robust generalization. Initially, terrain-specific expert policies are trained to develop specialized locomotion skills. These policies are then distilled into a unified foundation policy via the DAgger algorithm. The distilled policy is subsequently fine-tuned using RL on a broader terrain set, including real-world 3D scans. The framework allows further adaptation to new terrains through repeated fine-tuning. The proposed policy leverages depth images as exteroceptive inputs, enabling robust navigation across diverse, unstructured terrains. Experimental results demonstrate significant performance improvements over existing methods in synthesizing multi-terrain skills into a single controller. Deployment on the ANYmal D robot validates the policy's ability to navigate complex environments with agility and robustness, setting a new benchmark for legged robot locomotion.
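The two-stage pipeline the abstract describes (terrain-specific experts distilled into one student via DAgger, then RL fine-tuning) can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the 1-D observations, the linear per-terrain student, and the expert gains stand in for the paper's actual depth-image policies and PPO fine-tuning stage, which is only indicated as a stub.

```python
# Toy sketch of stage 1 (DAgger-style distillation) from the abstract.
# Experts, dynamics, and the linear student are illustrative assumptions,
# not the paper's implementation; stage 2 (PPO fine-tuning) is stubbed.
import random

random.seed(0)

# Terrain-specific "experts": each maps a scalar observation to an action.
EXPERTS = {
    "stairs": lambda obs: 0.5 * obs,
    "rough":  lambda obs: -0.3 * obs,
}

class LinearStudent:
    """Per-terrain linear policy a = w * obs, fit by least squares."""
    def __init__(self):
        self.w = {terrain: 0.0 for terrain in EXPERTS}

    def act(self, terrain, obs):
        return self.w[terrain] * obs

    def fit(self, dataset):
        # dataset: list of (terrain, obs, expert_action) tuples.
        for terrain in self.w:
            pts = [(o, a) for t, o, a in dataset if t == terrain]
            num = sum(o * a for o, a in pts)
            den = sum(o * o for o, _ in pts) or 1.0
            self.w[terrain] = num / den

def dagger_distill(student, iters=5, steps=20):
    """DAgger: roll out the *student*, relabel visited states with the
    expert, aggregate all data, and refit the student each iteration."""
    data = []
    for _ in range(iters):
        for terrain, expert in EXPERTS.items():
            obs = random.uniform(-1.0, 1.0)
            for _ in range(steps):
                data.append((terrain, obs, expert(obs)))      # expert label
                obs = obs + 0.1 * student.act(terrain, obs)   # student drives
        student.fit(data)
    return student

def rl_finetune(student):
    """Stage 2 placeholder: the paper fine-tunes the distilled policy with
    RL (PPO) on a broader terrain set, including real-world 3D scans."""
    return student

student = rl_finetune(dagger_distill(LinearStudent()))
```

Because the student (not the expert) generates the visited states, the aggregated dataset covers the states the distilled policy actually reaches, which is the core distribution-matching idea behind DAgger; in this linear toy the student recovers each expert's gain.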