Parkour in the Wild: Learning a General and Extensible Agile Locomotion Policy Using Multi-expert Distillation and RL Fine-tuning

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Legged robots exhibit limited generalization and struggle to integrate and transfer locomotion skills across diverse, unstructured outdoor terrains. To address this, we propose a two-stage agile locomotion control framework combining multi-expert knowledge distillation with reinforcement learning fine-tuning. Our method uniquely integrates DAgger-based behavioral-cloning distillation with PPO fine-tuning, enabling continuous adaptation to real-world 3D-scanned terrain geometries. It is the first to realize an end-to-end perception–locomotion policy directly mapping depth images to motor commands across heterogeneous terrains. Evaluated on the ANYmal D quadruped platform, our approach significantly improves agility and robustness in complex outdoor environments and markedly enhances cross-terrain skill integration. This work establishes a new state-of-the-art benchmark for quadrupedal locomotion control in unstructured, in-the-wild settings.

📝 Abstract
Legged robots are well-suited for navigating terrains inaccessible to wheeled robots, making them ideal for applications in search and rescue or space exploration. However, current control methods often struggle to generalize across diverse, unstructured environments. This paper introduces a novel framework for agile locomotion of legged robots by combining multi-expert distillation with reinforcement learning (RL) fine-tuning to achieve robust generalization. Initially, terrain-specific expert policies are trained to develop specialized locomotion skills. These policies are then distilled into a unified foundation policy via the DAgger algorithm. The distilled policy is subsequently fine-tuned using RL on a broader terrain set, including real-world 3D scans. The framework allows further adaptation to new terrains through repeated fine-tuning. The proposed policy leverages depth images as exteroceptive inputs, enabling robust navigation across diverse, unstructured terrains. Experimental results demonstrate significant performance improvements over existing methods in synthesizing multi-terrain skills into a single controller. Deployment on the ANYmal D robot validates the policy's ability to navigate complex environments with agility and robustness, setting a new benchmark for legged robot locomotion.
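The two-stage pipeline described in the abstract (terrain-specific experts, DAgger distillation into a single student, then RL fine-tuning) can be sketched at toy scale. Everything below is illustrative, not the paper's implementation: linear maps stand in for the expert and student networks, random vectors stand in for proprioceptive/depth observations, and the PPO fine-tuning stage is only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, ACT_DIM = 8, 4  # stand-ins for depth+proprioception features and joint commands

# Hypothetical terrain-specific experts (the paper trains these with RL);
# here each is just a fixed linear map from observation to action.
experts = {t: rng.normal(size=(ACT_DIM, OBS_DIM)) for t in ("stairs", "gap", "slope")}

# Student "foundation" policy, also linear for this sketch.
W = np.zeros((ACT_DIM, OBS_DIM))

# DAgger-style distillation: visit states under the student, ask the
# matching expert to label them, aggregate the dataset, and refit supervised.
dataset_X, dataset_Y = [], []
for _ in range(20):                        # DAgger iterations
    for terrain, expert in experts.items():
        obs = rng.normal(size=OBS_DIM)     # state visited by the student
        dataset_X.append(obs)
        dataset_Y.append(expert @ obs)     # expert action label
    X, Y = np.stack(dataset_X), np.stack(dataset_Y)
    sol, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least-squares refit
    W = sol.T                              # student weights (ACT_DIM, OBS_DIM)

# Stage two in the paper fine-tunes the distilled policy with PPO on a
# broader terrain set (including real-world 3D scans); omitted here.
```

The key DAgger property the sketch preserves is that labels are collected on states the *student* visits, not on expert rollouts, which is what lets the distilled policy recover from its own mistakes.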
Problem

Research questions and friction points this paper is trying to address.

Generalizing legged robot control across diverse unstructured terrains
Combining multi-expert distillation with RL for agile locomotion
Enabling robust navigation using depth images in real-world environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-expert distillation for unified locomotion policy
RL fine-tuning on diverse terrains for generalization
Depth images as inputs for robust navigation
N. Rudin
Robotic Systems Lab, ETH Zurich; NVIDIA Switzerland

Junzhe He
ETH Zurich
Reinforcement Learning, Robot Learning

Joshua Aurand
Robotic Systems Lab, ETH Zurich

Marco Hutter
Professor of Robotics, ETH Zurich
Legged Robotics, Robotics, Control