Parkour in the Wild: Learning a General and Extensible Agile Locomotion Policy Using Multi-expert Distillation and RL Fine-tuning

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Legged robots exhibit limited generalization and struggle to integrate and transfer locomotion skills across diverse, unstructured outdoor terrains. To address this, we propose a two-stage agile locomotion control framework combining multi-expert knowledge distillation with reinforcement learning fine-tuning. Our method uniquely integrates DAgger-based behavioral-cloning distillation with PPO fine-tuning, enabling continuous adaptation to real-world 3D-scanned terrain geometries. It is the first to realize an end-to-end perception–locomotion policy directly mapping depth images to motor commands across heterogeneous terrains. Evaluated on the ANYmal D quadruped platform, our approach significantly improves agility and robustness in complex outdoor environments and markedly enhances cross-terrain skill integration. This work establishes a new state-of-the-art benchmark for quadrupedal locomotion control in unstructured, in-the-wild settings.

📝 Abstract
Legged robots are well-suited for navigating terrains inaccessible to wheeled robots, making them ideal for applications in search and rescue or space exploration. However, current control methods often struggle to generalize across diverse, unstructured environments. This paper introduces a novel framework for agile locomotion of legged robots by combining multi-expert distillation with reinforcement learning (RL) fine-tuning to achieve robust generalization. Initially, terrain-specific expert policies are trained to develop specialized locomotion skills. These policies are then distilled into a unified foundation policy via the DAgger algorithm. The distilled policy is subsequently fine-tuned using RL on a broader terrain set, including real-world 3D scans. The framework allows further adaptation to new terrains through repeated fine-tuning. The proposed policy leverages depth images as exteroceptive inputs, enabling robust navigation across diverse, unstructured terrains. Experimental results demonstrate significant performance improvements over existing methods in synthesizing multi-terrain skills into a single controller. Deployment on the ANYmal D robot validates the policy's ability to navigate complex environments with agility and robustness, setting a new benchmark for legged robot locomotion.
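The two-stage pipeline described in the abstract (terrain-specific experts, DAgger distillation into a single student, then RL fine-tuning) can be sketched at toy scale. Everything below is illustrative, not the paper's implementation: linear maps stand in for the expert and student networks, random vectors stand in for proprioceptive/depth observations, and the PPO fine-tuning stage is only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, ACT_DIM = 8, 4  # stand-ins for depth+proprioception features and joint commands

# Hypothetical terrain-specific experts (the paper trains these with RL);
# here each is just a fixed linear map from observation to action.
experts = {t: rng.normal(size=(ACT_DIM, OBS_DIM)) for t in ("stairs", "gap", "slope")}

# Student "foundation" policy, also linear for this sketch.
W = np.zeros((ACT_DIM, OBS_DIM))

# DAgger-style distillation: visit states under the student, ask the
# matching expert to label them, aggregate the dataset, and refit supervised.
dataset_X, dataset_Y = [], []
for _ in range(20):                        # DAgger iterations
    for terrain, expert in experts.items():
        obs = rng.normal(size=OBS_DIM)     # state visited by the student
        dataset_X.append(obs)
        dataset_Y.append(expert @ obs)     # expert action label
    X, Y = np.stack(dataset_X), np.stack(dataset_Y)
    sol, *_ = np.linalg.lstsq(X, Y, rcond=None)  # least-squares refit
    W = sol.T                              # student weights (ACT_DIM, OBS_DIM)

# Stage two in the paper fine-tunes the distilled policy with PPO on a
# broader terrain set (including real-world 3D scans); omitted here.
```

The key DAgger property the sketch preserves is that labels are collected on states the *student* visits, not on expert rollouts, which is what lets the distilled policy recover from its own mistakes.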
Problem

Research questions and friction points this paper is trying to address.

Generalizing legged robot control across diverse unstructured terrains
Combining multi-expert distillation with RL for agile locomotion
Enabling robust navigation using depth images in real-world environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-expert distillation for unified locomotion policy
RL fine-tuning on diverse terrains for generalization
Depth images as inputs for robust navigation
N. Rudin
Robotic Systems Lab, ETH Zurich; NVIDIA Switzerland

Junzhe He
ETH Zurich
Reinforcement Learning, Robot Learning

Joshua Aurand
Robotic Systems Lab, ETH Zurich

Marco Hutter
Professor of Robotics, ETH Zurich
Legged Robotics, Robotics, Control