Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing

📅 2025-05-18
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Small quadrupedal robots struggle to achieve real-time, robust locomotion over unstructured, uneven terrain under severe computational constraints. Method: This paper proposes an end-to-end control framework that integrates real-time elevation mapping with deep reinforcement learning. A policy network and a state estimator are trained jointly, and a carefully positioned lightweight time-of-flight (ToF) sensor keeps terrain perception robust even without visual-inertial odometry (VIO), reducing reliance on that computationally intensive component. The method combines proximal policy optimization (PPO), multi-source odometry fusion (VIO/IMU/ToF), real-time elevation mapping, and low-cost depth-sensor selection and calibration. Results: Experiments demonstrate a 100% success rate on 17.5 cm steps and an 80% success rate on 22.5 cm steps; the controller accurately tracks forward and yaw velocity commands up to 1.0 m/s and 1.5 rad/s, respectively. The implementation is open-sourced.
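The joint training described above, a PPO policy optimized alongside a supervised state estimator, can be sketched numerically. The function names, the MSE estimator loss, and the loss weighting below are illustrative assumptions for a minimal sketch, not the paper's exact formulation:

```python
import numpy as np

def ppo_clipped_loss(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate objective (written as a loss to minimize).

    ratio     : pi_new(a|s) / pi_old(a|s) per sample
    advantage : estimated advantage per sample
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -np.minimum(unclipped, clipped).mean()

def estimator_loss(predicted_state, target_state):
    """Supervised MSE loss for the concurrently trained state estimator,
    e.g. regressing the base velocity that serves as an odometry source
    for elevation mapping (simulation provides ground truth)."""
    return np.mean((predicted_state - target_state) ** 2)

def joint_loss(ratio, advantage, pred, target, est_weight=1.0):
    """Combined objective: policy surrogate plus estimator regression.
    est_weight trades off the two terms (value chosen for illustration)."""
    return ppo_clipped_loss(ratio, advantage) + est_weight * estimator_loss(pred, target)

# Illustrative mini-batch
ratio = np.array([0.9, 1.3, 1.0])     # policy probability ratios
adv = np.array([1.0, 1.0, -0.5])      # advantages
pred = np.array([0.8, 0.0])           # estimator output (e.g. base velocity)
target = np.array([1.0, 0.0])         # ground-truth state from simulation
loss = joint_loss(ratio, adv, pred, target)
```

In an actual training loop this scalar would be backpropagated through both networks each PPO epoch, so the estimator improves on the same rollout data that trains the policy.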

πŸ“ Abstract
Compact quadrupedal robots are proving increasingly suitable for deployment in real-world scenarios. Their smaller size fosters easy integration into human environments. Nevertheless, real-time locomotion on uneven terrains remains challenging, particularly due to the high computational demands of terrain perception. This paper presents a robust reinforcement learning-based exteroceptive locomotion controller for resource-constrained small-scale quadrupeds in challenging terrains, which exploits real-time elevation mapping, supported by a careful depth sensor selection. We concurrently train both a policy and a state estimator, which together provide an odometry source for elevation mapping, optionally fused with visual-inertial odometry (VIO). We demonstrate the importance of positioning an additional time-of-flight sensor for maintaining robustness even without VIO, thus having the potential to free up computational resources. We experimentally demonstrate that the proposed controller can flawlessly traverse steps up to 17.5 cm in height and achieve an 80% success rate on 22.5 cm steps, both with and without VIO. The proposed controller also achieves accurate forward and yaw velocity tracking of up to 1.0 m/s and 1.5 rad/s respectively. We open-source our training code at github.com/ETH-PBL/elmap-rl-controller.
Problem

Research questions and friction points this paper is trying to address.

Enabling real-time locomotion on uneven terrains for small quadrupeds
Reducing computational demands of terrain perception in robots
Improving robustness without visual-inertial odometry (VIO)
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning-based locomotion for quadrupeds
Real-time elevation mapping with depth sensors
Policy and state estimator trained concurrently
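The real-time elevation mapping that these contributions build on can be sketched as a 2.5D height grid updated from range points transformed by the estimated odometry pose. The grid size, resolution, and max-height fusion rule below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def update_elevation_map(height_map, points_body, position, yaw,
                         resolution=0.05):
    """Fuse body-frame range points (e.g. from a downward-facing ToF
    sensor) into a 2.5D elevation grid using the estimated odometry pose.

    height_map  : (H, W) grid of terrain heights, NaN = unobserved
    points_body : (N, 3) points in the robot body frame
    position    : (3,) estimated base position in the world frame
    yaw         : estimated base yaw in radians
    """
    # Rotate about z by the estimated yaw, then translate.
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    points_world = points_body @ rot.T + position

    h, w = height_map.shape
    # World origin maps to the grid center.
    cols = np.round(points_world[:, 0] / resolution).astype(int) + w // 2
    rows = np.round(points_world[:, 1] / resolution).astype(int) + h // 2
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    for r, col, z in zip(rows[valid], cols[valid], points_world[valid, 2]):
        # Keep the highest observation per cell (conservative for stepping).
        if np.isnan(height_map[r, col]) or z > height_map[r, col]:
            height_map[r, col] = z
    return height_map

# Illustrative: one return 0.2 m ahead of the robot at 0.175 m height
grid = np.full((40, 40), np.nan)
pts = np.array([[0.2, 0.0, 0.175]])
grid = update_elevation_map(grid, pts, np.zeros(3), 0.0)
```

A robot-centric map like this is cheap to maintain on an embedded processor, and the policy can consume height samples from it directly as exteroceptive observations.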
Davide Plozza
ETH Zürich
Legged Robotics, Autonomous Navigation, TinyML
Patricia Apostol
Center for Project-Based Learning, ETH Zurich, Zurich, Switzerland
Paul Joseph
Center for Project-Based Learning, ETH Zurich, Zurich, Switzerland
Simon Schläpfer
Center for Project-Based Learning, ETH Zurich, Zurich, Switzerland
Michele Magno
ETH Zurich
Wireless Sensor Networks, Smart Sensors and Internet of Things, Wake-up Radio, Power Management, Energy Harvesters