🤖 AI Summary
This work addresses the poor stability and non-robust trajectory generation that limit high-speed agile locomotion in quadrupedal and bipedal robots. We propose the History-Aware Curriculum Learning (HACL) framework, the first to incorporate RNN-based hidden-state modeling into reinforcement learning for this setting: the hidden state explicitly encodes temporal dependencies between joint velocity commands and the observed linear/angular velocity rewards. This enables dynamic, adaptive curriculum scheduling, with training in simulation and validation on real hardware. Evaluated on the MIT Mini Cheetah and Unitree Go1/Go2 platforms, HACL drives the Go1 to a peak forward speed of 6.7 m/s (against a 7 m/s command velocity), outperforming state-of-the-art methods by roughly 20%. Both simulation and physical experiments demonstrate strong generalization. Our core contributions are: (i) a time-aware historical modeling mechanism that captures critical temporal dynamics in locomotion control, and (ii) the HACL training paradigm, which significantly improves stability and efficiency in high-speed robotic locomotion.
📝 Abstract
We address the problem of agile and rapid locomotion, a key capability of quadrupedal and bipedal robots. We present a new algorithm that maintains stability and generates high-speed trajectories by accounting for the temporal aspect of locomotion. Our formulation takes past information into account via a novel history-aware curriculum learning (HACL) algorithm. We model the history of joint velocity commands with respect to the observed linear and angular rewards using a recurrent neural network (RNN). The hidden state helps the curriculum learn the relationship between the forward linear and angular velocity commands and the rewards over a given time-step. We validate our approach on the MIT Mini Cheetah, Unitree Go1, and Go2 robots in simulation, and on a Unitree Go1 robot in real-world scenarios. In practice, HACL achieves a peak forward velocity of 6.7 m/s for a commanded velocity of 7 m/s and outperforms prior locomotion algorithms by nearly 20%.
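To make the history-aware mechanism concrete, here is a minimal sketch of the core idea: a vanilla RNN encodes a history of (command, reward) pairs into a hidden state, and that hidden state drives a bounded adjustment of the next curriculum command velocity. All names, dimensions, and the output mapping here are illustrative assumptions, not the paper's actual architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: input = [v_cmd, w_cmd, r_lin, r_ang], hidden = 8.
IN, H = 4, 8
Wx = rng.normal(scale=0.1, size=(H, IN))   # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(H, H))    # hidden-to-hidden (recurrent) weights
b = np.zeros(H)
w_out = rng.normal(scale=0.1, size=H)      # hidden state -> curriculum increment

def rnn_step(h, x):
    """Vanilla RNN cell: h_t = tanh(Wx x_t + Wh h_{t-1} + b)."""
    return np.tanh(Wx @ x + Wh @ h + b)

def curriculum_increment(history):
    """Encode the (command, reward) history into a hidden state and
    propose a bounded command-velocity increment for the next stage."""
    h = np.zeros(H)
    for x in history:
        h = rnn_step(h, np.asarray(x, dtype=float))
    # Squash to a small step, here at most +/- 0.5 m/s per update (assumed bound).
    return 0.5 * np.tanh(w_out @ h)

# Toy history: commands ramp up while tracking rewards stay reasonably high.
history = [(1.0, 0.0, 0.90, 0.80),
           (2.0, 0.1, 0.85, 0.80),
           (3.0, 0.0, 0.80, 0.75)]
delta_v = curriculum_increment(history)
print(delta_v)  # a small increment bounded to [-0.5, 0.5] m/s
```

The point of the recurrence is that the curriculum step depends on the whole command/reward trajectory, not just the most recent reward, which is what distinguishes this from a memoryless (per-step) curriculum rule.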