Cooperative Long Rope Skipping via Multi-Agent Reinforcement Learning

📅 2026-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses cooperative long rope skipping with multiple humanoid robots, where two rope turners must swing the rope together while adapting to a jumper whose rhythm varies. The authors propose Marope, a hierarchical reinforcement learning framework: at the lower level, multi-agent reinforcement learning (MARL) trains decentralized rope manipulation policies, while at the upper level a centralized scheduling policy coordinates the execution of those lower-level policies. To improve generalization across jumper styles, diverse jumping policies are incorporated into cooperative game training. The work targets multi-participant coordination, which the authors describe as largely unexplored in humanoid sports. Evaluated on Unitree G1 humanoid robots in both simulation and real-world settings, Marope is reported to outperform various baselines, with more efficient and stable rope manipulation and more robust cooperation with varied jumpers.

📝 Abstract

Humans exhibit remarkable motor agility, enabling a wide range of dynamic skills such as running and jumping, which highlights the great potential of humanoid robots for athletic locomotion. Among athletic sports, long rope skipping requires two rope turners to cooperatively swing the rope while adapting to a player under different jumping rhythms, making it a meaningful yet challenging task for humanoid robots. Although existing methods for humanoid sports have achieved success in single-agent and interaction-free settings, such as running, dancing, and parkour, task scenarios that require precise coordination among multiple participants remain largely unexplored. To this end, we propose Marope, a multi-agent reinforcement learning (MARL) framework for cooperative long rope skipping with multiple humanoid robots. Specifically, Marope adopts a hierarchical reinforcement learning framework for policy training. At the lower level, it learns decentralized rope manipulation policies through MARL, while at the upper level, a centralized scheduling policy is trained to coordinate the execution of the lower-level policies. To improve generalization across different player behavioral styles, Marope further incorporates diverse jumping policies into cooperative game training. We evaluate our approach on Unitree G1 humanoid robots in both simulation and real-world settings. Experimental results demonstrate that Marope outperforms various baselines, achieving more efficient and stable rope manipulation as well as more robust and adaptable cooperation with varied players.
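The two-level structure described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the class names (`Scheduler`, `TurnerPolicy`, `Command`), the phase convention, and the closed-form rules standing in for learned policies are all assumptions made for illustration. A centralized scheduler turns the jumper's rhythm into a shared command, and each turner then acts on its own from that command.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical high-level command shared with both turners."""
    rope_frequency_hz: float  # target rope revolutions per second
    phase_rad: float          # target rope phase at time t = 0


class Scheduler:
    """Stand-in for the centralized upper-level policy (assumed rule-based here)."""

    def command(self, jump_period_s: float, mid_air_time_s: float) -> Command:
        # One rope revolution per jump.
        freq = 1.0 / jump_period_s
        # Assumed convention: phase 0 means the rope passes under the feet.
        # Pick the initial phase so the rope is at phase 0 when the jumper
        # is at the middle of the flight phase.
        phase0 = (-2.0 * math.pi * freq * mid_air_time_s) % (2.0 * math.pi)
        return Command(rope_frequency_hz=freq, phase_rad=phase0)


class TurnerPolicy:
    """Stand-in for one decentralized lower-level policy.

    A learned policy would map local observations to joint targets; this toy
    version only returns the target rope angle for the turner's hand.
    """

    def __init__(self, name: str):
        self.name = name

    def act(self, t_s: float, cmd: Command) -> float:
        angle = 2.0 * math.pi * cmd.rope_frequency_hz * t_s + cmd.phase_rad
        return angle % (2.0 * math.pi)


if __name__ == "__main__":
    scheduler = Scheduler()
    turners = [TurnerPolicy("left"), TurnerPolicy("right")]
    # A jumper with a 0.5 s jump period who is mid-air at t = 0.25 s.
    cmd = scheduler.command(jump_period_s=0.5, mid_air_time_s=0.25)
    for turner in turners:
        print(turner.name, round(turner.act(0.25, cmd), 6))
```

In this sketch the turners never exchange information with each other. They stay synchronized only because both receive the same command, which is one simple way to read "decentralized execution under centralized scheduling". Calling `scheduler.command` with several different `jump_period_s` values would loosely mimic exposure to jumpers with different rhythms.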

Problem

Research questions and friction points this paper is trying to address.

cooperative long rope skipping

multi-agent coordination

humanoid robots

athletic locomotion

multi-participant dynamic task

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent reinforcement learning

hierarchical reinforcement learning

humanoid robot coordination