SIMPLE: Simulation-Based Policy Learning and Evaluation for Humanoid Loco-manipulation

📅 2026-06-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of scalable and reproducible simulation benchmarks for whole-body loco-manipulation in humanoid robots. It presents SIMPLE, a unified simulation platform that couples MuJoCo's contact-rich dynamics with IsaacSim's photorealistic rendering and provides 60 whole-body tasks, 50 indoor scenes, and over 1,000 object assets. For scalable data collection, the platform offers two pipelines: automated trajectory generation via motion planning and a low-latency VR teleoperation interface. Benchmarking lightweight imitation networks, large vision-language-action (VLA) models, and world action models (WAMs) in SIMPLE shows a strong correlation between policy performance in simulation and in the real world, and policies trained on data collected in SIMPLE transfer zero-shot to physical humanoid robots under similar settings.
📝 Abstract
Humanoid foundation models are advancing faster than we can evaluate them. While real-world testing is expensive and difficult to reproduce, existing simulation benchmarks focus primarily on table-top or wheeled robots. A scalable and reproducible benchmark for whole-body humanoid loco-manipulation remains an open problem. To this end, we present SIMPLE, a unified simulation testbed for humanoid policy learning and evaluation. SIMPLE couples the accurate contact-rich dynamics of MuJoCo with the photorealistic rendering of IsaacSim. It provides a large-scale environment comprising 60 diverse whole-body tasks, 50 indoor scenes, and over 1,000 object assets. To facilitate scalable data collection, the framework integrates two data generation pipelines: automated trajectory generation via motion planning and a low-latency VR teleoperation interface. We further integrate and benchmark mainstream humanoid policies at scale in SIMPLE, including lightweight imitation networks, large vision-language-action (VLA) models, and recent world action models (WAMs). Our experiments reveal a strong correlation between policy performance in simulation and the real world. Furthermore, we demonstrate that policies trained on data collected in SIMPLE can be transferred zero-shot to physical humanoid robots under similar settings, providing a robust and reproducible foundation for humanoid robotics research.
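The abstract reports a strong correlation between simulated and real-world policy performance but does not say which statistic is used. As a minimal sketch, assuming a Pearson correlation over per-policy success rates, such a check could look like the following. The function name `pearson_r` and the success-rate values are hypothetical placeholders for illustration, not results from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder success rates, one entry per policy.
# These are NOT numbers reported in the paper.
sim_success = [0.2, 0.4, 0.6, 0.8]
real_success = [0.1, 0.3, 0.5, 0.7]

r = pearson_r(sim_success, real_success)
print(f"Pearson r between sim and real success rates: {r:.3f}")
```

A value of r near 1 would mean that policies ranking higher in simulation also tend to rank higher on the physical robot, which is the property that makes a simulation benchmark useful as a proxy for real-world evaluation.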
Problem

Research questions and friction points this paper is trying to address.

humanoid
loco-manipulation
benchmark
simulation
evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

humanoid locomotion
simulation benchmark
policy transfer
VR teleoperation
vision-language-action models
🔎 Similar Papers
2024-07-16 · Neural Information Processing Systems · Citations: 16
Songlin Wei
University of Southern California, (Previously) Peking University
Robotics, 3D Vision
Zhenhao Ni
USC Physical Superintelligence (PSI) Lab
Jie Liu
USC Physical Superintelligence (PSI) Lab
Zhenyu Zhao
USC Physical Superintelligence (PSI) Lab
Junjie Ye
Ph.D. Student at University of Southern California
Computer Vision, Robotics
Hongyi Jing
USC Physical Superintelligence (PSI) Lab
Junkai Xia
USC Physical Superintelligence (PSI) Lab
Xiawei Liu
USC Physical Superintelligence (PSI) Lab
Michael Leong
USC Physical Superintelligence (PSI) Lab
Liang Heng
USC Physical Superintelligence (PSI) Lab
Di Huang
USC Physical Superintelligence (PSI) Lab
Yue Wang
USC
Computer Vision, Robotics