UMI-Bench 1.0: An Open and Reproducible Real-World Benchmark for Tabletop Robotic Manipulation with UMI Data

📅 2026-06-08
🤖 AI Summary
This work addresses the absence of standardized, reproducible benchmarks for evaluating Universal Manipulation Interface (UMI)-style policies on real robots. It introduces a physical-world evaluation platform designed specifically for UMI policies, which the authors believe to be the first of its kind. The platform standardizes the entire pipeline, from data collection to deployment evaluation, through unified protocols for data acquisition, automated scene resetting, policy execution, and structured logging. Built on the UMI data paradigm, it couples wrist-mounted visual observations with canonical action representations, enabling open, auditable, and quantitative assessment of policy generalization and reliability in tabletop manipulation tasks.
📝 Abstract
Real-robot evaluation is essential for understanding whether learned manipulation policies can operate reliably outside curated demonstrations. This need is particularly pressing for Universal Manipulation Interface (UMI)-style policies, whose performance depends on the coupling between wrist-view observations, action representation, data collection, and physical deployment. Existing real-world benchmarks have made important progress, but they are not designed around this UMI data-to-deployment setting. We present UMI-Bench 1.0, a local-first real-robot benchmark for standardized evaluation of UMI-style manipulation policies. To the best of our knowledge, this is the first benchmark dedicated to real-world evaluation of UMI-based manipulation models. UMI-Bench aligns data collection, scene reset, policy execution, result logging, and task-factor analysis within a unified protocol. By making the full evaluation process reproducible and auditable, UMI-Bench provides a practical testbed for measuring how UMI-trained policies generalize to real physical manipulation.
Problem

Research questions and friction points this paper is trying to address.

UMI
robotic manipulation
real-world benchmark
policy evaluation
generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

UMI-Bench
real-world benchmark
robotic manipulation
reproducible evaluation
Universal Manipulation Interface
Shi Jin
Soochow University
Yuntian Wang
Soochow University
Yuhui Duan
Soochow University
Di Wu
Soochow University
Gaoqi Dong
Lumos Robotics
Xiaohang Liu
Lumos Robotics
Xiaotong Li
Peking University
Multimodal LLM, Foundation Model, Transfer Learning
Hongfei Jia
Lumos Robotics
Zehao Zhang
Lumos Robotics
Tianyu Wang
Fudan University
Machine Learning, Optimization
Zhongjie Jia
Shanghai Jiao Tong University
Yuanqi Yao
INSAIT
Robotics, Manipulation
Chenjia Bai
Institute of Artificial Intelligence, China Telecom (TeleAI)
Reinforcement Learning, Robotics, Embodied AI
Zhaxizhuoma
Shanghai Jiao Tong University
Siao Liu
Soochow University
Nieqing Cao
Assistant Professor, Xi'an Jiaotong-Liverpool University
AI/ML in Smart Manufacturing, Robotics
Jin Wang
Soochow University
Chao Yu
Lumos Robotics
Yan Ding
Lumos Robotics