AIRoA MoMa Dataset: A Large-Scale Hierarchical Dataset for Mobile Manipulation

📅 2025-09-29
🤖 AI Summary
Existing robot datasets lack synchronized force-torque sensing, hierarchical task annotations, and explicit failure logging, hindering research on natural-language-driven, long-horizon, contact-rich mobile manipulation. To address this, we introduce the AIRoA MoMa Dataset, a real-world multimodal robot dataset collected with the Human Support Robot (HSR) platform. It synchronously captures RGB video, joint states, six-axis wrist force-torque signals, and internal system states. Crucially, we propose a two-level annotation scheme of sub-goals paired with primitive actions, and systematically log both successful and failed execution episodes. All data are standardized to the LeRobot v2.1 format. The released dataset comprises 25,469 episodes (approx. 94 hours), bridging critical gaps in contact-aware and hierarchically structured robotic data and establishing a benchmark for Vision-Language-Action models under physical interaction.

📝 Abstract
As robots transition from controlled settings to unstructured human environments, building generalist agents that can reliably follow natural language instructions remains a central challenge. Progress in robust mobile manipulation requires large-scale multimodal datasets that capture contact-rich and long-horizon tasks, yet existing resources lack synchronized force-torque sensing, hierarchical annotations, and explicit failure cases. We address this gap with the AIRoA MoMa Dataset, a large-scale real-world multimodal dataset for mobile manipulation. It includes synchronized RGB images, joint states, six-axis wrist force-torque signals, and internal robot states, together with a novel two-layer annotation schema of sub-goals and primitive actions for hierarchical learning and error analysis. The initial dataset comprises 25,469 episodes (approx. 94 hours) collected with the Human Support Robot (HSR) and is fully standardized in the LeRobot v2.1 format. By uniquely integrating mobile manipulation, contact-rich interaction, and long-horizon structure, AIRoA MoMa provides a critical benchmark for advancing the next generation of Vision-Language-Action models. The first version of our dataset is now available at https://huggingface.co/datasets/airoa-org/airoa-moma .
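To make the synchronized multimodal streams concrete, a single per-timestep sample could be sketched as the structure below. This is a minimal illustration only: the field names (`rgb`, `joint_positions`, `wrist_wrench`, `internal_state`) are assumptions for exposition, and the actual keys are defined by the LeRobot v2.1 schema, not by this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MoMaFrame:
    """One hypothetical synchronized sample; field names are illustrative, not the dataset's schema."""
    timestamp: float                     # seconds since episode start
    rgb: bytes                           # encoded RGB frame
    joint_positions: List[float]         # HSR joint states
    wrist_wrench: List[float]            # six-axis force-torque [fx, fy, fz, tx, ty, tz]
    internal_state: Dict[str, float] = field(default_factory=dict)  # internal robot states

# Example sample with a 6-element wrench vector, as in the dataset description
frame = MoMaFrame(
    timestamp=0.0,
    rgb=b"",
    joint_positions=[0.0] * 8,
    wrist_wrench=[0.0, 0.0, 9.8, 0.0, 0.0, 0.0],
)
```

The key point the sketch encodes is that vision, proprioception, and contact sensing share one timestamp, which is what makes contact-aware learning from the data possible.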
Problem

Research questions and friction points this paper is trying to address.

Developing robots that follow natural language instructions in unstructured environments
Addressing lack of multimodal datasets with force-torque sensing and hierarchical annotations
Providing benchmark for contact-rich mobile manipulation and long-horizon tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synchronized multimodal data with force-torque sensing
Two-layer hierarchical annotation schema for actions
Standardized mobile manipulation dataset for long-horizon tasks
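The two-layer annotation schema above pairs natural-language sub-goals with the primitive actions that realize them, and records success or failure per episode. A minimal sketch of that hierarchy follows; all class and label names here are hypothetical, chosen only to illustrate the structure, not taken from the dataset's actual annotation format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PrimitiveAction:
    """Lower layer: a short, atomic motion segment (names are illustrative)."""
    label: str        # e.g. "approach", "grasp", "place"
    start_s: float    # segment start time in seconds
    end_s: float      # segment end time in seconds

@dataclass
class Subgoal:
    """Upper layer: a language-level sub-goal composed of primitive actions."""
    instruction: str
    actions: List[PrimitiveAction]
    success: bool     # failures are logged explicitly, not discarded

# Hypothetical episode fragment showing the two layers together
episode = [
    Subgoal(
        instruction="pick up the cup from the table",
        actions=[
            PrimitiveAction("approach", 0.0, 1.5),
            PrimitiveAction("grasp", 1.5, 2.8),
        ],
        success=True,
    ),
]
```

Structuring annotations this way lets hierarchical policies train on either layer, and lets error analysis localize a failed sub-goal to the primitive action where it broke down.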
Authors
Ryosuke Takanami, The University of Tokyo · robotics, robot learning
Petr Khrapchenkov, The University of Tokyo
Shu Morikuni, The University of Tokyo
Jumpei Arima, AI Robot Association (AIRoA)
Yuta Takaba, AI Robot Association (AIRoA)
Shunsuke Maeda, AI Robot Association (AIRoA)
Takuya Okubo, The University of Tokyo
Genki Sano, Telexistence, Inc.
Satoshi Sekioka, AI Robot Association (AIRoA)
Aoi Kadoya, AI Robot Association (AIRoA)
Motonari Kambara, The University of Tokyo
Naoya Nishiura, The University of Tokyo
Haruto Suzuki, The University of Tokyo
Takanori Yoshimoto, The University of Tokyo
Koya Sakamoto, The University of Tokyo
Shinnosuke Ono, The University of Tokyo
Hu Yang, The University of Tokyo
Daichi Yashima, The University of Tokyo
Aoi Horo, The University of Tokyo
Tomohiro Motoda, National Institute of Advanced Industrial Science and Technology (AIST) · robotic manipulation, deep learning
Kensuke Chiyoma, AI Robot Association (AIRoA)
Hiroshi Ito, Waseda University
Koki Fukuda, The University of Tokyo
Akihito Goto, AI Robot Association (AIRoA)
Kazumi Morinaga, National Institute of Advanced Industrial Science and Technology (AIST)