🤖 AI Summary
To address key challenges in real-world general-purpose robotic manipulation (limited data scale, poor generalization, and coarse-grained skill representations), this paper introduces AgiBot World, a large-scale robotic manipulation platform comprising over one million trajectories across 217 tasks in five deployment scenarios. The platform extends from grippers to dexterous hands and incorporates visuo-tactile sensing to support fine-grained skill learning. A central contribution is Genie Operator-1 (GO-1), a generalist policy that uses latent action representations for data-efficient learning. Complementary contributions include a standardized human-in-the-loop data collection pipeline, latent-space action modeling, and scalable hardware interfaces. The authors open-source the dataset, tools, and models to broaden access to large-scale, high-quality robot data. In experiments, policies pre-trained on AgiBot World achieve a 30% average performance improvement over those trained on Open X-Embodiment; on complex long-horizon tasks, GO-1 attains a success rate above 60%, surpassing RDT by 32%.
📝 Abstract
We explore how scalable robot data can address real-world challenges for generalized robotic manipulation. We introduce AgiBot World, a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios, which achieves an order-of-magnitude increase in data scale compared to existing datasets. Accelerated by a standardized collection pipeline with human-in-the-loop verification, AgiBot World guarantees a high-quality and diverse data distribution. It is extensible from grippers to dexterous hands and visuo-tactile sensors for fine-grained skill acquisition. Building on this data, we introduce Genie Operator-1 (GO-1), a novel generalist policy that leverages latent action representations to maximize data utilization, demonstrating predictable performance scaling with increased data volume. Policies pre-trained on our dataset achieve an average performance improvement of 30% over those trained on Open X-Embodiment, in both in-domain and out-of-distribution scenarios. GO-1 exhibits exceptional capability in real-world dexterous and long-horizon tasks, achieving over 60% success rate on complex tasks and outperforming the prior RDT approach by 32%. By open-sourcing the dataset, tools, and models, we aim to democratize access to large-scale, high-quality robot data, advancing the pursuit of scalable and general-purpose intelligence.