ArticuBot: Learning Universal Articulated Object Manipulation Policy via Large Scale Simulation

📅 2025-03-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of zero-shot robotic manipulation of unseen articulated objects—such as doors, drawers, and cabinets—with diverse geometries, sizes, and articulation types in real-world settings using a single policy. The authors propose a hierarchical neural policy: a high-level module processes point-cloud inputs to predict end-effector sub-goals, while a low-level module executes motion control conditioned on the predicted sub-goal; a weighted displacement model further grounds the sub-goal prediction in the observed 3D structure of the scene. Leveraging 42.3k demonstrations generated in large-scale physics simulation, the policy is trained via point cloud-based imitation learning and transferred zero-shot to real robots. The method opens dozens of novel articulated objects on a fixed table-top Franka Emika arm (across two labs) and an X-Arm on a mobile base (in labs, lounges, and kitchens), demonstrating zero-shot generalization across object geometries, environments, and robot hardware platforms.

📝 Abstract
This paper presents ArticuBot, in which a single learned policy enables a robotics system to open diverse categories of unseen articulated objects in the real world. This task has long been challenging for robotics due to the large variations in the geometry, size, and articulation types of such objects. Our system, ArticuBot, consists of three parts: generating a large number of demonstrations in physics-based simulation, distilling all generated demonstrations into a point cloud-based neural policy via imitation learning, and performing zero-shot sim2real transfer to real robotics systems. Utilizing sampling-based grasping and motion planning, our demonstration generation pipeline is fast and effective, generating a total of 42.3k demonstrations over 322 training articulated objects. For policy learning, we propose a novel hierarchical policy representation, in which the high-level policy learns the sub-goal for the end-effector, and the low-level policy learns how to move the end-effector conditioned on the predicted goal. We demonstrate that this hierarchical approach achieves much better object-level generalization compared to the non-hierarchical version. We further propose a novel weighted displacement model for the high-level policy that grounds the prediction into the existing 3D structure of the scene, outperforming alternative policy representations. We show that our learned policy can zero-shot transfer to three different real robot settings: a fixed table-top Franka arm across two different labs, and an X-Arm on a mobile base, opening multiple unseen articulated objects across two labs, real lounges, and kitchens. Videos and code can be found on our project website: https://articubot.github.io/.
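The weighted displacement idea in the abstract can be illustrated with a minimal sketch: each observed point votes for the goal end-effector position by predicting an offset from itself, and votes are combined with learned per-point weights, so the prediction stays anchored to the scene's 3D structure. The function name and the single-position goal are simplifications for illustration, not the paper's exact formulation (the actual model is learned and predicts richer goal representations).

```python
import numpy as np

def predict_goal_position(points, displacements, weight_logits):
    """Sketch of weighted displacement aggregation (names hypothetical).

    points:        (N, 3) observed point cloud
    displacements: (N, 3) per-point predicted offsets to the goal
    weight_logits: (N,)   per-point confidence scores

    Each point's vote is (point + displacement); votes are averaged
    with softmax weights, grounding the goal in observed geometry.
    """
    w = np.exp(weight_logits - weight_logits.max())  # stable softmax
    w = w / w.sum()
    votes = points + displacements                   # per-point goal estimates
    return (w[:, None] * votes).sum(axis=0)          # weighted average vote
```

In a learned model, `displacements` and `weight_logits` would be network outputs per point; here they are passed in directly to show only the aggregation step.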
Problem

Research questions and friction points this paper is trying to address.

Develops a universal policy for robotic manipulation of diverse articulated objects.
Addresses challenges in geometry, size, and articulation type variations.
Enables zero-shot transfer from simulation to real-world robotic systems.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale simulation for diverse object manipulation
Hierarchical policy representation for better generalization
Zero-shot sim2real transfer to multiple robot settings
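The hierarchical structure named above (a high-level sub-goal predictor plus a goal-conditioned low-level controller) can be sketched as a simple control loop. The class and the periodic re-planning interval are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class HierarchicalPolicy:
    """Sketch of a two-level policy (names and replanning scheme hypothetical).

    high_level: point cloud -> sub-goal for the end-effector
    low_level:  (point cloud, sub-goal) -> next end-effector action
    """

    def __init__(self, high_level, low_level, replan_every=10):
        self.high_level = high_level
        self.low_level = low_level
        self.replan_every = replan_every
        self._step = 0
        self._subgoal = None

    def act(self, point_cloud):
        # Periodically re-predict the sub-goal; between re-plans, the
        # low-level policy tracks the most recent one.
        if self._step % self.replan_every == 0:
            self._subgoal = self.high_level(point_cloud)
        self._step += 1
        return self.low_level(point_cloud, self._subgoal)
```

Conditioning the low-level controller on a predicted sub-goal is what lets each level specialize: the high level handles object-level variation, the low level handles precise motion.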
Yufei Wang
Robotics Institute, Carnegie Mellon University
Ziyu Wang
IIIS, Tsinghua University
Mino Nakura
Robotics Institute, Carnegie Mellon University
Pratik Bhowal
Robotics Institute, Carnegie Mellon University
Chia-Liang Kuo
Department of Computer Science, National Yang Ming Chiao Tung University
Yi-Ting Chen
Department of Computer Science, National Yang Ming Chiao Tung University
Zackory Erickson
Assistant Professor, Carnegie Mellon University
Robotics, Machine Learning, Human-Robot Interaction, Assistive Robotics, Simulation
David Held
Associate Professor in the Robotics Institute, Carnegie Mellon University
Robotics, Computer Vision, Machine Learning, Deep Learning, Reinforcement Learning