🤖 AI Summary
Skill transfer for bimanual dexterous manipulation in embodied AI depends on precise, large-scale, human-like manipulation data, which is hard to obtain with conventional reinforcement learning or real-world teleoperation. This paper proposes ManipTrans, a two-stage, residual-learning transfer framework: it first pre-trains a generalist hand trajectory imitator, then fine-tunes a specific residual module under interaction constraints, transferring human bimanual skills to dexterous robotic hands in simulation. The method executes complex bimanual tasks, such as pen capping and bottle unscrewing, efficiently and accurately. Key contributions include: (1) DexManipNet, a large-scale, easily extensible dataset of 3.3K robotic manipulation episodes, built by transferring multiple hand-object datasets to robotic hands and intended to support further policy training and real-world deployment; and (2) experimental results in which ManipTrans surpasses state-of-the-art methods in success rate, fidelity, and efficiency.
📝 Abstract
Human hands play a central role in interaction, motivating increasing research in dexterous robotic manipulation. Data-driven embodied AI algorithms demand precise, large-scale, human-like manipulation sequences, which are challenging to obtain with conventional reinforcement learning or real-world teleoperation. To address this, we introduce ManipTrans, a novel two-stage method for efficiently transferring human bimanual skills to dexterous robotic hands in simulation. ManipTrans first pre-trains a generalist trajectory imitator to mimic hand motion, then fine-tunes a specific residual module under interaction constraints, enabling efficient learning and accurate execution of complex bimanual tasks. Experiments show that ManipTrans surpasses state-of-the-art methods in success rate, fidelity, and efficiency. Leveraging ManipTrans, we transfer multiple hand-object datasets to robotic hands, creating DexManipNet, a large-scale dataset featuring previously unexplored tasks like pen capping and bottle unscrewing. DexManipNet comprises 3.3K episodes of robotic manipulation and is easily extensible, facilitating further policy training for dexterous hands and enabling real-world deployments.
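The two-stage idea in the abstract can be illustrated with a minimal sketch of residual action composition: a frozen base imitator proposes an action, and a second module adds a small, bounded correction. This is not the authors' implementation. The function names (`compose_action`, `base_imitator`, `residual_module`), the `residual_scale` parameter, the clipping range, and the toy stand-in policies are all assumptions made for illustration; the abstract does not specify how the residual is combined with the base action.

```python
from typing import List, Sequence


def compose_action(
    base_action: Sequence[float],
    residual: Sequence[float],
    residual_scale: float = 0.1,  # hypothetical: keeps the correction small
    low: float = -1.0,            # hypothetical normalized action limits
    high: float = 1.0,
) -> List[float]:
    """Add a scaled residual to a base action, then clip to the action limits."""
    if len(base_action) != len(residual):
        raise ValueError("base_action and residual must have the same length")
    return [
        min(high, max(low, b + residual_scale * r))
        for b, r in zip(base_action, residual)
    ]


def base_imitator(hand_target: Sequence[float]) -> List[float]:
    """Toy stand-in for the stage-one imitator: simply tracks the hand target."""
    return list(hand_target)


def residual_module(
    hand_target: Sequence[float], object_state: Sequence[float]
) -> List[float]:
    """Toy stand-in for the stage-two module: nudges the hand toward the object."""
    return [o - h for h, o in zip(hand_target, object_state)]


# Hypothetical 3-dimensional example values.
hand_target = [0.2, -0.4, 0.9]
object_state = [0.3, -0.4, 2.0]

action = compose_action(
    base_imitator(hand_target),
    residual_module(hand_target, object_state),
)
print(action)
```

In this sketch the base imitator would stay fixed during the second stage, so only the residual has to account for object interaction. That is one plausible reading of why fine-tuning a residual module could be more efficient than learning each task from scratch.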