🤖 AI Summary
This work addresses a challenge in learning dexterous manipulation: demonstration data rarely capture fine-grained hand-object interactions while also remaining executable at deployment. To this end, the authors propose RealDexUMI, a wearable universal manipulation interface built around a shared dexterous end-effector module that integrates a lightweight robotic hand, in-hand vision, and fingertip tactile sensing. A palm-side isomorphic teleoperation glove maps human finger motions to robot-hand joint commands in real time, without retargeting. Because the same hand and sensing modules are used for both collection and deployment, the resulting "zero-gap" end-effector data have matched in-hand observations, tactile signals, contacts, and hand actions across the two phases. In experiments, policies trained on RealDexUMI data achieve an average success rate of 88.75% across eight real-robot tasks, generalize to unseen initial poses, and transfer across three embodiments.
📝 Abstract
Learning dexterous manipulation requires demonstrations that preserve fine hand-object interactions while remaining executable at deployment. Existing pipelines either lose deployable dexterity through retargeting or embodiment conversion, or rely on robot-specific teleoperation that is costly to scale and often lacks intuitive, contact-aware control for dexterous data collection. We present RealDexUMI, a wearable universal manipulation interface built around a shared dexterous end-effector module that integrates a lightweight dexterous hand, in-hand vision, and fingertip tactile sensing. A palm-side isomorphic teleoperation glove maps human finger inputs to robot-hand joint commands, enabling real-time, retargeting-free, intuitive, and precise hand control. The shared hand and sensing modules yield zero-gap end-effector data, with matched in-hand observations, tactile signals, contacts, and hand actions between collection and deployment. Across eight real-robot tasks spanning fine-grained, contact-rich, long-horizon, and bimanual manipulation, policies trained on RealDexUMI data achieve an average success rate of 88.75%, generalize to unseen initial poses, and transfer across three embodiments. Website: https://research.beingbeyond.com/realdexumi
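To make the "retargeting-free" idea concrete, the following is a minimal sketch of what a one-to-one (isomorphic) glove-to-hand mapping could look like. It is not the authors' implementation: the `JointCalibration` fields, the `glove_to_joint_commands` function, and the per-joint offset, scale, and limit values are all hypothetical and chosen only for illustration. The point is that each glove joint drives exactly one robot joint through a simple calibrated transform, with no inverse kinematics or optimization step in between.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class JointCalibration:
    """Hypothetical per-joint calibration (an assumption, not from the paper)."""
    offset: float  # glove reading (rad) that corresponds to robot joint zero
    scale: float   # robot radians per glove radian (1.0 if kinematics match)
    lower: float   # robot joint lower limit (rad)
    upper: float   # robot joint upper limit (rad)


def glove_to_joint_commands(
    glove_angles: Sequence[float],
    calibration: Sequence[JointCalibration],
) -> List[float]:
    """Map glove joint angles to robot joint commands, one joint to one joint.

    Each command is an affine function of the matching glove reading,
    clamped to the robot joint's limits. No retargeting is performed.
    """
    if len(glove_angles) != len(calibration):
        raise ValueError("glove and robot hand must have the same joint count")

    commands = []
    for angle, cal in zip(glove_angles, calibration):
        target = (angle - cal.offset) * cal.scale
        # Clamp so an out-of-range glove reading never exceeds joint limits.
        commands.append(min(max(target, cal.lower), cal.upper))
    return commands


if __name__ == "__main__":
    # Two illustrative joints with made-up calibration values.
    calib = [
        JointCalibration(offset=0.0, scale=1.0, lower=0.0, upper=1.5),
        JointCalibration(offset=0.1, scale=1.0, lower=0.0, upper=1.0),
    ]
    print(glove_to_joint_commands([0.5, 2.0], calib))
```

In this sketch the second glove reading exceeds the joint's range, so the command is clamped to the upper limit. A real system would also need sensor filtering and a fixed control rate, which are left out here.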