MoDex: A Diffusion Policy for Sequential Multi-Object Dexterous Grasping

📅 2026-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of enabling a single dexterous hand to grasp multiple objects in sequence without releasing those it already holds. The authors propose MoDex, a diffusion-based grasping policy conditioned on point cloud observations and an opposition space. The opposition space condition specifies which fingers take part in each new grasp, so the hand commits only a subset of its degrees of freedom and reserves the rest for subsequent grasps. MoDex is trained in two stages, first by imitation learning on expert demonstrations and then by reinforcement learning fine-tuning, to facilitate sim-to-real transfer. It is evaluated in MuJoCo simulation on a Franka Emika Panda robot equipped with an Allegro Hand and on the corresponding physical hardware. MoDex achieves higher success rates than the evaluated learning-based baselines, improving them by 2.92–17.92% in simulation and 6.67–17.78% in real-world trials.

📝 Abstract

This work addresses sequentially grasping multiple objects with a single dexterous hand without releasing those already held. Most dexterous grasping methods commit all of the hand's degrees of freedom to a single object, underutilizing its dexterity and leaving no redundancy for subsequent grasps. The proposed solution, MoDex, is a diffusion policy that predicts the next gripper pose directly from observations, conditioned on an opposition space and point cloud. The opposition space condition specifies which fingers participate in the current grasp, enabling the gripper to use only a subset of its available degrees of freedom while reserving the remaining degrees of freedom for subsequent grasps. To facilitate sim-to-real transfer, MoDex is trained in two stages: first through imitation learning on expert demonstrations, and subsequently through reinforcement learning fine-tuning, which consistently improves success rates over the pre-trained policy. We evaluate MoDex in simulation on a MuJoCo-based Franka Emika Panda robot equipped with an Allegro Hand and on the corresponding real-world hardware platform. Across both simulation and real-world experiments, MoDex achieves higher success rates than the evaluated learning-based baselines, improving performance by 2.92-17.92% and 6.67-17.78%, respectively. Project page: https://modex2026.github.io/.
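To make the idea of a finger-subset condition concrete, the sketch below encodes "which fingers participate in the current grasp" as a per-joint binary mask. This is only an illustration under stated assumptions: the abstract does not describe how MoDex actually represents the opposition space, and the names `FINGERS`, `opposition_mask`, and `reserved_fingers` are hypothetical. The only hardware fact relied on is that the Allegro Hand has four fingers with four joints each (16 degrees of freedom); the joint ordering is assumed.

```python
from typing import Iterable, List

# Allegro Hand: four fingers, four joints each (16 DOF).
# Finger names and joint ordering are illustrative assumptions,
# not the encoding used in the paper.
FINGERS: List[str] = ["index", "middle", "ring", "thumb"]
JOINTS_PER_FINGER = 4


def opposition_mask(active_fingers: Iterable[str]) -> List[int]:
    """Return a per-joint 0/1 mask marking the joints of fingers used in this grasp."""
    active = set(active_fingers)
    unknown = active - set(FINGERS)
    if unknown:
        raise ValueError(f"unknown fingers: {sorted(unknown)}")
    mask: List[int] = []
    for finger in FINGERS:
        mask.extend([1 if finger in active else 0] * JOINTS_PER_FINGER)
    return mask


def reserved_fingers(held: Iterable[str]) -> List[str]:
    """Return fingers not committed to a held object, still free for later grasps."""
    held_set = set(held)
    return [f for f in FINGERS if f not in held_set]


# Hypothetical first grasp: thumb and index hold the object,
# leaving the middle and ring fingers available for the next one.
first = opposition_mask(["thumb", "index"])
free_after_first = reserved_fingers(["thumb", "index"])
```

In a conditional policy, a vector like `first` could be concatenated with point cloud features as the conditioning input, but that wiring is likewise an assumption rather than something the abstract states.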

Problem

Research questions and friction points this paper is trying to address.

dexterous grasping

sequential grasping

multi-object manipulation

opposition space

gripper pose

Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion policy

dexterous grasping

opposition space