🤖 AI Summary
This work addresses limitations of existing diffusion-based methods for dexterous grasp generation. These methods formulate generation as a stochastic differential equation (SDE), which typically requires many sequential denoising steps and introduces trajectory instability that can yield physically infeasible grasps. The authors propose EFF-Grasp, a flow-matching framework that recasts grasp synthesis as a deterministic ordinary differential equation (ODE), enabling efficient and stable sampling through smooth probability flows. They also introduce a training-free, physics-aware energy guidance mechanism that steers the inference trajectory toward physically feasible regions, with the guidance term estimated by a local Monte Carlo approximation and no additional physics-based training or simulation feedback. In experiments on five benchmark datasets, the method reports superior grasp quality and physical feasibility while requiring substantially fewer sampling steps than diffusion-based baselines.
📝 Abstract
Denoising generative models have recently become the dominant paradigm for dexterous grasp generation, owing to their ability to model complex grasp distributions from large-scale data. However, existing diffusion-based methods typically formulate generation as a stochastic differential equation (SDE), which often requires many sequential denoising steps and introduces trajectory instability that can lead to physically infeasible grasps. In this paper, we propose EFF-Grasp, a novel Flow-Matching-based framework for physics-aware dexterous grasp generation. Specifically, we reformulate grasp synthesis as a deterministic ordinary differential equation (ODE) process, which enables efficient and stable generation through smooth probability flows. To further enforce physical feasibility, we introduce a training-free physics-aware energy guidance strategy. Our method defines an energy-guided target distribution using adapted explicit physical energy functions that capture key grasp constraints, and estimates the corresponding guidance term via a local Monte Carlo approximation during inference. In this way, EFF-Grasp dynamically steers the generation trajectory toward physically feasible regions without requiring additional physics-based training or simulation feedback. Extensive experiments on five benchmark datasets show that EFF-Grasp achieves superior performance in grasp quality and physical feasibility, while requiring substantially fewer sampling steps than diffusion-based baselines.
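To make the sampling idea concrete, the following is a minimal toy sketch of few-step deterministic ODE sampling with a training-free energy guidance term estimated by local Monte Carlo. It is not the EFF-Grasp implementation: the 2-D Gaussian "grasp" distribution, the closed-form velocity field, the half-space penetration energy, the softmax-weighted guidance estimator, and all names and parameters (MU, S, velocity, energy, mc_guidance, sample, k, sigma, tau, scale) are assumptions chosen for illustration. In the actual method the velocity would come from a learned network and the energies from the adapted physical grasp constraints described above.

```python
import numpy as np

# Toy stand-in for a grasp distribution (assumed): x1 ~ N(MU, S^2 I) in 2-D.
MU = np.array([2.0, 0.0])
S = 0.5


def velocity(x, t):
    """Closed-form flow-matching velocity E[x1 - x0 | x_t = x] for the linear
    path x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, I), x1 ~ N(MU, S^2 I).
    A trained network would replace this function in practice."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2
    return MU + (t * S**2 - (1.0 - t)) / var_t * (x - t * MU)


def energy(x):
    """Hypothetical penetration energy: squared depth below the plane y = 0."""
    return np.maximum(0.0, -x[..., 1]) ** 2


def mc_guidance(x, t, rng, k=64, sigma=0.3, tau=0.05):
    """Local Monte Carlo guidance (illustrative estimator, not the paper's).
    Perturb the predicted endpoint, weight candidates by exp(-E / tau),
    and return the shift from the endpoint to the weighted mean."""
    x1_hat = x + (1.0 - t) * velocity(x, t)          # predicted endpoint
    noise = rng.standard_normal((x.shape[0], k, x.shape[1]))
    cand = x1_hat[:, None, :] + sigma * noise         # shape (n, k, 2)
    logw = -energy(cand) / tau                        # shape (n, k)
    logw -= logw.max(axis=1, keepdims=True)           # numerical stability
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w[..., None] * cand).sum(axis=1) - x1_hat


def sample(n, steps=10, scale=1.0, seed=0):
    """Euler integration of dx/dt = v(x, t) + scale * guidance from t=0 to 1."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))                   # start from noise
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity(x, t)
        if scale > 0:
            v = v + scale * mc_guidance(x, t, rng)
        x = x + dt * v
    return x


plain = sample(2000, scale=0.0)    # unguided ODE sampling
guided = sample(2000, scale=1.0)   # energy-guided ODE sampling
e_plain = float(energy(plain).mean())
e_guided = float(energy(guided).mean())
print(f"mean energy: unguided={e_plain:.4f}, guided={e_guided:.4f}")
```

In this toy setting the guided run starts from the same noise as the unguided one, so any drop in mean energy comes from the guidance term alone. The guidance needs no extra training, only evaluations of the energy function at inference time, which mirrors the training-free property claimed in the abstract.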