KPGrasp: Scalable Keypoint Flow Matching for Dexterous Grasp Generation

📅 2026-06-08
🤖 AI Summary
This work proposes KPGrasp, a flow-matching framework for dexterous grasp generation that removes the reliance of existing learning-based methods on meticulously tuned contact losses or costly test-time optimization. Instead of the conventional hybrid representation that combines an SE(3) pose with joint angles, KPGrasp uses a unified, fully Euclidean parameterization of 3D hand keypoints, expressed in the same frame as the object point cloud, which enables native spatial reasoning. It is the first to apply a pure flow-matching loss to scalable dexterous grasping. Trained on large-scale data with a Transformer-based flow model, KPGrasp generates high-quality grasps without explicit contact constraints. On Dexonomy it achieves a 76.3% grasp success rate, a 47.4% improvement over the strongest directly comparable baseline, with a penetration depth of 2.4 mm. Without fine-tuning, the same model attains the best average performance on DexGrasp Anything, and batched inference takes only 0.032 s per grasp. Real-world experiments on 20 diverse objects show that the pipeline can be deployed in a real-world setup.
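As a rough illustration of what training with only a flow-matching loss can look like, the sketch below regresses a velocity field along a straight-line path from Gaussian noise to ground-truth hand keypoints. The linear path, the uniform time sampling, the `(batch, keypoints, 3)` layout, the count of 21 keypoints, and the names `flow_matching_loss` and `velocity_fn` are assumptions made for illustration. The summary does not specify KPGrasp's exact formulation, and object conditioning is omitted here.

```python
import numpy as np

def flow_matching_loss(velocity_fn, x1, rng):
    """Mean-squared flow-matching loss on a straight noise-to-data path.

    velocity_fn: callable (xt, t) -> predicted velocity, same shape as xt
                 (stands in for a learned model; hypothetical interface).
    x1:          array of shape (B, K, 3), target hand keypoints.
    rng:         numpy.random.Generator used for noise and time sampling.
    """
    x0 = rng.standard_normal(x1.shape)            # noise endpoint of the path
    t = rng.uniform(size=(x1.shape[0], 1, 1))     # one time in [0, 1) per sample
    xt = (1.0 - t) * x0 + t * x1                  # point on the straight path
    target_velocity = x1 - x0                     # constant velocity of that path
    predicted = velocity_fn(xt, t)
    return float(np.mean((predicted - target_velocity) ** 2))

# Usage with placeholder data (not real grasps) and a dummy model.
rng = np.random.default_rng(0)
keypoints = rng.standard_normal((4, 21, 3))       # hypothetical: 4 grasps, 21 keypoints
zero_model = lambda xt, t: np.zeros_like(xt)      # placeholder for a Transformer
print(flow_matching_loss(zero_model, keypoints, rng))
```

Because every quantity lives in ordinary 3D coordinates, the loss is a plain squared error with no rotation-specific terms, which is one way to read the appeal of an all-Euclidean output space.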
📝 Abstract
Generating high-quality dexterous grasps remains challenging for learning-based methods, which often depend on carefully tuned contact losses or costly contact-based test-time refinement. We present KPGrasp, a flow-matching framework that learns dexterous grasp priors from large-scale data rather than relying on contact losses or contact-based test-time refinement. KPGrasp couples an all-Euclidean 3D hand-keypoint parameterization with a simple yet scalable Transformer flow model. The parameterization avoids the drawbacks of the conventional mixed SE(3) pose and joint-angle output space, expresses grasps in the same frame as the object point cloud, and thus enables native spatial reasoning; the Transformer flow model is trained with only the standard flow-matching loss and scales effectively with data, model capacity, and batch size. Experiments demonstrate state-of-the-art performance on two simulation benchmarks. On the Dexonomy benchmark, it reaches a 76.3% grasp success rate, improving over the strongest directly comparable baseline by 47.4% while reducing penetration depth to 2.4 mm. The same model also achieves the best average performance on the DexGrasp Anything benchmark without fine-tuning. For batched inference, KPGrasp requires only 0.032 s per grasp. Finally, real-world experiments on 20 diverse objects demonstrate that the pipeline can be deployed in a real-world setup.
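To make the inference side concrete, here is a minimal sketch of how a batch of keypoint sets could be generated from a trained flow model: start from Gaussian noise and integrate the predicted velocity field from t = 0 to t = 1. The fixed-step Euler solver, the step count, the `(batch, keypoints, 3)` layout, and the names `sample_keypoints` and `velocity_fn` are assumptions. The abstract does not state which solver or how many steps KPGrasp uses, and conditioning on the object point cloud is left out.

```python
import numpy as np

def sample_keypoints(velocity_fn, shape, num_steps, rng):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps.

    velocity_fn: callable (x, t) -> velocity with the same shape as x
                 (stands in for a learned model; hypothetical interface).
    shape:       (B, K, 3) for B grasps with K 3D keypoints each.
    num_steps:   number of Euler steps (an illustrative choice).
    rng:         numpy.random.Generator for the initial noise.
    """
    x = rng.standard_normal(shape)                # start from Gaussian noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = np.full((shape[0], 1, 1), i * dt)     # current time, one per sample
        x = x + dt * velocity_fn(x, t)            # explicit Euler update
    return x

# Usage with a toy field that moves every sample toward a fixed target.
rng = np.random.default_rng(0)
target = rng.standard_normal((2, 21, 3))          # placeholder, not a real grasp
toy_field = lambda x, t: (target - x) / (1.0 - t)
samples = sample_keypoints(toy_field, target.shape, num_steps=10, rng=rng)
print(samples.shape)
```

All samples in the batch advance together at each step, so the cost per grasp falls as the batch grows, which is consistent with the per-grasp timing being reported for batched inference.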
Problem

Research questions and friction points this paper is trying to address.

dexterous grasp generation
contact losses
test-time refinement
grasp quality
learning-based methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

flow matching
keypoint parameterization
dexterous grasp generation
Transformer model
contact-free learning