Multi-fingered Robotic Hand Grasping in Cluttered Environments through Hand-object Contact Semantic Mapping

📅 2024-04-12
🏛️ arXiv.org
📈 Citations: 7
Influential: 0
🤖 AI Summary
To address insufficient modeling of contact information in multi-fingered dexterous hand grasping under cluttered scenes, this paper proposes a grasp generation method based on hand-object contact semantic mapping. The contributions are threefold: (1) the Contact Semantic Conditional Variational Autoencoder (CoSe-CVAE), the first point-cloud-driven, generalizable contact semantic map generator; (2) a unified evaluation model that jointly optimizes grasp quality and collision probability; (3) the first high-diversity, multimodal grasping dataset tailored to multi-fingered hands. Experiments demonstrate real-world grasp success rates of 81.0% on single objects and 75.3% in cluttered scenes, surpassing state-of-the-art methods by at least 4.65%. Both code and dataset are publicly available.

📝 Abstract
Deep learning models have significantly advanced dexterous manipulation techniques for multi-fingered hand grasping. However, contact information-guided grasping in cluttered environments remains largely underexplored. To address this gap, we have developed a method for generating multi-fingered hand grasp samples in cluttered settings through contact semantic maps. We introduce a contact semantic conditional variational autoencoder network (CoSe-CVAE) for creating comprehensive contact semantic maps from object point clouds. We then apply a grasp detection method to estimate hand grasp poses from the contact semantic map. Finally, a unified grasp evaluation model is designed to assess grasp quality and collision probability, substantially improving the reliability of identifying optimal grasps in cluttered scenarios. Our grasp generation method outperforms state-of-the-art methods by at least 4.65%, achieving an 81.0% average grasping success rate in real-world single-object environments and a 75.3% success rate in cluttered scenes. We also propose a multi-modal multi-fingered grasping dataset generation method; our dataset outperforms previous datasets in scene diversity and modality diversity. The dataset, code and supplementary materials can be found at https://sites.google.com/view/ffh-cluttered-grasping.
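The pipeline's first stage, sampling a contact semantic map from an object point cloud with a conditional VAE, can be illustrated with a minimal forward-pass sketch. This is not the paper's architecture: all dimensions, the PointNet-style feature extractor, and the random-weight "MLPs" are hypothetical stand-ins for trained networks, kept only to show the conditional sampling structure (condition → Gaussian latent → per-point contact labels).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the paper's actual architecture is not specified here.
N_POINTS, LATENT, FEAT, N_CLASSES = 64, 8, 16, 5  # e.g. one contact class per finger

def pointnet_like_feature(cloud):
    """Toy global feature: per-point linear map + ReLU + max pool (stand-in for PointNet)."""
    W = rng.standard_normal((3, FEAT))
    return np.maximum(cloud @ W, 0).max(axis=0)           # (FEAT,)

def encode(cond_feat):
    """Condition feature -> Gaussian latent parameters (random projections, untrained)."""
    W_mu = rng.standard_normal((FEAT, LATENT))
    W_ls = rng.standard_normal((FEAT, LATENT))
    return cond_feat @ W_mu, cond_feat @ W_ls             # mu, log_sigma

def decode(z, cond_feat, cloud):
    """(latent, condition, per-point coords) -> per-point contact-class logits."""
    W = rng.standard_normal((LATENT + FEAT + 3, N_CLASSES))
    tiled = np.tile(np.hstack([z, cond_feat]), (len(cloud), 1))
    return np.hstack([tiled, cloud]) @ W                  # (N_POINTS, N_CLASSES)

cloud = rng.standard_normal((N_POINTS, 3))                # object point cloud
cond = pointnet_like_feature(cloud)
mu, log_sigma = encode(cond)
z = mu + np.exp(log_sigma) * rng.standard_normal(LATENT)  # reparameterization trick
contact_map = decode(z, cond, cloud).argmax(axis=1)       # discrete contact label per point
print(contact_map.shape)  # (64,)
```

Sampling several latents z from the same condition yields several distinct contact maps, which is how a CVAE produces diverse grasp hypotheses for one object.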
Problem

Research questions and friction points this paper is trying to address.

Generating multi-fingered hand grasps in cluttered environments
Creating contact semantic maps from object point clouds
Evaluating grasp quality and collision probability reliably
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates grasp samples via contact semantic map
Uses CoSe-CVAE for contact semantic mapping
Evaluates grasps with PointNetGPD++ model
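The unified evaluation idea above (scoring grasp quality and collision probability jointly, then ranking candidates) can be sketched as follows. The fusion rule and the numbers are illustrative assumptions; in the paper a learned network predicts both quantities.

```python
import numpy as np

# Hypothetical per-candidate predictions; a learned model supplies these in practice.
quality = np.array([0.9, 0.7, 0.85])        # predicted grasp quality
p_collision = np.array([0.6, 0.05, 0.30])   # predicted collision probability

# One plausible fusion rule: discount quality by the chance of staying collision-free.
score = quality * (1.0 - p_collision)
best = int(np.argmax(score))
print(best, score.round(3))  # candidate 1 wins despite lower raw quality
```

This shows why a joint criterion matters in clutter: the highest-quality grasp in isolation can still lose once its collision risk is priced in.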
Lei Zhang
TAMS (Technical Aspects of Multimodal Systems), Department of Informatics, Universität Hamburg
Kaixin Bai
TAMS (Technical Aspects of Multimodal Systems), Department of Informatics, Universität Hamburg
Guowen Huang
Agile Robots AG
Zhaopeng Chen
Agile Robots AG
Jianwei Zhang
TAMS (Technical Aspects of Multimodal Systems), Department of Informatics, Universität Hamburg