ScaleADFG: Affordance-based Dexterous Functional Grasping via Scalable Dataset

📅 2025-11-12
🏛️ IEEE Robotics and Automation Letters
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address key challenges in robotic dexterous grasping—namely, insufficient training data, poor generalization across object scales, and hand-object size mismatch—this paper proposes an automated grasp-data construction pipeline and a lightweight, function-driven grasp generation network. Methodologically: (1) we introduce a demonstration-free grasp synthesis algorithm grounded in functional attributes, enabling robust handling of arbitrary hand-to-object size ratios; (2) we integrate pretrained 3D generative models with functional region retrieval for efficient asset construction; and (3) we employ a single-stage lightweight network with end-to-end differentiable loss, eliminating post-processing. Our contributions include: (i) a large-scale grasp dataset covering five object categories, over 1,000 shapes × 15 scales, with ≥60K grasps per robotic hand; and (ii) empirical validation—both in simulation and on real robots—of cross-scale zero-shot transfer, high stability, and diverse grasp generation.

📝 Abstract
Dexterous functional tool-use grasping is essential for effective robotic manipulation of tools. However, existing approaches face significant challenges in efficiently constructing large-scale datasets and ensuring generalizability to everyday object scales. These issues primarily arise from size mismatches between robotic and human hands, and from the diversity of real-world object scales. To address these limitations, we propose the ScaleADFG framework, which consists of a fully automated dataset construction pipeline and a lightweight grasp generation network. Our pipeline introduces an affordance-based algorithm to synthesize diverse tool-use grasp configurations without expert demonstrations, allowing flexible object-hand size ratios and enabling large robotic hands (compared to human hands) to grasp everyday objects effectively. Additionally, we leverage pre-trained models to generate extensive 3D assets and facilitate efficient retrieval of object affordances. Our dataset comprises five object categories, each containing over 1,000 unique shapes with 15 scale variations. After filtering, the dataset includes over 60,000 grasps for each of two dexterous robotic hands. On top of this dataset, we train a lightweight, single-stage grasp generation network with a notably simple loss design, eliminating the need for post-refinement. This demonstrates the critical importance of large-scale datasets and multi-scale object variants for effective training. Extensive experiments in simulation and on a real robot confirm that the ScaleADFG framework exhibits strong adaptability to objects of varying scales, enhancing functional grasp stability, diversity, and generalizability. Moreover, our network exhibits effective zero-shot transfer to real-world objects. The project page is available at https://sizhe-wang.github.io/ScaleADFG_webpage
Problem

Research questions and friction points this paper is trying to address.

Addresses efficient construction of large-scale dexterous grasping datasets
Solves generalizability challenges for robotic grasping of everyday objects
Overcomes size mismatch between robotic hands and diverse object scales
Innovation

Methods, ideas, or system contributions that make the work stand out.

Affordance-based automated pipeline synthesizes diverse tool-use grasps
Lightweight single-stage network generates grasps without post-refinement
Scalable dataset with multi-scale objects enables zero-shot transfer
Sizhe Wang
Washington University in Saint Louis
Yifan Yang
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Yongkang Luo
Institute of Automation, Chinese Academy of Sciences
Daheng Li
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Wei Wei
CasiaHand Robotics Co., Ltd, Nanjing 211100, China
Yan Zhang
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Peiying Hu
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Yunjin Fu
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Haonan Duan
SenseTime Research, SenseTime, Shanghai 200233, China
Jia Sun
Hong Kong University of Science and Technology (Guangzhou)
Peng Wang
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China