RL-Driven Data Generation for Robust Vision-Based Dexterous Grasping

📅 2025-04-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Vision-action models for dexterous grasping generalize poorly and become unstable when adapting to objects with diverse geometries. Method: This paper proposes a reinforcement learning–driven simulation data generation framework. It features a modular architecture combining parameterized reference trajectories with a PPO-based residual policy, balancing demonstration fidelity against cross-object geometric adaptability; integrates contact-aware trajectory modeling and vision-conditioned policy networks to automatically synthesize high-quality, contact-rich, geometry-diverse grasping trajectories in simulation; and leverages real-to-sim-to-real transfer alongside simulation data augmentation. Contribution/Results: Experiments demonstrate significant improvements in zero-shot grasping success rates on unseen objects, effectively mitigating the training bottleneck caused by the scarcity of real-world grasping data.

๐Ÿ“ Abstract
This work presents reinforcement learning (RL)-driven data augmentation to improve the generalization of vision-action (VA) models for dexterous grasping. While real-to-sim-to-real frameworks, where a few real demonstrations seed large-scale simulated data, have proven effective for VA models, applying them to dexterous settings remains challenging: obtaining stable multi-finger contacts is nontrivial across diverse object shapes. To address this, we leverage RL to generate contact-rich grasping data across varied geometries. In line with the real-to-sim-to-real paradigm, the grasp skill is formulated as a parameterized and tunable reference trajectory refined by a residual policy learned via RL. This modular design enables trajectory-level control that is both consistent with real demonstrations and adaptable to diverse object geometries. A vision-conditioned policy trained on simulation-augmented data demonstrates strong generalization to unseen objects, highlighting the potential of our approach to alleviate the data bottleneck in training VA models.
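The abstract's core formulation, a tunable reference trajectory refined by an RL-learned residual, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the trajectory parameterization, observation encoding, and residual network here are all assumed stand-ins (the actual residual policy is trained with PPO on richer vision and contact inputs).

```python
import numpy as np

def reference_trajectory(t, params):
    """Parameterized grasp reference: linearly interpolate n joints from an
    open pose toward a tunable closed pose over params['close_time'] seconds.
    (Illustrative parameterization, not the paper's.)"""
    closed = np.asarray(params["closed_pose"], dtype=float)
    alpha = min(1.0, t / params["close_time"])  # closing schedule in [0, 1]
    open_pose = np.zeros_like(closed)
    return (1.0 - alpha) * open_pose + alpha * closed

class ResidualPolicy:
    """Stand-in for the PPO-trained residual network: maps an observation
    (e.g. vision features + proprioception) to a bounded joint correction."""
    def __init__(self, n_joints, scale=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.01, (n_joints, n_joints))
        self.scale = scale  # residual magnitude cap, keeps commands near the reference

    def __call__(self, obs):
        return self.scale * np.tanh(self.W @ np.asarray(obs, dtype=float))

def act(t, obs, params, policy):
    """Final joint command = demonstration-consistent reference
    + geometry-adaptive learned residual."""
    return reference_trajectory(t, params) + policy(obs)
```

The bounded residual is what keeps trajectory-level control "consistent with real demonstrations" while still letting RL adapt finger contacts to each object's geometry.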
Problem

Research questions and friction points this paper is trying to address.

Improving vision-action models for dexterous grasping generalization
Generating contact-rich grasping data for diverse object shapes
Alleviating data bottleneck in training vision-action models
Innovation

Methods, ideas, or system contributions that make the work stand out.

RL-driven data augmentation for grasping
Parameterized reference trajectory refinement
Vision-conditioned policy for generalization
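The three innovations above fit together as a data pipeline: roll out the RL expert in simulation across varied object geometries, log observation-action pairs, and distill them into a vision-conditioned student policy. A minimal sketch, assuming a generic environment step function and a linear behavior-cloning student (both illustrative; the paper's student is a vision-conditioned network):

```python
import numpy as np

def rollout(expert, env_step, obs0, horizon=10):
    """Collect one simulated trajectory of (observation, action) pairs
    from an expert policy. env_step is an assumed sim transition function."""
    data, obs = [], np.asarray(obs0, dtype=float)
    for _ in range(horizon):
        action = expert(obs)
        data.append((obs.copy(), action.copy()))
        obs = env_step(obs, action)
    return data

def behavior_cloning(dataset, n_obs, n_act, lr=0.5, epochs=200):
    """Fit a linear student policy W (action = W @ obs) by SGD on
    squared action error over the augmented dataset."""
    W = np.zeros((n_act, n_obs))
    for _ in range(epochs):
        for obs, act in dataset:
            err = W @ obs - act
            W -= lr * np.outer(err, obs) / len(dataset)
    return W
```

The point of the sketch is the division of labor: the RL expert supplies contact-rich, geometry-diverse supervision that real demonstrations cannot cover, and the student only ever sees observations it could also get at deployment time.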