RL-Driven Data Generation for Robust Vision-Based Dexterous Grasping

📅 2025-04-25
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Vision-action models for dexterous grasping generalize poorly and become unstable when adapting to objects with diverse geometries. Method: This paper proposes a reinforcement learning–driven simulation data generation framework. It features a modular architecture combining parameterized reference trajectories with a PPO-based residual policy, balancing demonstration fidelity against cross-object geometric adaptability; integrates contact-aware trajectory modeling and vision-conditioned policy networks to automatically synthesize high-quality, contact-rich, geometry-diverse grasping trajectories in simulation; and leverages real-to-sim-to-real transfer alongside simulation data augmentation. Contribution/Results: Experiments demonstrate significant improvements in zero-shot grasping success rates on unseen objects, effectively mitigating the training bottleneck caused by the scarcity of real-world grasping data.

๐Ÿ“ Abstract
This work presents reinforcement learning (RL)-driven data augmentation to improve the generalization of vision-action (VA) models for dexterous grasping. While real-to-sim-to-real frameworks, where a few real demonstrations seed large-scale simulated data, have proven effective for VA models, applying them to dexterous settings remains challenging: obtaining stable multi-finger contacts is nontrivial across diverse object shapes. To address this, we leverage RL to generate contact-rich grasping data across varied geometries. In line with the real-to-sim-to-real paradigm, the grasp skill is formulated as a parameterized and tunable reference trajectory refined by a residual policy learned via RL. This modular design enables trajectory-level control that is both consistent with real demonstrations and adaptable to diverse object geometries. A vision-conditioned policy trained on simulation-augmented data demonstrates strong generalization to unseen objects, highlighting the potential of our approach to alleviate the data bottleneck in training VA models.
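The abstract's core formulation, a tunable reference trajectory refined by an RL-learned residual, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the trajectory parameterization, observation encoding, and residual network here are all assumed stand-ins (the actual residual policy is trained with PPO on richer vision and contact inputs).

```python
import numpy as np

def reference_trajectory(t, params):
    """Parameterized grasp reference: linearly interpolate n joints from an
    open pose toward a tunable closed pose over params['close_time'] seconds.
    (Illustrative parameterization, not the paper's.)"""
    closed = np.asarray(params["closed_pose"], dtype=float)
    alpha = min(1.0, t / params["close_time"])  # closing schedule in [0, 1]
    open_pose = np.zeros_like(closed)
    return (1.0 - alpha) * open_pose + alpha * closed

class ResidualPolicy:
    """Stand-in for the PPO-trained residual network: maps an observation
    (e.g. vision features + proprioception) to a bounded joint correction."""
    def __init__(self, n_joints, scale=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.01, (n_joints, n_joints))
        self.scale = scale  # residual magnitude cap, keeps commands near the reference

    def __call__(self, obs):
        return self.scale * np.tanh(self.W @ np.asarray(obs, dtype=float))

def act(t, obs, params, policy):
    """Final joint command = demonstration-consistent reference
    + geometry-adaptive learned residual."""
    return reference_trajectory(t, params) + policy(obs)
```

The bounded residual is what keeps trajectory-level control "consistent with real demonstrations" while still letting RL adapt finger contacts to each object's geometry.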
Problem

Research questions and friction points this paper is trying to address.

Improving vision-action models for dexterous grasping generalization
Generating contact-rich grasping data for diverse object shapes
Alleviating data bottleneck in training vision-action models
Innovation

Methods, ideas, or system contributions that make the work stand out.

RL-driven data augmentation for grasping
Parameterized reference trajectory refinement
Vision-conditioned policy for generalization
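The three innovations above fit together as a data pipeline: roll out the RL expert in simulation across varied object geometries, log observation-action pairs, and distill them into a vision-conditioned student policy. A minimal sketch, assuming a generic environment step function and a linear behavior-cloning student (both illustrative; the paper's student is a vision-conditioned network):

```python
import numpy as np

def rollout(expert, env_step, obs0, horizon=10):
    """Collect one simulated trajectory of (observation, action) pairs
    from an expert policy. env_step is an assumed sim transition function."""
    data, obs = [], np.asarray(obs0, dtype=float)
    for _ in range(horizon):
        action = expert(obs)
        data.append((obs.copy(), action.copy()))
        obs = env_step(obs, action)
    return data

def behavior_cloning(dataset, n_obs, n_act, lr=0.5, epochs=200):
    """Fit a linear student policy W (action = W @ obs) by SGD on
    squared action error over the augmented dataset."""
    W = np.zeros((n_act, n_obs))
    for _ in range(epochs):
        for obs, act in dataset:
            err = W @ obs - act
            W -= lr * np.outer(err, obs) / len(dataset)
    return W
```

The point of the sketch is the division of labor: the RL expert supplies contact-rich, geometry-diverse supervision that real demonstrations cannot cover, and the student only ever sees observations it could also get at deployment time.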