RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation

📅 2025-10-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing imitation learning datasets contain only successful trajectories, lacking failure and recovery examples, which impairs the out-of-distribution (OOD) generalization of Vision-Language-Action (VLA) models—particularly under minor perturbations that induce policy deviation. To address this, we propose an automated OOD data augmentation framework based on exploratory sampling: first, an offline reinforcement learning–trained action-value network identifies suboptimal actions to deliberately trigger OOD states; second, rollout-guided adaptive exploration generates high-quality failure-recovery trajectories. The framework is fully automated—requiring no human intervention—and seamlessly integrates into any VLA training pipeline. Evaluated on the LIBERO benchmark and real-world robotic tasks, our approach significantly improves model robustness to distributional shifts, operational stability, and cross-scenario generalization performance.
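The first stage of the pipeline described above uses an offline-RL action-value network to flag sub-optimal actions. A minimal sketch of that idea, assuming a toy critic `q_value` and an advantage-style threshold (the function names and the `margin` parameter are illustrative, not from the paper):

```python
import numpy as np

# Hedged sketch (not the paper's code): flag an action as sub-optimal when
# its value under a learned critic Q(s, a) falls well below the value of
# the policy's nominal action at the same state.
def q_value(state, action):
    # Toy stand-in critic: value peaks when the action matches 0.1 * state.
    return -float(np.sum((action - 0.1 * state) ** 2))

def is_suboptimal(state, action, policy_action, margin=0.5):
    """Advantage-style test: Q(s, a) < Q(s, pi(s)) - margin."""
    return q_value(state, action) < q_value(state, policy_action) - margin

s = np.ones(3)
pi_a = 0.1 * s
print(is_suboptimal(s, pi_a + 2.0, pi_a))  # True: far from the nominal action
print(is_suboptimal(s, pi_a, pi_a))        # False: the nominal action itself
```

In the full framework such flagged actions are the ones deliberately executed to push the rollout into OOD states.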

📝 Abstract
Vision-Language-Action models (VLAs) have demonstrated remarkable performance on complex robotic manipulation tasks through imitation learning. However, existing imitation learning datasets contain only successful trajectories and lack failure or recovery data, especially for out-of-distribution (OOD) states where the robot deviates from the main policy due to minor perturbations or errors, leading VLA models to struggle with states deviating from the training distribution. To this end, we propose an automated OOD data augmentation framework named RESample through exploratory sampling. Specifically, we first leverage offline reinforcement learning to obtain an action-value network that accurately identifies sub-optimal actions under the current manipulation policy. We further sample potential OOD states from trajectories via rollout, and design an exploratory sampling mechanism that adaptively incorporates these action proxies into the training dataset to ensure efficiency. Subsequently, our framework explicitly encourages the VLAs to recover from OOD states and enhances their robustness against distributional shifts. We conduct extensive experiments on the LIBERO benchmark as well as real-world robotic manipulation tasks, demonstrating that RESample consistently improves the stability and generalization ability of VLA models.
Problem

Research questions and friction points this paper is trying to address.

Augmenting robotic manipulation datasets with failure recovery data
Addressing out-of-distribution state robustness in vision-language-action models
Improving VLA generalization through exploratory sampling of sub-optimal actions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated OOD data augmentation via exploratory sampling
Leverages offline RL to identify sub-optimal actions
Enhances VLA robustness against distributional shifts
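The exploratory-sampling loop implied by these points can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: `q_value`, `base_policy`, `env_step`, and the `trigger_every` schedule are all hypothetical stand-ins for the learned critic, the imitation policy, the environment, and the paper's adaptive sampling mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_value(state, action):
    # Toy critic: actions far from the nominal one score low ("sub-optimal").
    return -float(np.sum((action - 0.1 * state) ** 2))

def base_policy(state):
    return 0.1 * state  # imitation policy's nominal action

def env_step(state, action):
    return state + action  # trivial dynamics for illustration

def resample_rollout(state, horizon=10, k=8, trigger_every=3):
    """Sketch of exploratory sampling: at fixed intervals, take the
    lowest-Q candidate action to deliberately trigger an OOD state;
    otherwise follow the base policy, yielding a recovery segment."""
    trajectory = []
    for t in range(horizon):
        if t % trigger_every == 0:
            # Sample k perturbed candidate actions, keep the worst under Q.
            candidates = base_policy(state) + rng.normal(0.0, 0.5, size=(k, state.shape[0]))
            scores = [q_value(state, a) for a in candidates]
            action = candidates[int(np.argmin(scores))]  # sub-optimal -> OOD trigger
        else:
            action = base_policy(state)                  # recovery behaviour
        trajectory.append((state.copy(), action))
        state = env_step(state, action)
    return trajectory

traj = resample_rollout(np.zeros(3))
print(len(traj))  # 10 (state, action) pairs for the augmented dataset
```

The resulting failure-then-recovery segments are what get folded back into the VLA training set, with no human labeling in the loop.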
👥 Authors
- Yuquan Xue (Nanyang Technological University, Singapore)
- Guanxing Lu (Tsinghua University): VLA, RL, Robotics, 3D Vision
- Zhenyu Wu (Tsinghua University, Beijing, China)
- Chuanrui Zhang (Tsinghua University): Computer Vision
- Bofang Jia (Nanyang Technological University): Embodied Intelligence, Robotics, Computer Vision
- Zhengyi Gu (Nanyang Technological University, Singapore)
- Yansong Tang (Beijing University of Posts and Telecommunications, Beijing, China)
- Ziwei Wang (Nanyang Technological University, Singapore)