RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation

📅 2025-10-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing imitation learning datasets contain only successful trajectories, lacking failure and recovery examples, which impairs the out-of-distribution (OOD) generalization of Vision-Language-Action (VLA) models—particularly under minor perturbations that induce policy deviation. To address this, we propose an automated OOD data augmentation framework based on exploratory sampling: first, an offline reinforcement learning–trained action-value network identifies suboptimal actions to deliberately trigger OOD states; second, rollout-guided adaptive exploration generates high-quality failure-recovery trajectories. The framework is fully automated—requiring no human intervention—and seamlessly integrates into any VLA training pipeline. Evaluated on the LIBERO benchmark and real-world robotic tasks, our approach significantly improves model robustness to distributional shifts, operational stability, and cross-scenario generalization performance.
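The first stage of the pipeline described above uses an offline-RL action-value network to flag sub-optimal actions. A minimal sketch of that idea, assuming a toy critic `q_value` and an advantage-style threshold (the function names and the `margin` parameter are illustrative, not from the paper):

```python
import numpy as np

# Hedged sketch (not the paper's code): flag an action as sub-optimal when
# its value under a learned critic Q(s, a) falls well below the value of
# the policy's nominal action at the same state.
def q_value(state, action):
    # Toy stand-in critic: value peaks when the action matches 0.1 * state.
    return -float(np.sum((action - 0.1 * state) ** 2))

def is_suboptimal(state, action, policy_action, margin=0.5):
    """Advantage-style test: Q(s, a) < Q(s, pi(s)) - margin."""
    return q_value(state, action) < q_value(state, policy_action) - margin

s = np.ones(3)
pi_a = 0.1 * s
print(is_suboptimal(s, pi_a + 2.0, pi_a))  # True: far from the nominal action
print(is_suboptimal(s, pi_a, pi_a))        # False: the nominal action itself
```

In the full framework such flagged actions are the ones deliberately executed to push the rollout into OOD states.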

📝 Abstract
Vision-Language-Action models (VLAs) have demonstrated remarkable performance on complex robotic manipulation tasks through imitation learning. However, existing imitation learning datasets contain only successful trajectories and lack failure or recovery data, especially for out-of-distribution (OOD) states where the robot deviates from the main policy due to minor perturbations or errors, leading VLA models to struggle with states deviating from the training distribution. To this end, we propose an automated OOD data augmentation framework named RESample through exploratory sampling. Specifically, we first leverage offline reinforcement learning to obtain an action-value network that accurately identifies sub-optimal actions under the current manipulation policy. We further sample potential OOD states from trajectories via rollout, and design an exploratory sampling mechanism that adaptively incorporates these action proxies into the training dataset to ensure efficiency. Subsequently, our framework explicitly encourages the VLAs to recover from OOD states and enhances their robustness against distributional shifts. We conduct extensive experiments on the LIBERO benchmark as well as real-world robotic manipulation tasks, demonstrating that RESample consistently improves the stability and generalization ability of VLA models.
Problem

Research questions and friction points this paper is trying to address.

Augmenting robotic manipulation datasets with failure recovery data
Addressing out-of-distribution state robustness in vision-language-action models
Improving VLA generalization through exploratory sampling of sub-optimal actions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated OOD data augmentation via exploratory sampling
Leverages offline RL to identify sub-optimal actions
Enhances VLA robustness against distributional shifts
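The exploratory-sampling loop implied by these points can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: `q_value`, `base_policy`, `env_step`, and the `trigger_every` schedule are all hypothetical stand-ins for the learned critic, the imitation policy, the environment, and the paper's adaptive sampling mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_value(state, action):
    # Toy critic: actions far from the nominal one score low ("sub-optimal").
    return -float(np.sum((action - 0.1 * state) ** 2))

def base_policy(state):
    return 0.1 * state  # imitation policy's nominal action

def env_step(state, action):
    return state + action  # trivial dynamics for illustration

def resample_rollout(state, horizon=10, k=8, trigger_every=3):
    """Sketch of exploratory sampling: at fixed intervals, take the
    lowest-Q candidate action to deliberately trigger an OOD state;
    otherwise follow the base policy, yielding a recovery segment."""
    trajectory = []
    for t in range(horizon):
        if t % trigger_every == 0:
            # Sample k perturbed candidate actions, keep the worst under Q.
            candidates = base_policy(state) + rng.normal(0.0, 0.5, size=(k, state.shape[0]))
            scores = [q_value(state, a) for a in candidates]
            action = candidates[int(np.argmin(scores))]  # sub-optimal -> OOD trigger
        else:
            action = base_policy(state)                  # recovery behaviour
        trajectory.append((state.copy(), action))
        state = env_step(state, action)
    return trajectory

traj = resample_rollout(np.zeros(3))
print(len(traj))  # 10 (state, action) pairs for the augmented dataset
```

The resulting failure-then-recovery segments are what get folded back into the VLA training set, with no human labeling in the loop.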
👥 Authors
- Yuquan Xue (Nanyang Technological University, Singapore)
- Guanxing Lu (Tsinghua University): VLA, RL, Robotics, 3D Vision
- Zhenyu Wu (Tsinghua University, Beijing, China)
- Chuanrui Zhang (Tsinghua University): Computer Vision
- Bofang Jia (Nanyang Technological University): Embodied Intelligence, Robotics, Computer Vision
- Zhengyi Gu (Nanyang Technological University, Singapore)
- Yansong Tang (Beijing University of Posts and Telecommunications, Beijing, China)
- Ziwei Wang (Nanyang Technological University, Singapore)