Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance

πŸ“… 2024-06-21
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the limited generalization and robustness of vision-only manipulation policies in complex embodied tasks. We propose a low-cost visuo-tactile pretraining paradigm. Methodologically, we employ the open-source, low-fidelity BeadSight tactile sensor to capture coarse-grained tactile signals, then perform multi-task imitation learning with a shared encoder architecture across tasks. Crucially, downstream execution requires only visual input: no runtime tactile feedback is needed. Our key contribution is an empirical demonstration that pretraining with even low-fidelity tactile signals significantly enhances vision-only policy performance: success rates improve by up to 65% on USB cable plugging, with consistent gains on a longer-horizon drawer pick-and-place task regardless of whether pretraining used a similar, dissimilar, or identical task. This establishes a pathway toward cost-effective, robust embodied intelligence.
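The core pattern in the summary above, pretraining a shared encoder on fused visuo-tactile input and then disabling the tactile stream at deployment, can be sketched in a few lines. This is an illustrative toy, not the paper's architecture: the dimensions, the concatenation-based fusion, and the zero-masking of the missing tactile modality at inference are all assumptions made for the sketch.

```python
import numpy as np

class SharedEncoderPolicy:
    """Toy sketch of pretrain-with-tactile, deploy-vision-only.

    The fusion rule (concatenation) and zero-masking of the absent
    tactile input at inference are illustrative assumptions.
    """

    def __init__(self, vis_dim=8, tac_dim=4, emb_dim=6, seed=0):
        rng = np.random.default_rng(seed)
        # One shared encoder over the fused (vision + tactile) input.
        self.W = rng.normal(size=(vis_dim + tac_dim, emb_dim))
        self.tac_dim = tac_dim

    def encode(self, vision, tactile=None):
        # Pretraining: tactile features are fused with vision.
        # Deployment: the sensor is disabled, so a zero vector stands in
        # and the same shared encoder runs on vision alone.
        if tactile is None:
            tactile = np.zeros(self.tac_dim)
        fused = np.concatenate([vision, tactile])
        return fused @ self.W

policy = SharedEncoderPolicy()
vis = np.ones(8)
z_pretrain = policy.encode(vis, tactile=np.ones(4))  # visuo-tactile pretraining
z_deploy = policy.encode(vis)                        # vision-only inference
print(z_pretrain.shape, z_deploy.shape)              # both embeddings are (6,)
```

Because the two modes share one encoder, the representation learned with tactile supervision is exactly the one consumed at vision-only inference time, which is what lets the pretraining benefit transfer without runtime tactile hardware.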

πŸ“ Abstract
Tactile perception is essential for real-world manipulation tasks, yet the high cost and fragility of tactile sensors can limit their practicality. In this work, we explore BeadSight (a low-cost, open-source tactile sensor) alongside a tactile pre-training approach, as an alternative to precise, pre-calibrated sensors. By pre-training with the tactile sensor and then disabling it during downstream tasks, we aim to enhance robustness and reduce costs in manipulation systems. We investigate whether tactile pre-training, even with a low-fidelity sensor like BeadSight, can improve the performance of an imitation learning agent on complex manipulation tasks. Through visuo-tactile pre-training on both similar and dissimilar tasks, we analyze its impact on a longer-horizon downstream task. Our experiments show that visuo-tactile pre-training improved performance on a USB cable plugging task by up to 65% with vision-only inference. Additionally, on a longer-horizon drawer pick-and-place task, pre-training (whether on a similar, dissimilar, or identical task) consistently improved performance, highlighting the potential for a large-scale visuo-tactile pre-trained encoder.
Problem

Research questions and friction points this paper is trying to address.

Explores low-cost tactile sensors for manipulation tasks.
Investigates tactile pre-training to enhance vision-only manipulation.
Analyzes impact of visuo-tactile pre-training on complex tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the low-cost, open-source BeadSight tactile sensor.
Visuo-tactile pre-training enhances manipulation performance.
Pre-trained encoder improves vision-only policy performance.
πŸ”Ž Similar Papers
No similar papers found.
Selam Gano
Department of Mechanical Engineering, Carnegie Mellon University, United States
Abraham George
PhD Student at Carnegie Mellon University
robotics · reinforcement learning · robotic manipulation
A. Farimani
Department of Mechanical Engineering, Carnegie Mellon University, United States