🤖 AI Summary
This work addresses two limitations: existing visuo-tactile sensors rely on opaque elastomers that sacrifice visual transparency, and RGB-D cameras lose depth reliability in the near field. The authors propose TransTac, a binocular visuo-tactile sensor built around a UV-encoded transparent elastomer. By combining transparent UV-reflective markers with a prior-guided Delaunay stereo matching algorithm, TransTac provides high-resolution contact geometry reconstruction and extended near-field geometric coverage within a single compact device. The matching algorithm improves correspondence robustness for sparse triangulation by approximately 21% over global assignment baselines. In semantic evaluation, TransTac attains up to 83.3% zero-shot recognition accuracy on tactile images, approximately 50 percentage points higher than opaque tactile baselines, and raises class-center similarity with natural images from around 0.2 to over 0.77, all with a hardware cost of about $70.
📝 Abstract
Vision-based tactile sensors (VBTS) recover high-resolution contact geometry but typically rely on opaque elastomer layers that prevent the camera from seeing through the sensor, while RGB-D cameras provide global depth perception yet degrade significantly at close range. To address both limitations, we present TransTac, a transparent ultraviolet (UV)-encoded binocular VBTS that integrates visual observation and marker-based tactile reconstruction within a single compact device. The system employs a transparent elastomer embedded with UV-reflective markers and a prior-guided Delaunay stereo matching algorithm for robust sparse triangulation.
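To make the sparse triangulation step concrete, the following is a minimal sketch of how already-matched marker centers could be lifted to 3D. It assumes a rectified pinhole stereo pair, which the abstract does not state, and it does not reproduce the prior-guided Delaunay matching itself. The function name `triangulate_markers` and the parameters `focal_px`, `baseline_mm`, `cx`, and `cy` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def triangulate_markers(
    left: List[Point2D],
    right: List[Point2D],
    focal_px: float,
    baseline_mm: float,
    cx: float,
    cy: float,
    min_disparity: float = 1e-6,
) -> List[Point3D]:
    """Triangulate matched marker centers from a rectified stereo pair.

    Assumption (not from the paper): both views are rectified, so the
    i-th marker in `left` and `right` lies on the same image row and
    depth follows Z = f * B / d, with disparity d = u_left - u_right.
    """
    points: List[Point3D] = []
    for (ul, vl), (ur, _vr) in zip(left, right):
        disparity = ul - ur
        if disparity <= min_disparity:
            # Non-positive disparity cannot yield a valid depth; skip it.
            continue
        z = focal_px * baseline_mm / disparity
        x = (ul - cx) * z / focal_px
        y = (vl - cy) * z / focal_px
        points.append((x, y, z))
    return points
```

For instance, with `focal_px=400`, `baseline_mm=10`, and principal point `(320, 240)`, a marker seen at `(340, 240)` in the left view and `(300, 240)` in the right view has a disparity of 40 pixels and lands at a depth of 100 mm. Markers with non-positive disparity are dropped here rather than repaired, which is a simplification; a real pipeline would more likely treat them as matching failures.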
To reliably detect densely distributed semitransparent markers, we develop a lightweight detector that enables stable localization under contact and deformation. The proposed prior-guided Delaunay matching improves correspondence robustness by approximately 21% compared with global assignment baselines while maintaining high reconstruction accuracy. In semantic evaluation, TransTac achieves up to 83.3% zero-shot recognition accuracy on tactile images, exceeding opaque tactile baselines by approximately 50 percentage points. Embedding analysis further reveals substantially stronger cross-modal alignment with natural images, with class-center similarity increasing from around 0.2 to over 0.77. Controlled near-distance experiments quantify the degradation of RGB-D depth reliability and demonstrate extended geometric coverage enabled by visuo-tactile integration. Finally, a compact prototype is implemented with an approximate hardware cost of $70.
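As an illustration of the embedding analysis, here is a minimal sketch of one plausible reading of class-center similarity: the cosine similarity between the mean tactile embedding and the mean natural-image embedding of the same class. The abstract does not give the exact definition or the embedding model, so this formulation and the names `class_center`, `cosine`, and `class_center_similarity` are assumptions.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def class_center(embeddings: List[Vector]) -> List[float]:
    """Return the element-wise mean of a non-empty list of embeddings."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def class_center_similarity(
    tactile: Dict[str, List[Vector]],
    natural: Dict[str, List[Vector]],
) -> Dict[str, float]:
    """Per-class cosine similarity between tactile and natural class centers.

    Only classes present in both dictionaries are compared.
    """
    return {
        label: cosine(class_center(tactile[label]), class_center(natural[label]))
        for label in tactile
        if label in natural
    }
```

Under this reading, a value near 1 means the tactile images of a class embed in nearly the same direction as natural images of that class, while a value near 0 means the two modalities are close to orthogonal in embedding space.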