Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation

📅 2026-06-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing world action models in contact-rich robotic manipulation: they rely primarily on vision, so cues that arise from physical contact are missed. The authors propose Dream-Tac, a unified visuotactile world action model that jointly models actions, future visual observations, and tactile dynamics, using contact-gated visuotactile fusion and a contact-aware attention bias. To support real-time deployment, they add a dual-level acceleration strategy: the contact-aware bias is reformulated to preserve the fused attention path during training, and cache-based diffusion acceleration is applied at inference. Across six contact-rich manipulation tasks, the method improves action accuracy by 31.7% on average, with up to 2.9× faster training and up to 1.8× faster inference.

📝 Abstract

World action models inherit the predictive capability of world models, enabling action generation to be guided by anticipated future observations. However, they rely primarily on vision and often fail in contact-rich manipulation, where critical cues arise from physical interaction. In this paper, we propose Dream-Tac, a unified Tactile-World Action Model that jointly models actions, future visual observations, and tactile dynamics. Specifically, Dream-Tac introduces (i) contact-gated visuotactile fusion to selectively integrate tactile signals and (ii) a contact-aware attention bias to better regulate cross-modal interactions during manipulation. To support real-time deployment, we further design a dual-level acceleration strategy, reformulating the contact-aware bias to preserve the fused attention path during training and introducing cache-based diffusion acceleration at inference, achieving up to 2.9× faster training and 1.8× faster inference. Across six contact-rich manipulation tasks, Dream-Tac improves action accuracy by 31.7% on average, demonstrating the effectiveness of unified visuotactile world modeling. Code is available at https://github.com/LYFCLOUDFAN/Dream-Tac.
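The abstract does not give the exact form of the contact gate or the attention bias, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a scalar contact signal (`force`), a sigmoid gate, additive fusion, and an additive penalty on tactile keys when no contact is detected. All function names, the threshold, and the penalty value are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def contact_gate(force: float, threshold: float = 0.5, sharpness: float = 10.0) -> float:
    """Soft gate in (0, 1): near 0 without contact, near 1 under firm contact.

    `threshold` and `sharpness` are illustrative values, not taken from the paper.
    """
    return sigmoid(sharpness * (force - threshold))


def gated_fusion(visual, tactile, gate):
    """Add tactile features to visual features, scaled by the contact gate."""
    return [v + gate * t for v, t in zip(visual, tactile)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(scores, is_tactile_key, gate, max_penalty=4.0):
    """Attention weights with an additive bias on tactile keys.

    When the gate is low (no contact), tactile keys receive a negative bias
    and are down-weighted; when the gate is 1, the bias vanishes.
    """
    bias = [-(1.0 - gate) * max_penalty if t else 0.0 for t in is_tactile_key]
    return softmax([s + b for s, b in zip(scores, bias)])


# Two visual keys followed by two tactile keys, all with equal raw scores.
mask = [False, False, True, True]
free_space = biased_attention([0.0] * 4, mask, contact_gate(0.0))
in_contact = biased_attention([0.0] * 4, mask, contact_gate(1.0))
print(free_space)  # tactile keys get little weight
print(in_contact)  # weights are close to uniform
```

Because the bias is purely additive on the attention scores, a sketch like this can be passed to a fused attention kernel as an additive mask instead of requiring a custom attention implementation, which is one plausible reading of "preserve the fused attention path".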

Problem

Research questions and friction points this paper is trying to address.

tactile sensing

contact-rich manipulation

world action models

visuotactile fusion

robot manipulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

tactile-world action model

contact-gated visuotactile fusion

contact-aware attention bias

dual-level acceleration

diffusion acceleration
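The page does not describe how Dream-Tac's cache-based diffusion acceleration works internally. As a rough illustration of the general family of techniques, the sketch below reuses expensive intermediate features across neighbouring denoising steps and recomputes them only periodically. The split into `backbone` and `head`, the `refresh_every` schedule, and the toy update rule are all assumptions made for this example.

```python
from typing import Callable, List

Vector = List[float]


def cached_denoise(
    x: Vector,
    num_steps: int,
    backbone: Callable[[Vector, int], Vector],
    head: Callable[[Vector, Vector, int], Vector],
    refresh_every: int = 2,
) -> Vector:
    """Run a denoising loop, recomputing backbone features only every
    `refresh_every` steps and reusing the cached features in between."""
    cached = None
    for step in range(num_steps):
        if cached is None or step % refresh_every == 0:
            cached = backbone(x, step)  # expensive path, refreshed periodically
        noise_estimate = head(x, cached, step)  # cheap path, runs every step
        x = [xi - ni / num_steps for xi, ni in zip(x, noise_estimate)]
    return x


# Toy stand-ins for the expensive and cheap parts of a denoiser.
calls = {"backbone": 0}


def toy_backbone(x: Vector, step: int) -> Vector:
    calls["backbone"] += 1
    return [0.5 * xi for xi in x]


def toy_head(x: Vector, feats: Vector, step: int) -> Vector:
    return [xi + fi for xi, fi in zip(x, feats)]


out = cached_denoise([1.0, -2.0], num_steps=8, backbone=toy_backbone,
                     head=toy_head, refresh_every=2)
print(calls["backbone"])  # backbone runs on steps 0, 2, 4, 6
```

The trade-off in this kind of scheme is that a larger `refresh_every` saves more compute but lets the cached features drift further from what a full recomputation would produce.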

🔎 Similar Papers

Tac-Man: Tactile-Informed Prior-Free Manipulation of Articulated Objects

2024-03-04 | IEEE Transactions on Robotics | Citations: 15

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey

2024-04-28 | arXiv.org | Citations: 15
