RT-cache: Efficient Robot Trajectory Retrieval System

📅 2025-05-14

🤖 AI Summary

Existing vision-language-action (VLA) models incur high per-step inference costs, and the resulting latency can reach minutes per task, which severely hinders real-time robotic deployment. To address this, we propose RT-cache, the first trajectory memory and retrieval framework tailored for VLA models. RT-cache builds a scalable cache of multi-step motion primitives and performs scene-aware, cross-task trajectory retrieval to reuse previously successful experiences. Its core innovation is a trajectory memory pipeline that supports large-scale unsupervised experience accumulation, semantically aligned cross-scene trajectory replay, and rapid adaptation to new scenes from only a few additional samples. Evaluated on real-world benchmarks including Open-X Embodiment, RT-cache reduces average task completion time by 58% and improves success rate by 12.3% over retrieval-free baselines.

📝 Abstract

This paper introduces RT-cache, a novel trajectory-memory pipeline that accelerates real-world robot inference by leveraging big-data retrieval and learning from experience. While modern Vision-Language-Action (VLA) models can handle diverse robotic tasks, they often incur high per-step inference costs, resulting in significant latency, sometimes minutes per task. In contrast, RT-cache stores a large-scale Memory of previously successful robot trajectories and retrieves relevant multistep motion snippets, drastically reducing inference overhead. By integrating a Memory Builder with a Trajectory Retrieval module, we develop an efficient retrieval process that remains tractable even for extremely large datasets. RT-cache flexibly accumulates real-world experiences and replays them whenever the current scene matches past states, adapting quickly to new or unseen environments with only a few additional samples. Experiments on the Open-X Embodiment Dataset and other real-world data demonstrate that RT-cache completes tasks both faster and more successfully than a baseline lacking retrieval, suggesting a practical, data-driven solution for real-time manipulation.
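The store-then-retrieve idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `TrajectoryMemory`, the `snippet_len` and `min_similarity` parameters, the use of cosine similarity, and the brute-force linear scan are all assumptions made for illustration. The abstract does not say how scenes are embedded or how the search is kept tractable at scale, so the embeddings here are plain lists of floats.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class TrajectoryMemory:
    """Hypothetical cache mapping state embeddings to multi-step action snippets."""
    snippet_len: int = 3
    keys: list = field(default_factory=list)      # one embedding per stored state
    snippets: list = field(default_factory=list)  # up to `snippet_len` actions from that state

    def add_trajectory(self, embeddings, actions):
        """Memory-building step: index every state of one successful trajectory."""
        if len(embeddings) != len(actions):
            raise ValueError("need exactly one action per state embedding")
        for t, emb in enumerate(embeddings):
            self.keys.append(list(emb))
            self.snippets.append(list(actions[t:t + self.snippet_len]))

    def retrieve(self, query, min_similarity=0.9):
        """Return (snippet, similarity) for the closest stored state.

        The snippet is None when nothing is similar enough to count as a match.
        Linear scan for clarity only; a large memory would need a proper index.
        """
        best_idx, best_sim = None, -1.0
        for i, key in enumerate(self.keys):
            sim = cosine_similarity(query, key)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx is None or best_sim < min_similarity:
            return None, best_sim
        return self.snippets[best_idx], best_sim


# Toy usage with made-up 2-D embeddings and symbolic actions.
memory = TrajectoryMemory(snippet_len=2)
memory.add_trajectory(
    embeddings=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    actions=["reach", "grasp", "lift"],
)
snippet, sim = memory.retrieve([0.1, 0.99])
print(snippet)  # ['grasp', 'lift']
```

Adding a few trajectories from a new environment with `add_trajectory` makes them retrievable immediately, which is one plausible reading of the "few additional samples" adaptation the abstract describes.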

Problem

Research questions and friction points this paper is trying to address.

Reduces high per-step inference costs in robot tasks

Retrieves relevant motion snippets from past trajectories

Adapts quickly to new environments with minimal samples

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages big-data retrieval for efficient inference

Stores and retrieves multistep motion snippets

Integrates Memory Builder with Trajectory Retrieval
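Why retrieving multistep snippets cuts inference overhead can be sketched as a control loop: one cache hit covers several steps, so the expensive model is queried less often. The function `run_with_cache`, the open-loop replay of a whole snippet, and the fallback to a slow per-step policy on a miss are assumptions for illustration; the summary above does not specify how RT-cache behaves when no match is found.

```python
def run_with_cache(observations, retrieve, slow_policy):
    """Replay cached snippets on a hit; otherwise query the slow policy for one step.

    `retrieve(obs)` returns a list of actions or None, and `slow_policy(obs)`
    returns a single action. Both are hypothetical callables supplied by the caller.
    Returns the executed actions and the number of slow-policy calls.
    """
    executed, slow_calls, pending = [], 0, []
    for obs in observations:
        if not pending:
            # Only look up the cache when the previous snippet is exhausted.
            pending = list(retrieve(obs) or [])
        if pending:
            executed.append(pending.pop(0))
        else:
            slow_calls += 1
            executed.append(slow_policy(obs))
    return executed, slow_calls


# Toy usage: a dict stands in for the trajectory memory.
cache = {"s0": ["a0", "a1", "a2"]}
executed, slow_calls = run_with_cache(
    observations=["s0", "s1", "s2", "s3", "s4"],
    retrieve=cache.get,
    slow_policy=lambda obs: f"vla({obs})",
)
print(executed)    # ['a0', 'a1', 'a2', 'vla(s3)', 'vla(s4)']
print(slow_calls)  # 2
```

In this toy run, five steps need only two slow-policy calls because the first three are served by a single retrieved snippet.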
