Real-Time Execution with Autoregressive Policies

📅 2026-06-11

🤖 AI Summary

This work addresses the high rollout latency that autoregressive policies incur under synchronous inference, which limits their use in real-time control where low latency and fast reaction are required. The authors propose an asynchronous inference framework based on temporal tokenization and constrained decoding. The framework enforces strict latency bounds and, for the first time in autoregressive policies, makes parallel multi-trajectory decoding possible. The method keeps actions smooth and improves response speed without giving up the fast convergence and strong generalization of autoregressive approaches. In both simulated and real-world environments, the autoregressive policy outperforms a flow-matching policy of comparable scale and completes tasks substantially faster than under synchronous inference, while meeting strict real-time constraints.
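The link between a capped token count and a latency bound can be illustrated with a little arithmetic. The sketch below is not taken from the paper: it assumes that asynchronous inference is feasible when decoding the next action chunk finishes before the robot has executed the current one, and every name and number in it (`max_tokens_within_budget`, `step_ms`, `per_token_ms`, `overhead_ms`) is hypothetical.

```python
def max_tokens_within_budget(chunk_steps: int, step_ms: int,
                             per_token_ms: int, overhead_ms: int = 0) -> int:
    """Largest token count whose worst-case decode time fits in one chunk.

    Assumption (not from the paper): the time budget is the wall-clock time
    needed to execute one action chunk, i.e. chunk_steps * step_ms, and each
    decoded token costs a fixed per_token_ms after a fixed overhead_ms
    (for example, prompt prefill). All times are integer milliseconds.
    """
    budget_ms = chunk_steps * step_ms
    usable_ms = budget_ms - overhead_ms
    if usable_ms <= 0:
        return 0
    return usable_ms // per_token_ms


def worst_case_latency_ms(num_tokens: int, per_token_ms: int,
                          overhead_ms: int = 0) -> int:
    """Upper bound on decode latency when output length is capped."""
    return overhead_ms + num_tokens * per_token_ms


# Hypothetical numbers: 10-step chunks at a 20 ms control period,
# 10 ms per token, 50 ms fixed overhead.
token_cap = max_tokens_within_budget(10, 20, 10, overhead_ms=50)
latency = worst_case_latency_ms(token_cap, 10, overhead_ms=50)
```

Under these assumed numbers the cap is 15 tokens and the worst-case latency is 200 ms, exactly the duration of one chunk. A shorter tokenization horizon lowers the cap needed per chunk, which is one way to read the role of the horizon in the summary above.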

📝 Abstract

Real-time execution, enabled by asynchronous inference that ensures both smooth action trajectories and fast reactivity, is critical for realistic deployments of large-scale Vision-Language-Action models. However, recent work on real-time execution focuses primarily on variants of diffusion policies, even though real-time execution matters more for autoregressive policies, whose rollouts are slower under synchronous inference. We demonstrate that autoregressive policies can achieve real-time execution by adjusting the tokenization horizon and applying constrained decoding. Together these guarantee strict latency bounds, which in turn allow multi-trajectory decoding to maximize performance. Across simulated and real-world environments, we find that the autoregressive policy consistently outperforms its equivalent-level flow-matching counterpart while completing tasks significantly faster than under synchronous inference. Coupled with the inherent advantages of autoregressive policies, such as faster convergence and better generalizability in instruction-following, these results confirm that autoregressive policies remain a competitive policy type that supports real-time execution.
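As a rough illustration of how constrained decoding can fix the output length, the sketch below restricts greedy decoding to a set of permitted token ids and always emits exactly `num_tokens` tokens. It is a generic example of the technique, not the authors' implementation: `constrained_greedy_decode`, `toy_logits`, and the six-token vocabulary are hypothetical stand-ins for a real policy.

```python
from typing import Callable, List, Sequence, Set


def constrained_greedy_decode(
    next_logits: Callable[[List[int]], Sequence[float]],
    allowed_ids: Set[int],
    num_tokens: int,
) -> List[int]:
    """Greedy decoding restricted to allowed_ids, for exactly num_tokens steps.

    Tokens outside allowed_ids (for example an end-of-sequence id) can never
    be chosen, so the number of decode steps is known in advance.
    """
    candidates = sorted(allowed_ids)  # fixed order keeps ties deterministic
    tokens: List[int] = []
    for _ in range(num_tokens):
        logits = next_logits(tokens)
        tokens.append(max(candidates, key=lambda i: logits[i]))
    return tokens


def toy_logits(prefix: List[int]) -> List[float]:
    """Hypothetical policy: ids 0-3 are action tokens, 4 is EOS, 5 is text."""
    logits = [0.0, 1.0, 2.0, 0.5, 5.0, 4.0]
    logits[len(prefix) % 4] += 3.0  # preference shifts with the prefix length
    return logits


action_ids = {0, 1, 2, 3}
decoded = constrained_greedy_decode(toy_logits, action_ids, num_tokens=3)
```

Without the constraint, the toy model would pick the EOS id at the first step, since it has the highest raw logit. With it, the decoder returns three action tokens. Because the step count is fixed, decode time is bounded by the step count times the per-token cost.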

Problem

Research questions and friction points this paper is trying to address.

real-time execution

autoregressive policies

synchronous inference

latency bounds

Vision-Language-Action models

Innovation

Methods, ideas, or system contributions that make the work stand out.

autoregressive policies

real-time execution

constrained decoding