🤖 AI Summary
This work addresses a limitation of existing action-chunk-based robot policies: they execute a fixed number of actions per chunk, so they struggle to balance precision in fine manipulation against efficiency in long-horizon tasks, and they cannot replan while a chunk is executing. To overcome this, the authors propose a dynamic horizon prediction mechanism that leaves the pretrained black-box action-chunk policy unmodified (no fine-tuning) and adds a lightweight branch that predicts how many steps of each chunk to execute. This branch is trained via online reinforcement learning and is compatible with prevailing policy architectures such as diffusion models and vision-language-action models. Experiments show that the approach improves task success rates by a large margin across high-precision and long-horizon manipulation tasks, shortening the horizon during delicate phases and extending it during free-space motion. The summary describes it as the first method to achieve dynamic horizon adaptation for black-box action-chunk policies.
📝 Abstract
Action chunking has become a standard design in modern robot policies, from diffusion/flow policies to vision-language-action models, where the policy predicts a sequence of actions and executes a fixed number of them instead of acting one step at a time. However, this paradigm relies on a key assumption: a fixed execution horizon. During chunk execution, the policy operates open-loop, which is particularly problematic for fine-grained manipulation tasks that require frequent replanning. In practice, the execution horizon is typically chosen through empirical tuning and is highly task-dependent. To address this, we propose Dynamic Execution Horizon Prediction (DEHP), a method that trains a lightweight execution-horizon prediction branch using online reinforcement learning while keeping the pretrained chunk policy completely frozen. This makes the method compatible with black-box chunk policies and isolates the effect of adapting the execution horizon from changes to the underlying action generator. Across our evaluations, DEHP improves the success rate of different high-precision and long-horizon manipulation tasks by a large margin. Our qualitative analysis further shows that DEHP predicts shorter execution horizons during fine-grained stages of the task and longer horizons during free-space motion. In this way, DEHP balances the efficiency of open-loop chunk execution with the reactivity of closed-loop single-step control. Project page: https://dehp-chunking.github.io/
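
To make the execution loop described in the abstract concrete, here is a minimal Python sketch on a toy 1D reaching task. It is not the authors' implementation. Every name in it (`frozen_chunk_policy`, `placeholder_horizon`, `rollout`, `CHUNK_LEN`) is hypothetical, and the horizon rule is a hand-written stand-in for the branch that DEHP trains with online reinforcement learning. The sketch only illustrates the control flow: query the frozen policy for a chunk, execute a predicted number of its actions open-loop, then replan.

```python
from typing import Callable, List, Tuple

CHUNK_LEN = 8  # assumed chunk length of the frozen policy (hypothetical value)


def frozen_chunk_policy(x: float, goal: float) -> List[float]:
    """Hypothetical black-box policy: plans CHUNK_LEN unit steps toward the
    goal using only the observation available when the chunk is requested."""
    direction = 1.0 if goal > x else -1.0
    return [direction] * CHUNK_LEN


def placeholder_horizon(x: float, goal: float) -> int:
    """Hand-written stand-in for the learned horizon branch (an assumption,
    not the paper's method): short horizons near the goal, long ones far away."""
    distance = abs(goal - x)
    return max(1, min(CHUNK_LEN, int(distance)))


def rollout(
    horizon_fn: Callable[[float, float], int],
    start: float = 0.0,
    goal: float = 10.0,
    max_steps: int = 40,
) -> Tuple[float, int, List[int]]:
    """Run the chunked control loop; return final state, steps, and horizons."""
    x, steps, horizons = start, 0, []
    while abs(goal - x) > 1e-9 and steps < max_steps:
        chunk = frozen_chunk_policy(x, goal)  # the policy itself is never modified
        h = max(1, min(len(chunk), horizon_fn(x, goal)))  # clamp to a valid length
        horizons.append(h)
        for action in chunk[:h]:  # open-loop execution of the first h actions
            x += action
            steps += 1
    return x, steps, horizons


if __name__ == "__main__":
    for name, fn in [
        ("fixed h=8", lambda x, g: CHUNK_LEN),
        ("fixed h=1", lambda x, g: 1),
        ("dynamic", placeholder_horizon),
    ]:
        final_x, steps, horizons = rollout(fn)
        print(f"{name}: final x={final_x}, steps={steps}, replans={len(horizons)}")
```

In this toy setup, always executing the full chunk overshoots the goal and oscillates until the step budget runs out, replanning every step reaches the goal but queries the policy ten times, and the dynamic rule reaches the goal with two queries (horizons 8, then 2). The same trade-off between open-loop efficiency and closed-loop reactivity is what the learned branch is meant to manage, though real tasks and the learned predictor are far richer than this example.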