Dynamic Execution Horizon Prediction for Chunk-based Robot Policies

📅 2026-06-09
🤖 AI Summary
This work addresses a limitation of existing action-chunk-based robot policies: they use a fixed execution horizon, which makes it hard to balance precision in fine manipulation against efficiency in long-horizon tasks and leaves no room to replan while a chunk is executing. To overcome this, it proposes a dynamic horizon prediction mechanism that leaves the pretrained black-box action-chunk policy unmodified, with no fine-tuning, and adds a lightweight branch that predicts how many steps of each chunk to execute. This branch is trained via online reinforcement learning and integrates with prevailing policy architectures such as diffusion models and vision-language-action models. Experiments show that the approach improves task success rates by a large margin across diverse high-precision and long-horizon tasks, automatically shortening the horizon during delicate phases and extending it during unconstrained motion. The summary describes it as the first method to achieve dynamic horizon adaptation for black-box action-chunk policies.
📝 Abstract
Action chunking has become a standard design in modern robot policies, from diffusion/flow policies to vision-language-action models, where the policy predicts a sequence of actions and executes a fixed number of them instead of acting one step at a time. However, this paradigm relies on a key assumption: a fixed execution horizon. During chunk execution, the policy operates open-loop, which is particularly problematic for fine-grained manipulation tasks that require frequent replanning. In practice, the execution horizon is typically chosen through empirical tuning and is highly task-dependent. To this end, we propose Dynamic Execution Horizon Prediction (DEHP), an effective method that trains a lightweight execution-horizon prediction branch using online reinforcement learning while keeping the pretrained chunk policy completely frozen. This makes the method compatible with black-box chunk policies and isolates the effect of adapting the execution horizon from changes to the underlying action generator. Across our evaluations, DEHP improves the success rate of different high-precision and long-horizon manipulation tasks by a large margin. Our qualitative analysis further shows that DEHP predicts shorter execution horizons during fine-grained stages of the task and longer horizons during free-space motion. In this way, DEHP balances the efficiency of open-loop chunk execution with the reactivity of closed-loop single-step control. Project page: https://dehp-chunking.github.io/
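The execution loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `run_episode`, `toy_chunk_policy`, `toy_horizon_rule`, and the 1-D environment are hypothetical names introduced here, and a hand-written distance rule stands in for the learned horizon branch, which DEHP trains with online reinforcement learning. The sketch only shows the control flow: the frozen policy proposes a chunk, a separate component decides how many of its actions to run open-loop, and the policy is then queried again.

```python
from typing import Callable, List, Tuple

TARGET = 10.0      # toy goal position (assumption for this example)
CHUNK_LEN = 8      # toy chunk length (assumption for this example)


def toy_chunk_policy(obs: float) -> List[float]:
    """Stand-in for a frozen black-box chunk policy: plans unit steps toward TARGET."""
    pos, chunk = obs, []
    for _ in range(CHUNK_LEN):
        action = max(-1.0, min(1.0, TARGET - pos))
        chunk.append(action)
        pos += action
    return chunk


def toy_horizon_rule(obs: float, chunk: List[float]) -> int:
    """Hand-written placeholder for the horizon branch (NOT the learned predictor):
    run the full chunk far from the goal, replan every step when close to it."""
    return len(chunk) if abs(TARGET - obs) > 4.0 else 1


def run_episode(
    chunk_policy: Callable[[float], List[float]],
    horizon_predictor: Callable[[float, List[float]], int],
    step_env: Callable[[float, float], float],
    obs: float,
    max_steps: int,
) -> Tuple[float, List[int]]:
    """Roll out the chunk policy, re-querying it after each predicted horizon."""
    horizons: List[int] = []
    steps = 0
    while steps < max_steps:
        chunk = chunk_policy(obs)            # policy is only called, never modified
        h = horizon_predictor(obs, chunk)    # how many actions to run open-loop
        h = max(1, min(h, len(chunk), max_steps - steps))
        for action in chunk[:h]:
            obs = step_env(obs, action)
        steps += h
        horizons.append(h)
    return obs, horizons


final_obs, horizons = run_episode(
    toy_chunk_policy, toy_horizon_rule, lambda o, a: o + a, obs=0.0, max_steps=12
)
print(final_obs, horizons)
```

In this toy run the rule picks a long horizon for the initial free-space motion and single-step horizons near the goal, mirroring the qualitative behavior the abstract reports. Because the chunk policy is only ever called as a function, the same loop would apply to any policy that returns a sequence of actions.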
Problem

Research questions and friction points this paper is trying to address.

execution horizon
action chunking
robot policies
fine-grained manipulation
open-loop execution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Execution Horizon
Chunk-based Policy
Online Reinforcement Learning
Open-loop Execution
Adaptive Control
Yuchi Zhao
Department of Computer Science, University of Toronto, 40 St George St., Toronto, ON M5S 2E4, Canada; Vector Institute for Artificial Intelligence, W1140-108 College St., Schwartz Reisman Innovation Campus, Toronto, ON M5G 0C6, Canada
Miroslav Bogdanovic
University of Toronto
Reinforcement Learning, Deep Learning, Robotics
Arjun Sohal
Department of Computer Science, University of Toronto, 40 St George St., Toronto, ON M5S 2E4, Canada
Liyu Tao
Department of Computer Science, University of Toronto, 40 St George St., Toronto, ON M5S 2E4, Canada
Kourosh Darvish
Scientist, University of Toronto
Robot Learning, Shared Autonomy, Human-Robot Collaboration, Humanoid Robot Teleoperation
Alán Aspuru-Guzik
Department of Computer Science, University of Toronto, 40 St George St., Toronto, ON M5S 2E4, Canada; Vector Institute for Artificial Intelligence, W1140-108 College St., Schwartz Reisman Innovation Campus, Toronto, ON M5G 0C6, Canada; Acceleration Consortium, 700 University Ave., Toronto, ON M7A 2S4, Canada; Department of Chemistry, University of Toronto, 80 St. George St., Toronto, ON M5S 3H6, Canada; Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON M5S 3E
Florian Shkurti
Assistant Professor, Computer Science, University of Toronto
Robotics, Machine Learning, Computer Vision, Artificial Intelligence
Animesh Garg
Georgia Institute of Technology, University of Toronto
Robotic Manipulation, Robot Learning, Reinforcement Learning, Machine Learning, Computer Vision