Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing one-step generative imitation learning methods discard the intermediate trajectory evolution that iterative samplers use to correct actions, so they struggle to keep generated actions on the valid action manifold and control accuracy degrades. This work proposes a one-step action generation framework that avoids explicit drift field estimation. It compares a local conditional expert geometry against a global reference geometry and uses the result to adaptively weight a scalar potential objective, which corrects drift implicitly during training. According to the summary, this is the first approach to build an implicit drift-correction mechanism into the training of a one-step policy, sidestepping the ill-posedness of explicit vector field estimation under sparse conditional demonstrations. Experiments on 2D, 3D, and real-world robotic tasks show that the method improves upon explicit drift-based approaches and is competitive with strong one-step baselines, with better adherence to valid action manifolds.

📝 Abstract

Generative action policies based on diffusion or flow matching excel in behavior cloning, yet their iterative sampling is prohibitive for high-frequency robot control. While recent one-step formulations alleviate this latency, they inevitably discard the intermediate trajectory evolution that provides crucial action correction. Directly recovering this mechanism by explicitly estimating a training-time drifting field is mathematically ill-posed due to extreme conditional demonstration sparsity. We introduce Implicit Drifting Policy (IDP), a one-step imitation learning framework that brings the training-time correction of Drifting into policy learning without explicit vector field estimation. IDP extracts a conditional expert geometry from the local variation of observation-similar expert actions, and compares it against a global reference geometry to isolate condition-specific constraints. This local geometric structure adaptively weights a scalar potential objective. Combined with an expert-proximal terminal evaluation, IDP directly enforces manifold constraints on the one-step generator during training. Extensive evaluations across 2D, 3D, and real-world manipulation tasks show IDP effectively maintains adherence to valid action manifolds, improving upon explicit drifting methods and achieving competitive performance with strong one-step baselines.
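The abstract does not give formulas, so the following is only a minimal sketch of the general idea of weighting a scalar objective by comparing local and global action geometry. Everything in it is an assumption made for illustration and not the paper's formulation: the function names `conditional_geometry_weights` and `scalar_potential`, the k-nearest-neighbor lookup in observation space, the per-dimension variance ratio used as the comparison, and the nearest-expert squared distance used as the potential.

```python
import numpy as np


def conditional_geometry_weights(obs, actions, query_obs, k=8, eps=1e-6):
    """Illustrative per-dimension weights from local vs. global action variance.

    Hypothetical helper: the variance-ratio rule is an assumption, not IDP itself.
    """
    # Find the k expert samples whose observations are closest to the query.
    dists = np.linalg.norm(obs - query_obs, axis=1)
    nn_idx = np.argsort(dists)[:k]

    local_var = actions[nn_idx].var(axis=0)  # spread of observation-similar expert actions
    global_var = actions.var(axis=0)         # spread of all expert actions (reference)

    # Dimensions that are tight locally relative to the global spread are
    # treated as condition-specific constraints and receive larger weights.
    ratio = (global_var + eps) / (local_var + eps)
    weights = ratio / ratio.sum()
    return weights, nn_idx


def scalar_potential(pred_action, actions, nn_idx, weights):
    """Weighted squared distance from a predicted action to its nearest local expert action."""
    diffs = actions[nn_idx] - pred_action
    weighted_sq = (weights * diffs ** 2).sum(axis=1)
    return weighted_sq.min()


# Toy data: action dimension 0 is determined by the observation,
# dimension 1 is unconstrained noise.
rng = np.random.default_rng(0)
obs = rng.uniform(0.0, 1.0, size=(200, 1))
actions = np.column_stack([obs[:, 0], rng.normal(size=200)])

weights, nn_idx = conditional_geometry_weights(obs, actions, np.array([0.5]))
potential = scalar_potential(np.array([0.6, 0.0]), actions, nn_idx, weights)
```

In this toy setup the first action dimension is tightly constrained near any given observation, so it receives most of the weight, and a predicted action that deviates along it is penalized more than one that deviates along the noisy dimension. In a training loop, a scalar of this kind would be evaluated on the one-step generator's output in a differentiable framework. How IDP actually constructs the geometry, the potential, and the expert-proximal terminal evaluation is not specified in the abstract.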

Problem

Research questions and friction points this paper is trying to address.

one-step action generation

imitation learning

drifting policy

conditional expert geometry

action manifold

Innovation

Methods, ideas, or system contributions that make the work stand out.

Implicit Drifting Policy

one-step action generation

conditional expert geometry