DriveAnchor: Progressive Anchor-based Flow Learning for Autonomous Driving Planning

2026-05-30
Citations: 0
Influential: 0

AI Summary
This work addresses insufficient behavioral diversity, poor controllability, and low safety in autonomous driving planning with a three-stage framework. First, it replaces the Gaussian prior with a structured trajectory vocabulary built by farthest-point sampling, so that behavioral diversity follows from vocabulary coverage. Second, it introduces an Energy Field conditioned on static road geometry that relocates anchors toward user-specified corridors, giving corridor control without differentiable guidance. Third, it applies zeroth-order reinforcement learning as a direction search in anchor space, avoiding log-likelihood computation and ODE-to-SDE conversion. Evaluated on approximately 2 million held-out driving scenarios, the method reduces near-range collision rates by 89% and improves mean reward by 32% without degrading imitation accuracy, runs inference in 2.06 ms on NVIDIA Drive Orin, and has been validated in real-world vehicle testing.
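To make the vocabulary-construction step concrete, here is a minimal sketch of farthest-point sampling over a pool of candidate trajectories. It is not the paper's implementation: the distance metric (Euclidean distance over waypoints), the function names, and the toy trajectories are all assumptions chosen for illustration, and the summary does not say how the pool is built or which metric is used.

```python
import math


def trajectory_distance(a, b):
    """Euclidean distance between two trajectories given as
    equal-length lists of (x, y) waypoints (assumed metric)."""
    return math.sqrt(
        sum((ax - bx) ** 2 + (ay - by) ** 2 for (ax, ay), (bx, by) in zip(a, b))
    )


def farthest_point_sampling(trajectories, k, start=0):
    """Greedily pick k indices so that each new pick is the trajectory
    farthest from everything already chosen."""
    if not 0 < k <= len(trajectories):
        raise ValueError("k must be between 1 and len(trajectories)")
    chosen = [start]
    # min_dist[i] is the distance from trajectory i to its nearest chosen one.
    min_dist = [trajectory_distance(t, trajectories[start]) for t in trajectories]
    while len(chosen) < k:
        nxt = max(range(len(trajectories)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, t in enumerate(trajectories):
            d = trajectory_distance(t, trajectories[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


# Toy pool: straight, nearly straight, left turn, right turn.
pool = [
    [(1.0, 0.0), (2.0, 0.0)],
    [(1.0, 0.1), (2.0, 0.2)],
    [(1.0, 1.0), (2.0, 2.0)],
    [(1.0, -1.0), (2.0, -2.0)],
]
vocabulary = farthest_point_sampling(pool, k=3)
```

With a budget of three entries, the near-duplicate of the straight trajectory is skipped in favor of the two turns, which is the sense in which coverage of the pool yields diversity.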
Abstract
We present DriveAnchor, a three-stage framework for autonomous driving planning that achieves behavioral diversity, controllability, and safety in a composable pipeline. Demonstration Flow Pretraining replaces the unstructured Gaussian prior with a vocabulary of 2,398 trajectory shapes constructed by farthest-point sampling, structurally grounding behavioral diversity in vocabulary coverage. Guided Flow Post-training jointly post-trains an Energy Field module with flow matching (FM), conditioning the Energy Field on static road geometry alone, to relocate anchors toward user-specified corridor polygons before flow generation, adding controllability without differentiable guidance; after Stage 2, new corridor presets require only Energy Field updates, not FM retraining. Reward-Refined Flow Fine-tuning applies zeroth-order reinforcement learning to align each anchor's output with collision-avoidance objectives: because the flow-matching model is a deterministic feedforward network in single-step mode, each anchor uniquely determines the output trajectory, reducing reward optimization to a direction search in anchor space without log-likelihood computation or ODE-to-SDE conversion. Evaluated on approximately 2 million held-out driving scenarios, DriveAnchor reduces near-range collision rates by 89% and improves mean reward by 32% without degradation in imitation accuracy, with 2.06 ms inference on NVIDIA Drive Orin. DriveAnchor has been validated through real-world vehicle testing, confirming its practicality for production deployment.
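The claim that a deterministic anchor-to-trajectory map turns reward optimization into a search in anchor space can be illustrated with a small sketch. Everything here is hypothetical: `planner` stands in for the single-step flow-matching model, `reward` is a toy clearance penalty around one obstacle, and the keep-if-better random search is a generic zeroth-order method, not necessarily the update rule used in DriveAnchor. The point is only that the loop needs reward evaluations and nothing else: no gradients, no log-likelihoods, no stochastic sampler.

```python
import random


def planner(anchor):
    """Hypothetical stand-in for the single-step flow model:
    a deterministic map from an anchor to an output trajectory."""
    return [(x + 0.5, y) for x, y in anchor]


def reward(trajectory, obstacle=(2.5, 0.0), clearance=1.0):
    """Toy reward: penalize waypoints closer than `clearance` to one obstacle."""
    penalty = 0.0
    for x, y in trajectory:
        d = ((x - obstacle[0]) ** 2 + (y - obstacle[1]) ** 2) ** 0.5
        penalty += max(0.0, clearance - d)
    return -penalty


def zeroth_order_search(anchor, steps=50, n_dirs=8, sigma=0.2, seed=0):
    """Random direction search in anchor space: perturb the anchor,
    evaluate the reward of the resulting trajectory, keep improvements."""
    rng = random.Random(seed)
    best, best_r = anchor, reward(planner(anchor))
    for _ in range(steps):
        for _ in range(n_dirs):
            cand = [
                (x + sigma * rng.gauss(0, 1), y + sigma * rng.gauss(0, 1))
                for x, y in best
            ]
            r = reward(planner(cand))
            if r > best_r:  # only reward values are used
                best, best_r = cand, r
    return best, best_r


start_anchor = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
start_r = reward(planner(start_anchor))
refined_anchor, refined_r = zeroth_order_search(start_anchor)
```

Because `planner` is deterministic, re-evaluating `reward(planner(refined_anchor))` reproduces `refined_r` exactly, which is what lets each anchor be scored by a single forward pass.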
Problem

Research questions and friction points this paper is trying to address.

autonomous driving planning
behavioral diversity
controllability
safety
collision avoidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow Matching
Anchor-based Planning
Energy Field
Zeroth-order Reinforcement Learning
Behavioral Diversity