DriveAnchor: Progressive Anchor-based Flow Learning for Autonomous Driving Planning

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets three limitations of autonomous driving planners: insufficient behavioral diversity, poor controllability, and weak safety. It proposes a three-stage framework. First, it replaces the Gaussian prior with a structured vocabulary of 2,398 trajectory shapes built by farthest-point sampling, so that behavioral diversity follows from vocabulary coverage. Second, it introduces an Energy Field conditioned only on static road geometry that relocates anchors toward user-specified corridor polygons, giving corridor control without differentiable guidance. Third, it applies zeroth-order reinforcement learning as a direction search in anchor space, avoiding log-likelihood computation and ODE-to-SDE conversion. Evaluated on approximately 2 million held-out driving scenarios, the method reduces near-range collision rates by 89% and improves mean reward by 32% without degrading imitation accuracy, runs inference in 2.06 ms on NVIDIA Drive Orin, and has been validated in real-world vehicle testing.
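The vocabulary construction step can be illustrated with a minimal sketch of greedy farthest-point sampling. This is not the authors' code: the distance metric (mean waypoint distance), the function names `trajectory_distance` and `farthest_point_sampling`, and the toy trajectories are all assumptions made for illustration.

```python
import math

def trajectory_distance(a, b):
    """Mean Euclidean distance between matching waypoints.

    Assumed metric for this sketch; both trajectories must have the same length.
    """
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def farthest_point_sampling(trajectories, k, start=0):
    """Greedily pick k indices; each new pick is the trajectory farthest
    from the set chosen so far."""
    if not 0 < k <= len(trajectories):
        raise ValueError("k must be between 1 and len(trajectories)")
    chosen = [start]
    # min_dist[i] = distance from trajectory i to its nearest chosen trajectory
    min_dist = [trajectory_distance(t, trajectories[start]) for t in trajectories]
    while len(chosen) < k:
        nxt = max(range(len(trajectories)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, t in enumerate(trajectories):
            d = trajectory_distance(t, trajectories[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen

# Toy pool of two-waypoint trajectories (hypothetical data).
pool = [
    [(0.0, 0.0), (1.0, 0.0)],   # straight
    [(0.0, 0.0), (1.0, 0.1)],   # near-duplicate of straight
    [(0.0, 0.0), (1.0, 6.0)],   # hard left
    [(0.0, 0.0), (1.0, -5.0)],  # hard right
]
vocabulary_indices = farthest_point_sampling(pool, k=3)
```

On this toy pool the near-duplicate trajectory is skipped in favor of the two distinct turns, which is the coverage property the summary attributes to the vocabulary.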

📝 Abstract

We present DriveAnchor, a three-stage framework for autonomous driving planning that achieves behavioral diversity, controllability, and safety in a composable pipeline. Demonstration Flow Pretraining replaces the unstructured Gaussian prior with a vocabulary of 2,398 trajectory shapes constructed by farthest-point sampling, structurally grounding behavioral diversity in vocabulary coverage. Guided Flow Post-training jointly post-trains an Energy Field module with flow matching (FM), conditioning the Energy Field on static road geometry alone, to relocate anchors toward user-specified corridor polygons before flow generation, adding controllability without differentiable guidance; after Stage 2, new corridor presets require only Energy Field updates, not FM retraining. Reward-Refined Flow Fine-tuning applies zeroth-order reinforcement learning to align each anchor's output with collision-avoidance objectives: because the flow-matching model is a deterministic feedforward network in single-step mode, each anchor uniquely determines the output trajectory, reducing reward optimization to a direction search in anchor space without log-likelihood computation or ODE-to-SDE conversion. Evaluated on approximately 2 million held-out driving scenarios, DriveAnchor reduces near-range collision rates by 89% and improves mean reward by 32% without degradation in imitation accuracy, with 2.06 ms inference on NVIDIA Drive Orin. DriveAnchor has been validated through real-world vehicle testing, confirming its practicality for production deployment.
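The claim that a deterministic anchor-to-trajectory map reduces reward optimization to a direction search can be sketched with a generic zeroth-order update. The example below uses antithetic random-direction search, a standard gradient-free estimator chosen here for illustration; it is not the paper's training procedure. The `planner` and `reward` functions are hypothetical stand-ins, and all hyperparameters are assumptions.

```python
import random

def zeroth_order_step(anchor, planner, reward, sigma=0.1, lr=0.2, n_dirs=16, rng=random):
    """One antithetic random-direction update of a flat anchor vector.

    Only reward evaluations are used: no gradients and no log-likelihoods.
    """
    dim = len(anchor)
    grad = [0.0] * dim
    for _ in range(n_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = [a + sigma * d for a, d in zip(anchor, u)]
        minus = [a - sigma * d for a, d in zip(anchor, u)]
        # The planner is deterministic, so each anchor yields exactly one reward.
        delta = reward(planner(plus)) - reward(planner(minus))
        for i in range(dim):
            grad[i] += delta * u[i] / (2.0 * sigma * n_dirs)
    # Move the anchor along the estimated reward-ascent direction.
    return [a + lr * g for a, g in zip(anchor, grad)]

# Hypothetical stand-ins: an identity "planner" and a reward that peaks at a
# safe target point. A real planner would map an anchor to a full trajectory.
def planner(anchor):
    return list(anchor)

def reward(trajectory):
    target = (3.0, 4.0)
    return -sum((x - t) ** 2 for x, t in zip(trajectory, target))

rng = random.Random(0)
anchor = [0.0, 0.0]
initial_reward = reward(planner(anchor))
for _ in range(50):
    anchor = zeroth_order_step(anchor, planner, reward, rng=rng)
final_reward = reward(planner(anchor))
```

Because the update needs only paired reward evaluations, it applies to any deterministic single-step generator, which is why no ODE-to-SDE conversion is required in this setting.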

Problem

Research questions and friction points this paper is trying to address.

autonomous driving planning

behavioral diversity

controllability

safety

collision avoidance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow Matching

Anchor-based Planning

Energy Field