Efficient End-to-end Visual Localization for Autonomous Driving with Decoupled BEV Neural Matching

📅 2025-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

High-precision visual localization remains a critical challenge for high-level autonomous driving. Conventional map-matching approaches are sensitive to perception noise and therefore need costly hyper-parameter tuning. This paper proposes an end-to-end localization network that estimates the vehicle’s pose directly from surrounding camera images, without explicitly matching perception results against an HD map. Its core contribution is a decoupled Bird’s-Eye-View (BEV) neural matching pose solver: poses are estimated in a differentiable sampling-based matching module, and the feature representation affected by each pose degree of freedom (DoF) is decoupled so that the sampling space shrinks substantially, which keeps the solver efficient and interpretable. The reported mean absolute errors are 0.19 m longitudinal, 0.13 m lateral, and 0.39° in yaw (decimeter-level accuracy), together with a 68.8% reduction in inference memory usage.

📝 Abstract

Accurate localization plays an important role in high-level autonomous driving systems. Conventional map matching-based localization methods solve for poses by explicitly matching map elements with sensor observations; they are generally sensitive to perception noise and therefore require costly hyper-parameter tuning. In this paper, we propose an end-to-end localization neural network that directly estimates vehicle poses from surrounding images, without explicitly matching perception results with HD maps. To ensure efficiency and interpretability, we propose a decoupled BEV neural matching-based pose solver, which estimates poses in a differentiable sampling-based matching module. Moreover, the sampling space is greatly reduced by decoupling the feature representation affected by each DoF of the pose. The experimental results demonstrate that the proposed network achieves decimeter-level localization, with mean absolute errors of 0.19 m, 0.13 m, and 0.39° in longitudinal position, lateral position, and yaw angle, respectively, while exhibiting a 68.8% reduction in inference memory usage.
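To make "differentiable sampling-based matching" more concrete, here is a minimal sketch of one common way to build such a module: score a set of sampled pose offsets, then take a softmax-weighted average (a soft-argmax) so the estimate stays differentiable with respect to the scores. This is an illustration under stated assumptions, not the paper's network: the 1-D feature vectors, the dot-product score, and the names `match_1d` and `soft_argmax_offset` are all hypothetical.

```python
import numpy as np


def match_1d(bev_feat, map_feat, shifts):
    """Score each candidate integer shift by correlating the shifted
    map features with the observed BEV features (hypothetical 1-D case)."""
    return np.array([float(np.dot(bev_feat, np.roll(map_feat, s))) for s in shifts])


def soft_argmax_offset(scores, candidates, temperature=1.0):
    """Softmax-weighted mean of the candidate offsets.

    Unlike a hard argmax, this is a smooth function of the scores,
    which is what makes a sampling-based matcher trainable end to end.
    """
    logits = np.asarray(scores, dtype=float) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum()
    return float(weights @ np.asarray(candidates, dtype=float))


# Toy data (assumed): the map feature peaks at cell 5, the observation at cell 7,
# so the map must be shifted by +2 cells to align with the observation.
map_feat = np.zeros(16)
map_feat[5] = 1.0
bev_feat = np.zeros(16)
bev_feat[7] = 1.0

shifts = np.arange(-3, 4)  # sampled candidate offsets, in cells
scores = match_1d(bev_feat, map_feat, shifts)
estimate = soft_argmax_offset(scores, shifts, temperature=0.05)
print(estimate)  # close to 2.0 for this toy input
```

A low temperature makes the weighting behave almost like a hard argmax, while a higher one spreads weight across neighbouring candidates; which setting the authors use, if any, is not stated in the abstract.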

Problem

Research questions and friction points this paper is trying to address.

Accurate vehicle pose estimation for autonomous driving

Reducing sensitivity to perception noise in localization

Efficient and interpretable end-to-end neural network solution

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end neural network for vehicle pose estimation

Decoupled BEV neural matching for efficient localization

Reduced sampling space by decoupling feature representation
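The reduction in sampling space can be illustrated with simple counting. The sketch below assumes three pose DoFs (longitudinal, lateral, yaw, matching the errors reported in the abstract) and assumes that decoupling lets each DoF be searched over its own candidate set; the candidate ranges, step sizes, and function names are hypothetical and are not taken from the paper.

```python
from itertools import product


def joint_candidates(xs, ys, yaws):
    """Every (x, y, yaw) combination: the size grows multiplicatively."""
    return list(product(xs, ys, yaws))


def decoupled_candidates(xs, ys, yaws):
    """One independent candidate list per DoF: the size grows additively."""
    return {"x": list(xs), "y": list(ys), "yaw": list(yaws)}


# Assumed search ranges: 21 samples per DoF.
xs = [0.1 * i for i in range(-10, 11)]    # longitudinal offsets in metres
ys = [0.1 * i for i in range(-10, 11)]    # lateral offsets in metres
yaws = [0.2 * i for i in range(-10, 11)]  # yaw offsets in degrees

n_joint = len(joint_candidates(xs, ys, yaws))
n_decoupled = sum(len(v) for v in decoupled_candidates(xs, ys, yaws).values())
print(n_joint, n_decoupled)  # 9261 63
```

With these assumed ranges the joint grid needs 21 × 21 × 21 = 9261 samples, while the decoupled search needs 21 + 21 + 21 = 63. The counts only show why decoupling shrinks the space; they are not the source of the 68.8% memory figure reported by the authors.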

🔎 Similar Papers

U-BEV: Height-aware Bird’s-Eye-View Segmentation and Neural Map-based Relocalization