Efficient End-to-end Visual Localization for Autonomous Driving with Decoupled BEV Neural Matching

📅 2025-03-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
High-precision visual localization remains a critical challenge for high-level autonomous driving; conventional map-matching approaches are sensitive to perception noise and rely heavily on manual parameter tuning. This paper proposes an end-to-end neural localization framework that directly estimates the vehicle pose from surround-view images, eliminating explicit matching between perception results and the HD map. Its core innovation is a decoupled Bird's-Eye-View (BEV) neural matching mechanism: by separately modeling the influence of each pose degree of freedom (DoF) on the feature space, it drastically reduces the dimensionality of differentiable sampling while preserving interpretability, efficiency, and robustness. Experiments on public benchmarks achieve decimeter-level accuracy (0.19 m longitudinal, 0.13 m lateral, and 0.39° heading error) with 68.8% lower inference memory consumption, enabling lightweight, vision-only deployment.

📝 Abstract
Accurate localization plays an important role in high-level autonomous driving systems. Conventional map-matching-based localization methods solve for poses by explicitly matching map elements with sensor observations; they are generally sensitive to perception noise and therefore require costly hyper-parameter tuning. In this paper, we propose an end-to-end localization neural network that directly estimates vehicle poses from surrounding images, without explicitly matching perception results with HD maps. To ensure efficiency and interpretability, a decoupled BEV neural matching-based pose solver is proposed, which estimates poses in a differentiable sampling-based matching module. Moreover, the sampling space is greatly reduced by decoupling the feature representation affected by each DoF of the pose. The experimental results demonstrate that the proposed network performs decimeter-level localization with mean absolute errors of 0.19 m, 0.13 m, and 0.39° in longitudinal position, lateral position, and yaw angle, while exhibiting a 68.8% reduction in inference memory usage.
Problem

Research questions and friction points this paper is trying to address.

Accurate vehicle pose estimation for autonomous driving
Reducing sensitivity to perception noise in localization
Efficient and interpretable end-to-end neural network solution
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end neural network for vehicle pose estimation
Decoupled BEV neural matching for efficient localization
Reduced sampling space by decoupling feature representation
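The sampling-space reduction claimed above can be sketched numerically. The following is a minimal, hypothetical illustration, not the paper's implementation: it assumes each DoF (longitudinal, lateral, yaw) can be matched independently by correlating a 1-D feature profile against shifted copies and taking a differentiable soft-argmax over the scores. Decoupling then makes the candidate count grow additively (N per DoF) instead of multiplicatively (N³ for a joint pose grid). All names (`soft_argmax`, `match_one_dof`) and the random feature profiles are illustrative assumptions.

```python
import numpy as np

def soft_argmax(scores, candidates, temperature=1.0):
    """Differentiable pose estimate: softmax-weighted expectation over candidates."""
    w = np.exp(scores / temperature)
    w /= w.sum()
    return float((w * candidates).sum())

# Hypothetical 1-D BEV feature profiles (e.g., features pooled along one axis),
# one per decoupled DoF: longitudinal (x), lateral (y), and yaw.
rng = np.random.default_rng(0)
profiles = {dof: rng.standard_normal(64) for dof in ("x", "y", "yaw")}

def match_one_dof(profile, offsets):
    """Correlate the observed profile against copies shifted by each candidate
    offset (standing in for map features), then take a soft-argmax."""
    scores = np.array([np.dot(profile, np.roll(profile, int(o)))
                       for o in offsets])
    return soft_argmax(scores, offsets)

offsets = np.arange(-5, 6)  # 11 candidates per DoF
pose = {dof: match_one_dof(p, offsets) for dof, p in profiles.items()}

# Decoupled matching evaluates 3 * 11 = 33 candidates; a joint grid over the
# same three DoFs would need 11**3 = 1331.
print(pose, 3 * len(offsets), len(offsets) ** 3)
```

Because each profile correlates most strongly with its unshifted self, the soft-argmax for every DoF lands near the true offset of zero, while the total number of evaluated candidates stays linear in the per-DoF resolution.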
👥 Authors

Jinyu Miao
School of Vehicle and Mobility, Tsinghua University, Beijing, China

Tuopu Wen
School of Vehicle and Mobility, Tsinghua University, Beijing, China

Ziang Luo
Tsinghua University

Kangan Qian
School of Vehicle and Mobility, Tsinghua University, Beijing, China

Zheng Fu
Tsinghua University

Yunlong Wang
Autonomous Driving Division of NIO Inc., Beijing, China

Kun Jiang
Tsinghua University

Mengmeng Yang
School of Vehicle and Mobility, Tsinghua University, Beijing, China

Jin Huang
School of Vehicle and Mobility, Tsinghua University, Beijing, China

Zhihua Zhong
Chinese Academy of Engineering, Beijing, China

Diange Yang
School of Vehicle and Mobility, Tsinghua University, Beijing, China