MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training

📅 2025-11-24
🤖 AI Summary
Online high-definition (HD) map construction typically depends on costly 3D map annotations, which limits generalization and scalability across driving environments. This paper proposes MapRF, a weakly supervised, NeRF-guided self-training framework that learns to construct 3D maps from 2D image labels alone. A Neural Radiance Fields (NeRF) module conditioned on the map predictions reconstructs view-consistent 3D geometry and semantics, which serve as pseudo-labels for iteratively refining the map network. To mitigate error accumulation during self-training, a Map-to-Ray Matching strategy aligns map predictions with camera rays derived from the 2D labels. On Argoverse 2 and nuScenes, MapRF reaches roughly 75% of the fully supervised baseline's performance and surpasses several approaches that also use only 2D labels.

📝 Abstract
Autonomous driving systems benefit from high-definition (HD) maps that provide critical information about road infrastructure. The online construction of HD maps offers a scalable approach to generate local maps from on-board sensors. However, existing methods typically rely on costly 3D map annotations for training, which limits their generalization and scalability across diverse driving environments. In this work, we propose MapRF, a weakly supervised framework that learns to construct 3D maps using only 2D image labels. To generate high-quality pseudo labels, we introduce a novel Neural Radiance Fields (NeRF) module conditioned on map predictions, which reconstructs view-consistent 3D geometry and semantics. These pseudo labels are then iteratively used to refine the map network in a self-training manner, enabling progressive improvement without additional supervision. Furthermore, to mitigate error accumulation during self-training, we propose a Map-to-Ray Matching strategy that aligns map predictions with camera rays derived from 2D labels. Extensive experiments on the Argoverse 2 and nuScenes datasets demonstrate that MapRF achieves performance comparable to fully supervised methods, attaining around 75% of the baseline while surpassing several approaches using only 2D labels. This highlights the potential of MapRF to enable scalable and cost-effective online HD map construction for autonomous driving.
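The abstract says Map-to-Ray Matching aligns map predictions with camera rays derived from 2D labels, but gives no formula. As a rough illustration of the geometry such an alignment could rest on, the sketch below back-projects a labeled pixel into a world-frame ray and measures how far a predicted 3D map point lies from it. It assumes a simple pinhole camera; the function names, parameters, and the use of point-to-ray distance as the alignment measure are illustrative assumptions, not the paper's actual formulation.

```python
import math

def pixel_to_ray(u, v, fx, fy, cx, cy, R, c):
    """Back-project pixel (u, v) into a world-frame ray (origin, unit direction).

    Assumes a pinhole camera with focal lengths fx, fy and principal point
    (cx, cy); R is a 3x3 camera-to-world rotation, c the camera center.
    All names here are hypothetical, not taken from the paper.
    """
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    d_world = [sum(R[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(x * x for x in d_world))
    return tuple(c), tuple(x / norm for x in d_world)

def point_to_ray_distance(p, origin, direction):
    """Shortest distance from 3D point p to the ray; direction must be unit length."""
    diff = [p[i] - origin[i] for i in range(3)]
    # Clamp to t >= 0 so points behind the camera measure to the ray origin.
    t = max(0.0, sum(diff[i] * direction[i] for i in range(3)))
    closest = [origin[i] + t * direction[i] for i in range(3)]
    return math.sqrt(sum((p[i] - closest[i]) ** 2 for i in range(3)))

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
origin, direction = pixel_to_ray(500, 500, 1000, 1000, 500, 500, IDENTITY, (0, 0, 0))
distance = point_to_ray_distance((1, 0, 5), origin, direction)
```

A distance of this kind could act as a matching cost between predicted map points and labeled rays: a prediction that lies on the ray is consistent with the 2D label at some depth, which is the one quantity the 2D label cannot fix.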
Problem

Research questions and friction points this paper is trying to address.

Online HD map construction requires costly 3D annotations, which limits scalability
Learning 3D map construction from weak supervision, using only 2D image labels
Mitigating error accumulation when self-training refines map predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

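Weakly supervised framework that learns 3D maps using only 2D image labels
NeRF module conditioned on map predictions reconstructs view-consistent 3D geometry and semantics as pseudo-labels
Map-to-Ray Matching aligns map predictions with camera rays to limit error accumulation during self-training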
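The self-training cycle described above (predict, refine into pseudo-labels, retrain, repeat) can be sketched generically. This is a minimal skeleton under stated assumptions: `predict`, `refine`, and `fit` are hypothetical callables standing in for the map network, the NeRF-based pseudo-label generator, and the training step, and the toy scalar problem below only illustrates the control flow, not the paper's implementation.

```python
def self_train(params, predict, refine, fit, rounds):
    """Generic self-training loop; every callable is a hypothetical placeholder.

    predict(params)       -> current predictions
    refine(predictions)   -> pseudo-labels (in MapRF, the role of the NeRF module)
    fit(params, labels)   -> updated parameters
    """
    history = []
    for _ in range(rounds):
        predictions = predict(params)
        pseudo_labels = refine(predictions)
        params = fit(params, pseudo_labels)
        history.append(params)
    return params, history

# Toy stand-ins: the "model" is a single number, and refinement pulls each
# prediction halfway toward a value of 10.0 that plays the role of 2D evidence.
def toy_predict(params):
    return params

def toy_refine(prediction):
    return prediction + 0.5 * (10.0 - prediction)

def toy_fit(params, pseudo_label):
    return pseudo_label

final_params, history = self_train(0.0, toy_predict, toy_refine, toy_fit, rounds=3)
```

In the toy run each round moves the estimate closer to the value supported by the evidence. If `refine` introduced a bias instead, the same loop would compound it, which is the error-accumulation risk that Map-to-Ray Matching is meant to limit.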