🤖 AI Summary
This work addresses urban-scale lane-level map construction, which traditionally relies heavily on manual effort. Existing end-to-end methods cannot explicitly adhere to cartographic standards and traffic regulations, and they often produce rule-violating outputs in complex scenarios. To overcome this, we propose MapAgent, the first industrial-grade framework that integrates explicit rule validation with an agent-based architecture. MapAgent employs a closed-loop Judge-Planner-Worker system that is triggered selectively in low-confidence regions, where it performs joint vision-language diagnosis, constraint-aware reasoning, and deterministic editing. The approach maintains high throughput while substantially reducing the need for human post-editing, and it outperforms strong baselines on real-world data, particularly in complex and long-tail cases. Deployed in Baidu Maps, the system now enables automated generation of lane-level maps across more than 360 cities in China, with an automation rate above 95%.
📝 Abstract
Lane-level maps are critical infrastructure for autonomous driving and lane-level navigation, yet constructing and maintaining standardized lane networks for hundreds of cities remains highly labor-intensive. Recent end-to-end vectorized mapping methods can predict lane geometry and topology directly from sensor data, but they typically treat mapping specifications and traffic regulations as implicit, dataset-dependent supervision. Moreover, in complex scenes (e.g., worn or missing markings and occlusions), correct lane configurations are often under-determined by visual evidence alone, making specification violations a major source of human post-editing. We propose MapAgent, an industrial-grade agentic architecture that augments a vectorization backbone for specification-compliant lane-map production. Rather than merely adding an agent loop to map prediction, MapAgent couples backbone perception with explicit specification verification, constraint-aware reasoning, and deterministic map editing under a bounded, verification-driven Judge-Planner-Worker loop. A vision-language Judge diagnoses errors by jointly inspecting visual evidence and draft vectors, while a tool-calling Planner generates minimal corrective edits with post-edit re-validation. To remain scalable for city-scale production, MapAgent is selectively triggered only on tiles with low backbone confidence, adding modest overhead while preserving throughput. Experiments on real-world datasets show consistent gains over strong production baselines, especially in complex and long-tail scenarios. MapAgent has also been integrated into Baidu Maps, where it supports lane-level map generation for over 360 cities nationwide and raises the overall production automation rate to over 95%, demonstrating its practicality and effectiveness for large-scale lane-level map generation.
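
The control flow described in the abstract (selective triggering on low-confidence tiles, then a bounded loop of diagnosis, planning, deterministic editing, and re-validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name here (`Tile`, `Lane`, `Edit`, `judge`, `planner`, `worker`, `refine`), the single lane-width rule, and the values of `MIN_LANE_WIDTH_M`, `CONF_THRESHOLD`, and `MAX_ROUNDS` are assumptions made for illustration. In particular, the Judge in the paper is a vision-language model that inspects imagery together with draft vectors, whereas the stand-in below only checks one toy rule.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

# All constants below are hypothetical; the paper does not state these values.
MIN_LANE_WIDTH_M = 2.5  # toy "specification": a lane may not be narrower than this
CONF_THRESHOLD = 0.8    # tiles at or above this backbone confidence skip the loop
MAX_ROUNDS = 3          # upper bound on Judge-Planner-Worker iterations


@dataclass(frozen=True)
class Lane:
    lane_id: str
    width_m: float


@dataclass(frozen=True)
class Tile:
    tile_id: str
    confidence: float        # backbone confidence in [0, 1]
    lanes: Tuple[Lane, ...]  # draft lane vectors, reduced to a toy form


@dataclass(frozen=True)
class Edit:
    lane_id: str
    new_width_m: float


def judge(tile: Tile) -> List[str]:
    """Toy stand-in for the Judge: return ids of lanes that violate the rule."""
    return [l.lane_id for l in tile.lanes if l.width_m < MIN_LANE_WIDTH_M]


def planner(violations: List[str]) -> List[Edit]:
    """Toy stand-in for the Planner: one minimal corrective edit per violation."""
    return [Edit(lane_id=v, new_width_m=MIN_LANE_WIDTH_M) for v in violations]


def worker(tile: Tile, edits: List[Edit]) -> Tile:
    """Toy stand-in for the Worker: apply edits deterministically."""
    widths = {e.lane_id: e.new_width_m for e in edits}
    lanes = tuple(
        replace(l, width_m=widths.get(l.lane_id, l.width_m)) for l in tile.lanes
    )
    return replace(tile, lanes=lanes)


def refine(tile: Tile) -> Tuple[Tile, int]:
    """Run the bounded loop on low-confidence tiles; return (tile, rounds used)."""
    if tile.confidence >= CONF_THRESHOLD:
        return tile, 0  # selective triggering: confident tiles pass through
    rounds = 0
    while rounds < MAX_ROUNDS:
        violations = judge(tile)  # also serves as post-edit re-validation
        if not violations:
            break
        tile = worker(tile, planner(violations))
        rounds += 1
    return tile, rounds
```

In this sketch a tile with confidence 0.95 is returned unchanged without invoking the loop, while a tile with confidence 0.4 and one lane narrower than the threshold is corrected in a single round and then passes re-validation. Bounding the loop with `MAX_ROUNDS` keeps the per-tile cost predictable, which matches the abstract's emphasis on modest overhead.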