🤖 AI Summary
Urban databases require continuous updates of semantic and geospatial information for infrastructure elements (e.g., traffic signs, trees, graffiti, road damage), yet conventional manual data collection is costly and inefficient. To address this, we propose a single-image geolocalization method for urban assets and events that relies on monocular street-view imagery alone. Our approach combines a metric depth estimation model, calibrated camera intrinsic and extrinsic parameters, geometric projection, and LiDAR point cloud–aided distance calibration to directly infer 3D object positions and project them into a geographic coordinate system. This work presents the first single-image-driven, city-scale semantic object georeferencing framework, requiring neither multi-view geometry nor image sequences. Experiments demonstrate bounded localization errors for traffic signs and road damage, alongside strong generalization across road and vegetation regions. The method substantially reduces the manual effort and time required to maintain urban databases.
📝 Abstract
To maintain an overview of urban conditions, city administrations manage databases of objects like traffic signs and trees, complete with their geocoordinates. Incidents such as graffiti or road damage are also relevant. As digitization increases, so does the need for more data and up-to-date databases, requiring significant manual effort. This paper introduces MapAnything, a module that automatically determines the geocoordinates of objects using individual images. Utilizing advanced Metric Depth Estimation models, MapAnything calculates geocoordinates based on the object's distance from the camera, geometric principles, and camera specifications. We detail and validate the module, providing recommendations for automating urban object and incident mapping. Our evaluation measures the accuracy of estimated distances against LiDAR point clouds in urban environments, analyzing performance across distance intervals and semantic areas like roads and vegetation. The module's effectiveness is demonstrated through practical use cases involving traffic signs and road damage.
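The core computation described above, turning a detected object's pixel location and estimated camera distance into geocoordinates, can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's actual implementation: the function name, parameter layout, the pinhole camera model, and the flat-earth small-offset approximation for converting metric offsets to latitude/longitude are all assumptions.

```python
import math

def pixel_to_geocoord(u, v, depth_m, fx, fy, cx, cy,
                      cam_lat, cam_lon, heading_deg):
    """Hypothetical sketch: map a pixel with a metric depth estimate
    to latitude/longitude, given camera intrinsics (fx, fy, cx, cy),
    the camera's geoposition, and its compass heading.
    """
    # Back-project the pixel into the camera frame (pinhole model):
    # x points right, y down, z forward along the optical axis.
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy  # vertical offset; unused for 2D mapping
    z = depth_m

    # Rotate the horizontal offset (x, z) by the camera heading,
    # measured clockwise from north, into east/north components.
    h = math.radians(heading_deg)
    east = x * math.cos(h) + z * math.sin(h)
    north = -x * math.sin(h) + z * math.cos(h)

    # Convert metric east/north offsets to degrees using a local
    # flat-earth approximation (adequate for street-scale distances).
    R = 6378137.0  # WGS84 equatorial radius in metres
    lat = cam_lat + math.degrees(north / R)
    lon = cam_lon + math.degrees(east / (R * math.cos(math.radians(cam_lat))))
    return lat, lon
```

For example, an object at the image centre, 100 m ahead of a north-facing camera, would be placed roughly 100 m due north of the camera's geoposition. A production pipeline would more likely use a geodesic library and the full extrinsic rotation (pitch and roll, not just heading).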