AI Summary
This work proposes Bearing-UAV, a purely vision-based cross-view navigation method that jointly predicts the absolute position and heading of a drone, explicitly modeling spatial relationships to overcome the limitations of conventional map-tile matching approaches. Existing methods face a trade-off between localization accuracy and storage overhead and neglect heading information, which makes them ineffective under the large viewpoint discrepancies and misalignment common in real-world scenarios. In contrast, Bearing-UAV integrates global and local structural features and leverages relative spatial encoding for end-to-end localization without requiring pre-stored map tiles. Evaluated on Bearing-UAV-90k, a newly curated multi-city benchmark with diverse terrains, the method achieves lower localization error than current matching- or retrieval-based approaches, enabling lightweight and robust navigation in GNSS-denied environments.
Abstract
Recent advances in cross-view geo-localization (CVGL) methods have shown strong potential for supporting unmanned aerial vehicle (UAV) navigation in GNSS-denied environments. However, existing work predominantly focuses on matching UAV views to onboard map tiles, which introduces an inherent trade-off between accuracy and storage overhead and overlooks the importance of the UAV's heading during navigation. Moreover, the substantial discrepancies and varying overlaps in cross-view scenarios have received insufficient attention, which limits how well these methods generalize to real-world conditions. In this paper, we present Bearing-UAV, a purely vision-driven cross-view navigation method that jointly predicts UAV absolute location and heading from neighboring features, enabling accurate, lightweight, and robust navigation in the wild. Our method leverages global and local structural features and explicitly encodes relative spatial relationships, making it robust to cross-view variations, misalignment, and feature-sparse conditions. We also present Bearing-UAV-90k, a multi-city benchmark for evaluating cross-view localization and navigation. Extensive experiments show that Bearing-UAV yields lower localization error than previous matching- and retrieval-based paradigms across diverse terrains. Our code and dataset will be made publicly available.
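
To make the joint location-and-heading objective concrete, the sketch below shows one common way such predictions could be scored and how a heading could be encoded so that 359° and 1° are treated as close. It is an illustration under stated assumptions, not the paper's evaluation protocol: the abstract does not specify the error metrics or the output parameterization, and all function names here (`haversine_m`, `heading_error_deg`, `encode_heading`, `decode_heading`) are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def heading_error_deg(pred_deg, true_deg):
    """Smallest absolute angle between two headings, in [0, 180] degrees."""
    # Shift into [-180, 180) so that wrap-around at 0/360 is handled.
    diff = (pred_deg - true_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def encode_heading(deg):
    """Assumed regression target: heading as a (sin, cos) pair, continuous at 0/360."""
    r = math.radians(deg)
    return math.sin(r), math.cos(r)


def decode_heading(s, c):
    """Inverse of encode_heading, returning degrees in [0, 360)."""
    return math.degrees(math.atan2(s, c)) % 360.0


if __name__ == "__main__":
    # Hypothetical prediction versus ground truth for a single UAV frame.
    pred = {"lat": 30.0001, "lon": 120.0002, "heading": 350.0}
    true = {"lat": 30.0000, "lon": 120.0000, "heading": 10.0}
    loc_err = haversine_m(pred["lat"], pred["lon"], true["lat"], true["lon"])
    hdg_err = heading_error_deg(pred["heading"], true["heading"])
    print(f"localization error: {loc_err:.1f} m, heading error: {hdg_err:.1f} deg")
```

The (sin, cos) encoding is a standard choice for angular targets because a raw degree value would penalize a prediction of 359° against a truth of 1° as if it were almost a full turn off. Whether Bearing-UAV uses this encoding is not stated in the abstract.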