🤖 AI Summary
To address fine-grained UAV self-localization in GPS-denied urban environments, this paper proposes CEUSP, a cross-view image retrieval method. The approach combines three components: (1) a Dynamic Sampling Strategy (DSS) that selects negative samples during training to make discriminative learning more efficient; (2) a Rubik’s Cube Attention (RCA) module, inspired by Rubik’s cube rotations, that models interactions across feature dimensions; and (3) a Context-Aware Channel Integration (CACI) module that works with RCA to strengthen feature representation and discrimination. The method achieves state-of-the-art performance on the DenseUAV dataset, which targets dense urban scenes, and delivers competitive results on the University-1652 benchmark. The authors report notable improvements in self-positioning accuracy in complex urban environments.
📝 Abstract
Image retrieval has been employed as a robust complementary technique for Unmanned Aerial Vehicle (UAV) self-positioning. However, most existing methods focus on localizing objects captured by UAVs through complex part-based representations, often overlooking the challenges specific to UAV self-positioning, such as fine-grained spatial discrimination requirements and dynamic scene variations. To address these issues, we propose the Context-Enhanced method for precise UAV Self-Positioning (CEUSP). CEUSP integrates a Dynamic Sampling Strategy (DSS) to efficiently select optimal negative samples. It also combines the Rubik's Cube Attention (RCA) module, inspired by the rotational mechanics of a Rubik's Cube, with the Context-Aware Channel Integration (CACI) module to enhance feature representation and discrimination by exploiting interdimensional interactions. Extensive experiments validate the effectiveness of the proposed method, demonstrating notable improvements in feature representation and UAV self-positioning accuracy within complex urban environments. Our approach achieves state-of-the-art performance on the DenseUAV dataset, which is specifically designed for dense urban contexts, and also delivers competitive results on the widely recognized University-1652 benchmark.
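The abstract does not say how DSS chooses its negative samples, so the following is only a generic illustration of hard-negative selection in retrieval training, not the paper's strategy. It assumes each gallery image is already embedded as a vector and picks the non-matching locations most similar to the query. The function names, the `site_*` labels, and the toy embeddings are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hardest_negatives(query, query_label, gallery, k=2):
    """Return the labels of the k gallery items most similar to the query
    among those whose label differs from the query's label.
    Generic hard-negative mining; an assumption, not the paper's DSS."""
    scored = [
        (cosine(query, emb), label)
        for label, emb in gallery
        if label != query_label  # skip the true match (the positive)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]

# Toy gallery of (location label, embedding) pairs; values are made up.
gallery = [
    ("site_A", [1.0, 0.0]),
    ("site_B", [0.9, 0.1]),
    ("site_C", [0.0, 1.0]),
    ("site_D", [0.7, 0.7]),
]

print(hardest_negatives([1.0, 0.0], "site_A", gallery, k=2))
# → ['site_B', 'site_D']
```

Negatives that sit close to the query in embedding space are the ones a fine-grained localizer is most likely to confuse with the true location, which is why retrieval training commonly favors them over random negatives.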