🤖 AI Summary
This work addresses the challenge of simultaneously estimating drone pose and geolocating targets in GNSS-denied environments. It proposes an end-to-end neural pixel-to-3D registration framework that directly aligns live video streams with georeferenced 3D maps, removing the reliance on costly active sensors and loosely coupled pipelines. The core innovations are a dual-thread rendering–localization architecture that keeps latency low while maintaining drift-free accuracy, zero-shot transfer from simulation to real data enabled by a large-scale synthetic dataset with geometric annotations, and a Joint Neural-Guided Stochastic-Gradient Optimizer (JNGO) that converges robustly under aggressive motion. Evaluated on both public and newly collected benchmarks, the method outperforms state-of-the-art approaches while running at over 25 FPS on an NVIDIA Jetson Orin platform, demonstrating strong generalization and practical deployability.
📝 Abstract
We present PiLoT, a unified framework that tackles UAV-based ego and target geo-localization. Conventional approaches rely on decoupled pipelines that fuse GNSS and Visual-Inertial Odometry (VIO) for ego-pose estimation and use active sensors such as laser rangefinders for target localization. However, these methods are susceptible to failure in GNSS-denied environments and incur substantial hardware cost and complexity. PiLoT breaks this paradigm by directly registering a live video stream against a geo-referenced 3D map. To achieve robust, accurate, and real-time performance, we introduce three key contributions: 1) a Dual-Thread Engine that decouples map rendering from the core localization thread, ensuring low latency while maintaining drift-free accuracy; 2) a large-scale synthetic dataset with precise geometric annotations (camera pose, depth maps), which enables the training of a lightweight network that generalizes in a zero-shot manner from simulation to real data; and 3) a Joint Neural-Guided Stochastic-Gradient Optimizer (JNGO) that achieves robust convergence even under aggressive motion. Evaluations on a comprehensive set of public and newly collected benchmarks show that PiLoT outperforms state-of-the-art methods while running at over 25 FPS on an NVIDIA Jetson Orin platform. Our code and dataset are available at: https://github.com/Choyaa/PiLoT.
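To illustrate how a camera pose and a depth map together can yield a target location, the sketch below back-projects one pixel into a world frame using the standard pinhole model. It is not taken from the PiLoT code: the abstract does not state how target coordinates are computed, and the names `Pinhole` and `pixel_to_world`, the camera-axis convention, and the local metric world frame are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Pinhole:
    """Hypothetical pinhole intrinsics, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def pixel_to_world(
    u: float,
    v: float,
    depth: float,
    cam: Pinhole,
    R_wc: Sequence[Sequence[float]],
    t_wc: Sequence[float],
) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with known depth into the world frame.

    Assumptions (not from the source): the camera frame has x right,
    y down, z forward; `depth` is measured along the camera z axis;
    R_wc (3x3) and t_wc (3,) map camera coordinates to a local metric
    world frame, not to latitude/longitude.
    """
    # Point in the camera frame from the pinhole model.
    p_cam = (
        (u - cam.cx) / cam.fx * depth,
        (v - cam.cy) / cam.fy * depth,
        depth,
    )
    # Rigid transform into the world frame: p_world = R_wc @ p_cam + t_wc.
    return tuple(
        sum(R_wc[i][j] * p_cam[j] for j in range(3)) + t_wc[i]
        for i in range(3)
    )


if __name__ == "__main__":
    cam = Pinhole(fx=100.0, fy=100.0, cx=320.0, cy=240.0)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(pixel_to_world(420.0, 240.0, 50.0, cam, identity, [10.0, 20.0, 30.0]))
```

In a registration pipeline of this kind, the pose would come from aligning the live frame to the map and the depth from the map rendered at that pose, so the accuracy of the recovered point depends directly on both.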