🤖 AI Summary
Existing autonomous drones typically rely on external positioning systems (e.g., GPS, motion capture) or structured environments, which limits high-performance racing and real-world deployment in uninstrumented, complex outdoor settings. This paper introduces an end-to-end vision-based autonomous navigation framework that integrates high-accuracy visual-inertial SLAM, deep reinforcement learning, and real-time trajectory planning, enabling robust, high-speed flight in GPS-denied, motion-capture-free, ground-truth-free field environments. To our knowledge, this is the first demonstration of autonomous drone racing under fully uninstrumented conditions, achieving speed and trajectory accuracy comparable to professional human pilots. We publicly release the first professional-grade real-world drone racing dataset, featuring synchronized multimodal data from both expert human pilots and autonomous systems. This dataset establishes a reproducible benchmark and technical foundation for vision-based autonomy in practical applications such as agriculture, logistics, and security.
📝 Abstract
Drone technology is proliferating in many industries, including agriculture, logistics, defense, infrastructure, and environmental monitoring. Vision-based autonomy is one of its key enablers, particularly for real-world applications, and is essential for operating in novel, unstructured environments where traditional navigation methods may be unavailable. Autonomous drone racing has become the de facto benchmark for such systems. State-of-the-art research has shown that autonomous systems can surpass human-level performance in racing arenas. However, their direct applicability to commercial and field operations remains limited, as current systems are often trained and evaluated in highly controlled environments. In this contribution, we analyze our system's capabilities in a controlled environment -- where external tracking is available for ground-truth comparison -- and also demonstrate it in a challenging, uninstrumented environment -- where ground-truth measurements were never available. We show that our approach can match the performance of professional human pilots in both scenarios. We also publicly release the data from the flights carried out by our approach and by a world-class human pilot.