A Map-free Deep Learning-based Framework for Gate-to-Gate Monocular Visual Navigation aboard Miniaturized Aerial Vehicles

📅 2025-03-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To meet the agile-navigation demands of sub-50 g palm-sized nano-drones in racing scenarios, this work proposes a map-free, monocular-vision, fully onboard autonomous gate-passing framework that operates without GPS or external localization. Methodologically, it integrates a lightweight deep-learning gate detector (an enhanced YOLO-Nano variant, 24M MACs/frame) with image-Jacobian-based visual servo control, combined with mixed simulator/real-world training and monocular geometric estimation. Evaluated on roughly 20,000 real-world images, the detector achieves a gate-localization RMSE of 1.4 pixels. In physical experiments, the system completes collision-free traversal of 15 consecutive gates over a ~100 m course at a peak speed of 1.9 m/s, sustaining closed-loop operation at 30 Hz for over four minutes. Key contributions include: (i) an ultra-low-power end-to-end visual navigation architecture; (ii) tight coupling of a lightweight learned model with classical control; and (iii) full onboard deployment and validation on a resource-constrained platform.
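
The control back-end is image-Jacobian-based visual servoing. The paper's exact formulation is not reproduced here, but a minimal sketch of the textbook IBVS law for point features (velocity command v = -λ L⁺(s - s*), with L the stacked interaction matrix) conveys the idea; the gain, the four-corner feature set, and the constant depth guess below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Image Jacobian of one point feature at normalized image coords (x, y), depth Z."""
    return np.array([
        [-1.0 / Z, 0.0,      x / Z, x * y,       -(1.0 + x * x),  y],
        [0.0,      -1.0 / Z, y / Z, 1.0 + y * y, -x * y,         -x],
    ])

def ibvs_velocity(features, targets, depths, gain=0.5):
    """Classic IBVS law: v = -gain * pinv(L) @ (s - s*).

    features, targets -- (N, 2) normalized image points (current and desired)
    depths            -- (N,) depth estimates for each feature
    Returns a 6-vector camera twist [vx, vy, vz, wx, wy, wz].
    """
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    error = (np.asarray(features) - np.asarray(targets)).ravel()
    return -gain * np.linalg.pinv(L) @ error

# Drive four detected gate corners toward a centered square in the image;
# the corner coordinates and the 2 m depth guess are made up for illustration.
corners = [(-0.10, -0.08), (0.12, -0.07), (0.11, 0.10), (-0.09, 0.11)]
goal    = [(-0.10, -0.10), (0.10, -0.10), (0.10, 0.10), (-0.10, 0.10)]
twist = ibvs_velocity(corners, goal, depths=[2.0] * 4, gain=0.5)
print(twist)
```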

📝 Abstract
Palm-sized autonomous nano-drones, i.e., sub-50g in weight, recently entered the drone racing scenario, where they are tasked to avoid obstacles and navigate as fast as possible through gates. However, in contrast with their bigger counterparts, i.e., kg-scale drones, nano-drones expose three orders of magnitude less onboard memory and compute power, demanding more efficient and lightweight vision-based pipelines to win the race. This work presents a map-free vision-based (using only a monocular camera) autonomous nano-drone that combines a real-time deep learning gate detection front-end with a classic yet elegant and effective visual servoing control back-end, only relying on onboard resources. Starting from two state-of-the-art tiny deep learning models, we adapt them for our specific task, and after a mixed simulator-real-world training, we integrate and deploy them aboard our nano-drone. Our best-performing pipeline costs only 24M multiply-accumulate operations per frame, resulting in a closed-loop control performance of 30 Hz, while achieving a gate detection root mean square error of 1.4 pixels, on our ~20k real-world image dataset. In-field experiments highlight the capability of our nano-drone to successfully navigate through 15 gates in 4 min, never crashing and covering a total travel distance of ~100m, with a peak flight speed of 1.9 m/s. Finally, to stress the generalization capability of our system, we also test it in a never-seen-before environment, where it navigates through gates for more than 4 min.
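
With a single camera, range to the gate must be inferred geometrically. The abstract does not spell out the estimator, but a common monocular approach, sketched below under the assumption that the gate's physical width is known and the detector reports a bounding-box width in pixels, is the pinhole relation Z ≈ f·W/w:

```python
def gate_distance_m(focal_px: float, gate_width_m: float, bbox_width_px: float) -> float:
    """Pinhole-model range estimate: Z ≈ f * W / w.

    focal_px      -- camera focal length expressed in pixels
    gate_width_m  -- known physical gate width (illustrative assumption)
    bbox_width_px -- gate width in the image, from the detector's bounding box
    """
    return focal_px * gate_width_m / bbox_width_px

# e.g., a 1.0 m gate spanning 180 px with f = 300 px sits ~1.67 m ahead
print(round(gate_distance_m(300.0, 1.0, 180.0), 2))
```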
Problem

Research questions and friction points this paper is trying to address.

Sub-50 g nano-drones expose three orders of magnitude less onboard memory and compute than kg-scale racing drones.
Drone racing demands fast, collision-free gate traversal without GPS, maps, or external localization.
Existing vision-based navigation pipelines are too heavy to run in real time on such resource-constrained hardware.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Map-free monocular visual navigation system
Real-time deep learning gate detection
Onboard visual servoing control integration