Zero-Shot Sim-to-Real Visual Quadrotor Control with Hard Constraints

📅 2025-03-04
🤖 AI Summary
This work addresses zero-shot visual navigation for quadrotors: transferring a control policy from simulation to real flight without fine-tuning on real data. We propose a NeRF-driven sim-to-real control framework with three parts: (1) FalconGym, a photorealistic NeRF-based simulator that supplies effectively unlimited synthetic training images; (2) a Neural Pose Estimator (NPE) coupled with a Kalman filter, which infers quadrotor pose from single-frame RGB images and IMU data; and (3) a self-attention-based multi-modal controller that adaptively combines visual features with the pose estimate, compensating for perception noise and intermittent gate visibility. In real-world evaluation, the system achieves a 95.8% success rate over 30 flights spanning three tracks and 120 gates, with an average error of 10 cm when flying through gates of 38 cm radius. In simulation, it outperforms a state-of-the-art vision-only baseline on three tracks (circle, U-turn, and figure-8) in both success rate and gate-crossing accuracy. To our knowledge, this is the first demonstration of zero-shot sim-to-real transfer of visual gate-traversal policies learned in a NeRF environment.

📝 Abstract
We present the first framework demonstrating zero-shot sim-to-real transfer of visual control policies learned in a Neural Radiance Field (NeRF) environment for quadrotors to fly through racing gates. Robust transfer from simulation to real flight poses a major challenge, as standard simulators often lack sufficient visual fidelity. To address this, we construct a photorealistic simulation environment of quadrotor racing tracks, called FalconGym, which provides effectively unlimited synthetic images for training. Within FalconGym, we develop a pipelined approach for crossing gates that combines (i) a Neural Pose Estimator (NPE) coupled with a Kalman filter to reliably infer quadrotor poses from single-frame RGB images and IMU data, and (ii) a self-attention-based multi-modal controller that adaptively integrates visual features and pose estimation. This multi-modal design compensates for perception noise and intermittent gate visibility. We train this controller purely in FalconGym with imitation learning and deploy the resulting policy to real hardware with no additional fine-tuning. Simulation experiments on three distinct tracks (circle, U-turn and figure-8) demonstrate that our controller outperforms a vision-only state-of-the-art baseline in both success rate and gate-crossing accuracy. In 30 live hardware flights spanning three tracks and 120 gates, our controller achieves a 95.8% success rate and an average error of just 10 cm when flying through 38 cm-radius gates.
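The abstract pairs the Neural Pose Estimator with a Kalman filter but does not give the filter's state or noise model. As a rough illustration only, the sketch below assumes a per-axis position-velocity state, IMU acceleration as the control input in the predict step, and the NPE position as the measurement in the update step. The class name `AxisKalman`, the helper `track`, and both noise variances are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Position-velocity Kalman filter for one axis (illustrative, not the paper's filter)."""
    pos: float = 0.0
    vel: float = 0.0
    # Entries of the symmetric 2x2 covariance P = [[p00, p01], [p01, p11]].
    p00: float = 1.0
    p01: float = 0.0
    p11: float = 1.0
    accel_var: float = 0.5   # assumed IMU acceleration noise variance
    meas_var: float = 0.01   # assumed NPE position noise variance (m^2)

    def predict(self, accel: float, dt: float) -> None:
        # Propagate the state using the IMU acceleration as control input.
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        # P <- F P F^T with F = [[1, dt], [0, 1]].
        p00 = self.p00 + 2 * dt * self.p01 + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p11 = self.p11
        # Add process noise Q = accel_var * G G^T with G = [dt^2 / 2, dt].
        g0, g1 = 0.5 * dt * dt, dt
        self.p00 = p00 + g0 * g0 * self.accel_var
        self.p01 = p01 + g0 * g1 * self.accel_var
        self.p11 = p11 + g1 * g1 * self.accel_var

    def update(self, measured_pos: float) -> None:
        # Measurement model H = [1, 0]: the NPE observes position only.
        s = self.p00 + self.meas_var
        k0, k1 = self.p00 / s, self.p01 / s
        residual = measured_pos - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P <- (I - K H) P, written out entry by entry.
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


def track(samples, dt=0.02):
    """Run the filter over (imu_accel, npe_position_or_None) pairs."""
    kf = AxisKalman()
    for accel, npe_pos in samples:
        kf.predict(accel, dt)
        if npe_pos is not None:  # assumed: some frames yield no usable NPE output
            kf.update(npe_pos)
    return kf
```

When a frame has no NPE measurement, the filter only predicts, so the position variance grows until the next update. This is one plausible reading of how filtering could smooth single-frame pose estimates.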
Problem

Research questions and friction points this paper is trying to address.

Zero-shot sim-to-real transfer for quadrotor visual control
Overcoming visual fidelity limitations in standard simulators
Achieving robust flight through gates using multi-modal control
Innovation

Methods, ideas, or system contributions that make the work stand out.

NeRF-based photorealistic simulation for training
Neural Pose Estimator with Kalman filter
Self-attention multi-modal controller integration
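The source does not describe the controller's architecture beyond "self-attention-based". As a minimal sketch of the general mechanism, the code below applies scaled dot-product self-attention to two modality tokens, one standing in for a visual feature embedding and one for a pose embedding. It assumes identity query, key, and value projections for brevity, whereas a trained layer would use learned projection matrices. The function names and token values are hypothetical.

```python
import math


def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in tokens:
        # Attention weights of this query over every token, itself included.
        w = softmax([dot(q, k) / scale for k in tokens])
        # Output is the weighted average of the value vectors.
        out = [sum(wi * v[j] for wi, v in zip(w, tokens)) for j in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical 4-dimensional embeddings for the two modalities.
visual_token = [0.9, 0.1, 0.0, 0.2]
pose_token = [0.1, 0.8, 0.3, 0.0]
fused, attn = self_attention([visual_token, pose_token])
```

Each row of `attn` sums to 1, so every fused token is a convex combination of the visual and pose tokens. With learned projections, those weights could shift toward whichever modality is more informative for a given frame.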