🤖 AI Summary
Monocular visual pose estimation for unmanned aerial vehicles (UAVs) in maritime environments is difficult to validate in the real world: research vessels are expensive and scarce, and GPS performance degrades near ship structures. Method: This paper introduces the first high-fidelity, vision-in-the-loop closed-loop simulation environment designed specifically for autonomous visual landing of shipboard UAVs. It integrates Gaussian splatting, a recent radiance-field rendering technique that represents a scene as a set of 3D Gaussian primitives, into maritime scene modeling, enabling end-to-end generation of lightweight, high-quality 3D reconstructions from multi-view real-world imagery for training and closed-loop evaluation of depth-aware pose estimation networks. A Transformer-based monocular visual pose estimation algorithm is validated together with the flight control hardware and software inside the simulated environment. Contribution/Results: The framework significantly reduces dependence on costly at-sea trials, shortens development cycles, and improves the reliability and robustness of vision-based autonomous landing algorithms under complex maritime conditions.
📝 Abstract
This paper proposes a vision-in-the-loop simulation environment for deep monocular pose estimation of a UAV operating in an ocean environment. Recently, a deep neural network with a transformer architecture has been successfully trained to estimate the pose of a UAV relative to the flight deck of a research vessel, overcoming several limitations of GPS-based approaches. However, validating the deep pose estimation scheme in an actual ocean environment poses significant challenges due to the limited availability of research vessels and the associated operational costs. To address these issues, we present a photo-realistic 3D virtual environment leveraging recent advancements in Gaussian splatting, a technique that represents a 3D scene as a set of anisotropic 3D Gaussian primitives optimized from multiple viewpoints, yielding a lightweight, high-quality visual model. This approach enables the creation of a virtual environment that integrates multiple real-world images collected in situ. The resulting simulation supports indoor testing of flight maneuvers while verifying all aspects of the flight software, hardware, and the deep monocular pose estimation scheme, providing a cost-effective solution for testing and validating the autonomous flight of shipboard UAVs, with a particular focus on vision-based control and estimation algorithms.
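To make the representation concrete, the core idea behind Gaussian splatting can be sketched in a few lines: each scene element is an anisotropic 3D Gaussian (mean, covariance, color, opacity) that is projected onto the image plane and alpha-composited front to back. The minimal sketch below is illustrative only; the function names, camera model, and numeric values are assumptions, not from the paper, and real renderers (and the paper's environment) also handle per-pixel Gaussian falloff, sorting, and GPU rasterization.

```python
import numpy as np

def project_gaussian(mean, cov, f):
    """Project a 3D Gaussian (camera frame) to a 2D image-plane Gaussian
    via the local Jacobian linearization of a pinhole camera (EWA-style).
    `mean` is (x, y, z) in camera coordinates, `cov` a 3x3 covariance,
    `f` the focal length in pixels. Illustrative sketch only."""
    x, y, z = mean
    mu2d = np.array([f * x / z, f * y / z])          # projected center
    J = np.array([[f / z, 0.0,  -f * x / z**2],      # Jacobian of the
                  [0.0,  f / z, -f * y / z**2]])     # perspective map
    cov2d = J @ cov @ J.T                            # 2x2 image covariance
    return mu2d, cov2d

def composite(splats):
    """Front-to-back alpha compositing of splat samples at one pixel.
    `splats` is a near-to-far list of (rgb_color, alpha) contributions."""
    color = np.zeros(3)
    transmittance = 1.0
    for c, a in splats:
        color += transmittance * a * np.asarray(c, dtype=float)
        transmittance *= (1.0 - a)
    return color, transmittance

# Example: project one small Gaussian, then blend a red splat over a green one.
mu2d, cov2d = project_gaussian(np.array([0.1, 0.0, 2.0]),
                               0.01 * np.eye(3), f=500.0)
pixel, T = composite([(np.array([1.0, 0.0, 0.0]), 0.6),
                      (np.array([0.0, 1.0, 0.0]), 0.5)])
```

Because the primitives are explicit and differentiable, their parameters can be optimized directly against the multi-view photographs collected in situ, which is what makes the resulting ship and deck models both lightweight and photo-realistic enough for closed-loop rendering.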