🤖 AI Summary
Monocular visual landing of shipborne UAVs demands accurate, robust 6D relative pose estimation under severe challenges, including the scarcity of real-world training data and complex, dynamic maritime illumination.
Method: This paper proposes a deep Transformer-based framework for 6D relative pose estimation. To address data scarcity and illumination variability, the method trains on synthetically generated ship images, detects 2D keypoints on multiple ship parts, and fuses the resulting per-part pose estimates with a Bayesian strategy to improve robustness against occlusion and noise. Notably, it is presented as the first work to adapt the Transformer architecture to monocular ship–UAV 6D pose estimation, enabling end-to-end joint optimization of keypoint localization and geometric constraints.
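The summary names Bayesian fusion without spelling it out, and the paper's exact formulation is not given here. As a minimal sketch, assuming each ship part yields an independent Gaussian estimate of the relative position (rotation fusion is analogous but lives on SO(3)), the fused posterior is the inverse-covariance-weighted combination; the helper name `fuse_gaussian_positions` is hypothetical:

```python
import numpy as np

def fuse_gaussian_positions(means, covs):
    """Fuse independent Gaussian position estimates (one per detected ship
    part) into a single posterior via inverse-covariance weighting.

    means : list of (3,) arrays  -- per-part relative position estimates (m)
    covs  : list of (3, 3) arrays -- per-part estimate covariances
    """
    info = np.zeros((3, 3))   # accumulated information matrix (sum of inverse covariances)
    info_vec = np.zeros(3)    # accumulated information vector
    for mu, cov in zip(means, covs):
        w = np.linalg.inv(cov)
        info += w
        info_vec += w @ mu
    fused_cov = np.linalg.inv(info)
    fused_mean = fused_cov @ info_vec
    return fused_mean, fused_cov

# Three parts roughly agree; the noisy third estimate is automatically down-weighted.
means = [np.array([10.1, 2.0, 35.2]),
         np.array([9.9, 2.1, 34.8]),
         np.array([10.8, 1.4, 36.5])]
covs = [0.04 * np.eye(3), 0.05 * np.eye(3), 0.6 * np.eye(3)]
mean, cov = fuse_gaussian_positions(means, covs)
print(mean)  # close to the two confident estimates
```

Under this reading, an occluded or poorly detected part simply contributes a large covariance and is softly ignored, which is consistent with the robustness claim above.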
Results: Evaluated on synthetic benchmarks and real flight experiments, the method achieves positional errors of approximately 0.8% and 1.0% of the ship–UAV distance, respectively (roughly 0.8 m and 1.0 m at a 100 m standoff), demonstrating high accuracy, generalization to unseen scenarios, and practical applicability to ship-based autonomous UAV landing and navigation.
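The abstract does not define the metric beyond "percentage of the distance to the ship"; one plausible reading, shown here as an assumption, normalizes the translation error by the ground-truth range:

```python
import numpy as np

def relative_position_error_pct(t_est, t_gt):
    """Translation error as a percentage of the UAV-ship distance.

    t_est, t_gt : (3,) estimated / ground-truth relative translations (m)
    """
    return 100.0 * np.linalg.norm(t_est - t_gt) / np.linalg.norm(t_gt)

# A 0.8 m error at a 100 m standoff is a 0.8% relative error.
print(relative_position_error_pct(np.array([0.0, 0.8, 100.0]),
                                  np.array([0.0, 0.0, 100.0])))  # ~0.8
```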
📝 Abstract
This paper introduces a deep Transformer network for estimating the relative 6D pose of an Unmanned Aerial Vehicle (UAV) with respect to a ship from monocular images. A synthetic dataset of ship images is created and annotated with 2D keypoints of multiple ship parts. A Transformer neural network is trained to detect these keypoints and estimate the 6D pose of each part, and the per-part estimates are integrated using Bayesian fusion. The model is tested on synthetic data and in-situ flight experiments, demonstrating robustness and accuracy under various lighting conditions. The position estimation error is approximately 0.8% and 1.0% of the distance to the ship for the synthetic data and the flight experiments, respectively. The method has potential applications in ship-based autonomous UAV landing and navigation.
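The abstract leaves open whether the network regresses each part's pose directly or whether a geometric solver sits between keypoints and pose. A classical baseline for the keypoints-to-pose step, offered here only as an illustrative assumption rather than the paper's method, is a Perspective-n-Point (PnP) solve per part against that part's known 3D keypoint model, e.g. with OpenCV:

```python
import numpy as np
import cv2

def part_pose_from_keypoints(model_pts_3d, image_pts_2d, K):
    """Recover one ship part's 6D pose (camera frame) from its detected
    2D keypoints and known 3D keypoint model via PnP (needs >= 4 points).

    model_pts_3d : (N, 3) keypoint coordinates in the part's frame (m)
    image_pts_2d : (N, 2) detected pixel coordinates
    K            : (3, 3) camera intrinsic matrix
    """
    ok, rvec, tvec = cv2.solvePnP(
        model_pts_3d.astype(np.float64),
        image_pts_2d.astype(np.float64),
        K.astype(np.float64),
        distCoeffs=None)
    if not ok:
        raise RuntimeError("PnP failed for this part")
    R, _ = cv2.Rodrigues(rvec)    # rotation vector -> 3x3 rotation matrix
    return R, tvec.reshape(3)     # part pose in the camera frame
```

Each part's (R, t) could then feed a fusion step such as the one sketched after the method summary above.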