Deep Transformer Network for Monocular Pose Estimation of Ship-Based UAV

📅 2024-06-13
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Monocular visual landing of shipborne UAVs demands accurate and robust 6D relative pose estimation under severe challenges—including scarcity of real-world training data and complex, dynamic maritime illumination conditions. Method: This paper proposes a depth-enhanced Transformer-based framework for 6D relative pose estimation. To address data scarcity and illumination variability, the method leverages synthetically generated ship images for training, incorporates multi-part 2D keypoint detection, and introduces a Bayesian fusion strategy to enhance robustness against occlusion and noise. Crucially, it is the first work to adapt the Transformer architecture to monocular ship–UAV 6D pose estimation, enabling end-to-end joint optimization of keypoint localization and geometric constraints. Results: Evaluated on synthetic benchmarks and real flight experiments, the method achieves positional errors within 0.8% and 1.0% of the ship–UAV distance, respectively—demonstrating high accuracy, strong generalization to unseen scenarios, and practical deployability in real-world naval operations.

📝 Abstract
This paper introduces a deep transformer network for estimating the relative 6D pose of an Unmanned Aerial Vehicle (UAV) with respect to a ship from monocular images. A synthetic dataset of ship images is created and annotated with 2D keypoints of multiple ship parts. A Transformer neural network is trained to detect these keypoints and estimate the 6D pose of each part, and the per-part estimates are integrated using Bayesian fusion. The model is evaluated on synthetic data and in in-situ flight experiments, demonstrating robustness and accuracy under varied lighting conditions. The position estimation error is approximately 0.8% of the distance to the ship for the synthetic data and 1.0% for the flight experiments. The method has potential applications in ship-based autonomous UAV landing and navigation.
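The reported accuracy metric, position error expressed as a percentage of the ship–UAV distance, can be computed as in this minimal sketch (the estimated and ground-truth positions below are hypothetical, not values from the paper):

```python
import numpy as np

# Hypothetical UAV positions in a ship-fixed frame (meters).
est = np.array([48.2, 3.1, 12.0])   # estimated position
gt = np.array([48.0, 3.0, 12.2])    # ground-truth position

distance = np.linalg.norm(gt)                        # range to the ship
error_pct = 100 * np.linalg.norm(est - gt) / distance
print(f"position error: {error_pct:.2f}% of range")  # ≈ 0.60% here
```

Normalizing by range makes results comparable across approach distances, which is why the paper quotes 0.8% and 1.0% rather than absolute meters.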
Problem

Research questions and friction points this paper is trying to address.

Estimating a UAV's 6D pose relative to a ship from monocular images
Detecting ship-part keypoints with a Transformer neural network
Improving the accuracy of autonomous UAV landing and navigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep transformer network for 6D pose estimation
Synthetic dataset with annotated 2D keypoints
Bayesian fusion integrates per-part pose estimates
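The fusion step above can be illustrated with a minimal sketch: independent Gaussian position estimates (one per detected ship part) combined in the information (inverse-covariance) form, so that more confident parts carry more weight. This is a generic Bayesian fusion of Gaussians under an independence assumption, not the paper's exact formulation; all values are hypothetical.

```python
import numpy as np

def fuse_pose_estimates(means, covariances):
    """Fuse independent Gaussian position estimates via the
    information form: accumulate inverse covariances (information
    matrices) and information vectors, then invert back."""
    info = np.zeros((3, 3))   # accumulated information matrix
    info_mean = np.zeros(3)   # accumulated information vector
    for mu, cov in zip(means, covariances):
        cov_inv = np.linalg.inv(cov)
        info += cov_inv
        info_mean += cov_inv @ mu
    fused_cov = np.linalg.inv(info)
    fused_mean = fused_cov @ info_mean
    return fused_mean, fused_cov

# Two hypothetical per-part estimates: one confident, one noisy.
m1, c1 = np.array([10.0, 0.0, 5.0]), 0.1 * np.eye(3)
m2, c2 = np.array([10.6, 0.3, 5.3]), 0.9 * np.eye(3)
mean, cov = fuse_pose_estimates([m1, m2], [c1, c2])
# The fused mean lies closer to the more confident estimate m1,
# and the fused covariance is tighter than either input.
```

Fusing per-part estimates this way is what gives the method robustness to occlusion: a part whose keypoints are occluded or noisy contributes a high-covariance estimate and is effectively down-weighted.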
Maneesha Wickramasuriya
The George Washington University
Taeyoung Lee
The George Washington University, 800 22nd St NW, Washington DC 20052
Murray Snyder
The George Washington University, 800 22nd St NW, Washington DC 20052