ATR-UMMIM: A Benchmark Dataset for UAV-Based Multimodal Image Registration under Complex Imaging Conditions

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of a dedicated multimodal image registration benchmark for unmanned aerial vehicle (UAV) aerial scenarios, this paper introduces ATR-UMMIM, the first publicly available benchmark designed for complex imaging conditions. The dataset comprises 7,969 triplets of raw visible, infrared, and precisely registered visible images, captured across multiple altitudes, viewing angles, and weather conditions. Each triplet carries pixel-level registration ground truth, produced with a semi-automated pipeline to ensure high-precision labeling, and is annotated with six imaging condition attributes, a novel annotation introduced by this paper. ATR-UMMIM additionally provides 77,753 visible-light and 78,409 infrared bounding boxes across 11 object classes. Together, these annotations enable robust training and evaluation of multimodal registration, fusion, and detection algorithms, as well as downstream task research. ATR-UMMIM thus establishes a foundational resource for advancing multimodal perception in challenging UAV-based remote sensing applications.

📝 Abstract
Multimodal fusion has become a key enabler for UAV-based object detection, as each modality provides complementary cues for robust feature extraction. However, due to significant differences in resolution, field of view, and sensing characteristics across modalities, accurate registration is a prerequisite before fusion. Despite its importance, there is currently no publicly available benchmark specifically designed for multimodal registration in UAV-based aerial scenarios, which severely limits the development and evaluation of advanced registration methods under real-world conditions. To bridge this gap, we present ATR-UMMIM, the first benchmark dataset specifically tailored for multimodal image registration in UAV-based applications. The dataset includes 7,969 triplets of raw visible, infrared, and precisely registered visible images captured across diverse scenarios, covering flight altitudes from 80 m to 300 m, camera angles from 0° to 75°, and all-day, all-year temporal variation under rich weather and illumination conditions. To ensure high registration quality, we design a semi-automated annotation pipeline that provides reliable pixel-level ground truth for each triplet. In addition, each triplet is annotated with six imaging condition attributes, enabling benchmarking of registration robustness under real-world deployment settings. To further support downstream tasks, we provide object-level annotations on all registered images, covering 11 object categories with 77,753 visible and 78,409 infrared bounding boxes. We believe ATR-UMMIM will serve as a foundational benchmark for advancing multimodal registration, fusion, and perception in real-world UAV scenarios. The dataset can be downloaded from https://github.com/supercpy/ATR-UMMIM
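The GitHub release defines the actual file layout; as a rough sketch of how a triplet as described above might be consumed, the snippet below assumes a hypothetical per-split directory of visible/, infrared/, and registered/ images plus per-sample JSON annotations. All directory, file, and field names here are illustrative, not the official format.

```python
from dataclasses import dataclass
from pathlib import Path
import json

import cv2  # pip install opencv-python
import numpy as np


@dataclass
class Triplet:
    """One sample: raw visible, infrared, and registered visible image."""
    visible: np.ndarray     # raw visible image (H_v, W_v, 3)
    infrared: np.ndarray    # infrared image (H_i, W_i)
    registered: np.ndarray  # visible image warped into the infrared frame
    conditions: dict        # six imaging-condition attributes
    boxes: list             # object bounding boxes over 11 categories


def load_triplet(root: Path, stem: str) -> Triplet:
    # Directory layout and JSON keys are assumptions, not the release spec.
    vis = cv2.imread(str(root / "visible" / f"{stem}.jpg"))
    ir = cv2.imread(str(root / "infrared" / f"{stem}.jpg"), cv2.IMREAD_GRAYSCALE)
    reg = cv2.imread(str(root / "registered" / f"{stem}.jpg"))
    meta = json.loads((root / "annotations" / f"{stem}.json").read_text())
    return Triplet(vis, ir, reg, meta["conditions"], meta["boxes"])
```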
Problem

Research questions and friction points this paper is trying to address.

Lack of benchmark for UAV multimodal image registration
Difficulty aligning modalities that differ in resolution, field of view, and sensing characteristics
Need for robust registration under real-world conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

First UAV multimodal registration benchmark dataset
Semi-automated annotation for precise ground truth (a generic registration sketch follows this list)
Includes diverse real-world imaging conditions
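This page does not detail the authors' semi-automated pipeline. As a generic illustration of how pixel-level registration ground truth is often produced, and explicitly not the paper's method, the sketch below fits a RANSAC homography to cross-modal point correspondences (e.g., proposed by a feature matcher and verified by a human annotator, the "semi-automated" part) and warps the visible image into the infrared frame.

```python
import cv2  # pip install opencv-python
import numpy as np


def register_visible_to_ir(visible, infrared, pts_vis, pts_ir):
    """Warp a visible image into the infrared frame from matched points.

    pts_vis, pts_ir: (N, 2) arrays of corresponding points; N >= 4 is
    required to estimate a homography.
    """
    H, inlier_mask = cv2.findHomography(
        np.asarray(pts_vis, dtype=np.float32),
        np.asarray(pts_ir, dtype=np.float32),
        method=cv2.RANSAC,
        ransacReprojThreshold=3.0,
    )
    h, w = infrared.shape[:2]
    registered = cv2.warpPerspective(visible, H, (w, h))
    return registered, H, inlier_mask
```

Note that a planar homography is only an approximation for scenes with significant 3D structure or at low altitudes, which is one reason human verification of correspondences matters in such pipelines.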
👥 Authors
Kangcheng Bin
School of Electronic Science and Technology, National University of Defense Technology, Changsha 410003, China
Chen Chen
School of Electronic Science and Technology, National University of Defense Technology, Changsha 410003, China
Ting Hu
Associate Professor, School of Computing, Queen's University, Canada
Explainable AI, Evolutionary Computing, Machine Learning, Bioinformatics
Jiahao Qi
School of Electronic Science and Technology, National University of Defense Technology, Changsha 410003, China
Ping Zhong
University of Houston