SAGA: Semantic-Aware Gray color Augmentation for Visible-to-Thermal Domain Adaptation across Multi-View Drone and Ground-Based Vision Systems

πŸ“… 2025-04-22
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In RGB-to-thermal infrared (IR) unsupervised domain adaptation (UDA) for object detection, poor pseudo-label quality and high false-positive rates arise from the absence of color and texture cues in IR imagery. To address this, the authors propose Semantic-Aware Gray color Augmentation (SAGA), which extracts object-level IR-relevant semantic features and employs semantic-segmentation-guided grayscale mapping to mitigate inter-modal color bias. They also introduce IndraEye, a multi-view, multi-temporal paired RGB-IR dataset tailored for airborne-ground collaborative perception. Integrated into the UDA-Faster R-CNN framework with high-precision multi-sensor calibration, SAGA achieves mAP improvements of 0.4-7.6% on both autonomous driving benchmarks and IndraEye, significantly enhancing pseudo-label reliability and cross-domain detection robustness. The code and dataset are publicly released.

πŸ“ Abstract
Domain-adaptive thermal object detection plays a key role in facilitating visible (RGB)-to-thermal (IR) adaptation by reducing the need for co-registered image pairs and minimizing reliance on large annotated IR datasets. However, inherent limitations of IR images, such as the lack of color and texture cues, pose challenges for RGB-trained models, leading to increased false positives and poor-quality pseudo-labels. To address this, we propose Semantic-Aware Gray color Augmentation (SAGA), a novel strategy for mitigating color bias and bridging the domain gap by extracting object-level features relevant to IR images. Additionally, to validate the proposed SAGA for drone imagery, we introduce IndraEye, a multi-sensor (RGB-IR) dataset designed for diverse applications. The dataset contains 5,612 images with 145,666 instances, captured from diverse angles, altitudes, backgrounds, and times of day, offering valuable opportunities for multimodal learning, domain adaptation for object detection and segmentation, and exploration of sensor-specific strengths and weaknesses. IndraEye aims to support the development of more robust and accurate aerial perception systems, especially in challenging environments. Experimental results show that SAGA significantly improves RGB-to-IR adaptation on autonomous driving benchmarks and the IndraEye dataset, achieving consistent performance gains of +0.4% to +7.6% (mAP) when integrated with state-of-the-art domain adaptation techniques. The dataset and code are available at https://github.com/airliisc/IndraEye.
Problem

Research questions and friction points this paper is trying to address.

Bridging RGB-to-thermal domain gap for object detection
Addressing color and texture limitations in IR images
Enhancing drone-based perception with multi-sensor datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic-Aware Gray color Augmentation (SAGA) for domain adaptation
IndraEye multi-sensor dataset for RGB-IR validation
Improves RGB-to-IR adaptation with +0.4% to +7.6% mAP
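The semantic-segmentation-guided grayscale mapping at the heart of SAGA can be loosely sketched as follows. This is an illustrative approximation only, not the paper's implementation: the function name `semantic_gray_augment` and the use of ITU-R BT.601 luma weights are assumptions for the example. The idea shown is that pixels selected by a semantic mask are mapped to grayscale (removing color cues that are absent in IR), while the rest of the RGB image is left unchanged.

```python
import numpy as np

def semantic_gray_augment(rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Map masked pixels of an RGB image to grayscale.

    A rough sketch of semantic-segmentation-guided grayscale mapping
    (not the paper's exact method): object pixels given by a boolean
    semantic mask lose their color information, approximating the
    appearance statistics of thermal imagery for those regions.

    rgb:  H x W x 3 uint8 image.
    mask: H x W boolean semantic mask (True = apply grayscale).
    """
    rgb_f = rgb.astype(np.float32)
    # BT.601 luma weights -- an assumed choice for this sketch.
    luma = rgb_f @ np.array([0.299, 0.587, 0.114], dtype=np.float32)  # H x W
    out = rgb_f.copy()
    out[mask] = luma[mask, None]  # broadcast luma across the 3 channels
    return out.astype(np.uint8)
```

In a UDA training loop, such an augmentation would typically be applied to the labeled RGB source images (guided by their segmentation masks) before the detector sees them, nudging the model away from color-dependent features.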
πŸ”Ž Similar Papers
No similar papers found.
D. Manjunath
Department of Aerospace Engineering, Indian Institute of Science, Bengaluru, India
Aniruddh Sikdar
Robert Bosch Centre for Cyber Physical Systems, Indian Institute of Science
Machine Learning, Deep Learning, Computer Vision
Prajwal Gurunath
Robotics Graduate Student at CMU
Computer Vision, Deep Learning, Robotics, Humanoid
Sumanth Udupa
University of Queensland
Machine Learning, Deep Learning, Computer Vision, Robotics, Signal Processing
Suresh Sundaram
Department of Aerospace Engineering, Indian Institute of Science, Bengaluru, India; Robert Bosch Centre for Cyber Physical Systems, Indian Institute of Science, Bengaluru, India