ALDI-ray: Adapting the ALDI Framework for Security X-ray Object Detection

📅 2025-12-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address domain shift in security X-ray images caused by variations in scanning equipment and imaging environments, this paper applies the ALDI++ framework to cross-domain object detection. The approach adopts ViTDet as the backbone network and, in a first for security X-ray detection, integrates self-distillation, feature-level domain alignment, and enhanced training strategies to improve generalization across domains. Evaluated on the EDS dataset, the method achieves the highest mAP among the compared approaches, with consistent accuracy improvements across object categories. Key contributions include: (1) extending ALDI++ to the X-ray security screening domain; (2) empirically validating the effectiveness of Transformer-based architectures for cross-domain X-ray detection; and (3) establishing an end-to-end trainable domain-adaptive detection pipeline. The framework targets domain robustness in real-world security inspection systems.

📝 Abstract

Domain adaptation in object detection is critical for real-world applications where distribution shifts degrade model performance. Security X-ray imaging presents a unique challenge due to variations in scanning devices and environmental conditions, leading to significant domain discrepancies. To address this, we apply ALDI++, a domain adaptation framework that integrates self-distillation, feature alignment, and enhanced training strategies to mitigate domain shift effectively in this area. We conduct extensive experiments on the EDS dataset, demonstrating that ALDI++ surpasses the state-of-the-art (SOTA) domain adaptation methods across multiple adaptation scenarios. In particular, ALDI++ with a Vision Transformer for Detection (ViTDet) backbone achieves the highest mean average precision (mAP), confirming the effectiveness of transformer-based architectures for cross-domain object detection. Additionally, our category-wise analysis highlights consistent improvements in detection accuracy, reinforcing the robustness of the model across diverse object classes. Our findings establish ALDI++ as an efficient solution for domain-adaptive object detection, setting a new benchmark for performance stability and cross-domain generalization in security X-ray imagery.

Problem

Research questions and friction points this paper is trying to address.

Addresses domain shift in security X-ray object detection.

Improves cross-domain generalization with the ALDI++ framework.

Enhances detection accuracy across diverse object classes.

Innovation

Methods, ideas, or system contributions that make the work stand out.

ALDI++ integrates self-distillation and feature alignment.

It uses a Vision Transformer for Detection (ViTDet) backbone.

It achieves the highest mAP among the compared methods on the EDS dataset.
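
As a rough illustration of the self-distillation idea named above: student-teacher detectors commonly keep the teacher as an exponential moving average (EMA) of the student's weights. The sketch below shows only that update rule on plain Python dictionaries. The function name `ema_update`, the toy weights, and the decay value are assumptions for illustration and are not taken from the paper or from the ALDI++ code.

```python
# Illustrative sketch only: an EMA teacher update of the kind often used in
# self-distillation. `ema_update`, the toy weights, and the decay value are
# hypothetical and not taken from the paper.

def ema_update(teacher, student, decay=0.999):
    """Return new teacher weights: decay * teacher + (1 - decay) * student."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }

# Toy "models" with two scalar parameters each.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}

# One update step: the teacher moves slightly toward the student.
teacher = ema_update(teacher, student, decay=0.9)
print(teacher)
```

In a real detector the same rule would be applied to every parameter tensor after each student optimization step, and the slowly changing teacher would produce the pseudo-labels the student is trained on for target-domain images.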
