Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

2025-03-03
AI Summary
This work addresses domain generalization for object detection (DG-OD) in the setting where no target-domain data is available. It proposes a diffusion-based framework that aligns a detector with a diffusion model at two granularities. Rather than using diffusion models only for generation, the method uses their multi-step intermediate denoising features as robust, domain-invariant representations. Building on this, alignment at both the feature level and the object level transfers the diffusion model's generalization ability to the detector, without access to target-domain data and without additional inference overhead. Across six DG benchmarks, the method improves on existing domain generalization approaches by 14.0% mAP, and the diffusion-guided detectors improve on the baseline by 15.9% mAP on average. These results support the usefulness of diffusion model intermediate representations for generalization in object detection.

Abstract
Domain generalization (DG) for object detection aims to enhance detectors' performance in unseen scenarios. This task remains challenging due to complex variations in real-world applications. Recently, diffusion models have demonstrated remarkable capabilities in diverse scene generation, which inspires us to explore their potential for improving DG tasks. Instead of generating images, our method extracts multi-step intermediate features during the diffusion process to obtain domain-invariant features for generalized detection. Furthermore, we propose an efficient knowledge transfer framework that enables detectors to inherit the generalization capabilities of diffusion models through feature and object-level alignment, without increasing inference time. We conduct extensive experiments on six challenging DG benchmarks. The results demonstrate that our method achieves substantial improvements of 14.0% mAP over existing DG approaches across different domains and corruption types. Notably, our method even outperforms most domain adaptation methods without accessing any target domain data. Moreover, the diffusion-guided detectors show consistent improvements of 15.9% mAP on average compared to the baseline. Our work aims to present an effective approach for domain-generalized detection and provide potential insights for robust visual recognition in real-world scenarios. The code is available at https://github.com/heboyong/Generalized-Diffusion-Detector (Generalized Diffusion Detector).
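The idea of collecting intermediate features at several diffusion steps can be illustrated with a minimal sketch. It assumes a linear DDPM-style noise schedule and uses the standard forward-noising formula; `toy_denoiser_features` is a hypothetical stand-in for a real diffusion network's intermediate activations, and averaging across timesteps is an assumed aggregation rule, not necessarily the one the authors use.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear DDPM-style schedule."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def add_noise(x0, alpha_bar_t, rng):
    """Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * v + b * rng.gauss(0.0, 1.0) for v in x0]

def toy_denoiser_features(x_t):
    """Hypothetical placeholder for a diffusion network's intermediate features."""
    return [math.tanh(v) for v in x_t]

def multi_step_features(x0, timesteps, alpha_bar, rng):
    """Collect features at several noise levels and average them (assumed rule)."""
    collected = [
        toy_denoiser_features(add_noise(x0, alpha_bar[t], rng)) for t in timesteps
    ]
    return [sum(column) / len(collected) for column in zip(*collected)]

rng = random.Random(0)
alpha_bar = linear_alpha_bar(1000)
features = multi_step_features([0.5, -1.0, 2.0, 0.0], [50, 150, 250], alpha_bar, rng)
print(len(features))  # one aggregated value per input element
```

In a real pipeline the input would be an image (or its latent) and the features would be multi-channel maps taken from inside the denoising network; the sketch only shows the control flow of noising at several steps and aggregating the results.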
Problem

Research questions and friction points this paper is trying to address.

Improve object detection performance in unseen scenarios through domain generalization.
Obtain domain-invariant features from diffusion models for robust detection.
Transfer this generalization ability to detectors without increasing inference time.
Innovation

Methods, ideas, or system contributions that make the work stand out.

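Extracts multi-step intermediate features from the diffusion process as domain-invariant features for detection.
Transfers knowledge to the detector through feature-level and object-level alignment, without increasing inference time.
Achieves a 14.0% mAP improvement over existing DG methods across different domains and corruption types.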
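A training-time alignment objective of the kind described above might look like the following sketch. The specific loss terms are assumptions chosen for illustration: mean squared error for feature-level alignment and a KL divergence between per-object class distributions for object-level alignment. The function names and weights are hypothetical and are not taken from the paper or its repository.

```python
import math

def mse(student, teacher):
    """Feature-level term: mean squared error between two feature vectors."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(student_feat, teacher_feat, student_logits, teacher_logits,
                   feature_weight=1.0, object_weight=1.0):
    """Weighted sum of a feature-level and an object-level alignment term.

    student_logits / teacher_logits hold one list of class logits per object.
    """
    feat_term = mse(student_feat, teacher_feat)
    obj_term = sum(
        kl_divergence(softmax(t), softmax(s))
        for s, t in zip(student_logits, teacher_logits)
    ) / len(student_logits)
    return feature_weight * feat_term + object_weight * obj_term

# Identical student and teacher outputs give zero loss.
same = alignment_loss([0.1, 0.2], [0.1, 0.2], [[1.0, 2.0]], [[1.0, 2.0]])
# Mismatched outputs give a positive loss.
diff = alignment_loss([0.1, 0.2], [0.5, -0.3], [[1.0, 2.0]], [[3.0, 0.0]])
print(same, diff > 0.0)
```

Because the diffusion-based teacher appears only inside this training loss, a detector trained this way would run on its own at inference, which is consistent with the claim of no added inference time.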
Authors

Boyong He, Xiamen University
Yuxiang Ji, Xiamen University
Qianwen Ye, Xiamen University
Zhuoyue Tan, Institute of Artificial Intelligence, Xiamen University
Liaoni Wu, Institute of Artificial Intelligence, Xiamen University; School of Aerospace Engineering, Xiamen University