Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector

📅 2024-10-28
🏛️ ACM Multimedia
📈 Citations: 2
Influential: 0
🤖 AI Summary
In cross-domain object detection, distribution shift between the source and target domains severely degrades detector performance. To address this, we propose a diffusion-driven domain adaptation method that does not fine-tune the diffusion model: a detector built on a frozen pre-trained diffusion model serves as a teacher that generates pseudo-labels on unlabeled target-domain data, and these pseudo-labels, together with feature-level and knowledge-level distillation, jointly supervise the student detector's training. This is the first work to use a frozen diffusion model as a feature teacher in cross-domain detection, balancing inference efficiency against generalization capability. The approach improves average mAP by 21.2% over the baseline on six datasets from three cross-domain benchmarks, surpasses state-of-the-art methods by an average of 5.7% mAP, and is compatible with diverse mainstream detector architectures.

📝 Abstract
Object detectors often suffer a decrease in performance due to the large domain gap between the training data (source domain) and real-world data (target domain). Diffusion-based generative models have shown remarkable abilities in generating high-quality and diverse images, suggesting their potential for extracting valuable features from various domains. To effectively leverage the cross-domain feature representation of diffusion models, in this paper we train a detector with a frozen-weight diffusion model on the source domain, then employ it as a teacher model to generate pseudo labels on the unlabeled target domain, which are used to guide the supervised learning of the student model on the target domain. We refer to this approach as Diffusion Domain Teacher (DDT). By employing this straightforward yet potent framework, we significantly improve cross-domain object detection performance without compromising inference speed. Our method achieves an average mAP improvement of 21.2% over the baseline on 6 datasets from three common cross-domain detection benchmarks (Cross-Camera, Syn2Real, Real2Artistic), surpassing the current state-of-the-art (SOTA) methods by an average of 5.7% mAP. Furthermore, extensive experiments demonstrate that our method consistently brings improvements even in more powerful and complex models, highlighting the broadly applicable and effective domain adaptation capability of DDT.
Problem

Research questions and friction points this paper is trying to address.

Reducing performance drop in object detectors across domains
Leveraging diffusion models for cross-domain feature extraction
Improving domain adaptation without sacrificing inference speed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses frozen-weight diffusion model as teacher
Generates pseudo labels for target domain
Improves cross-domain detection performance significantly
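
The pseudo-labeling step listed above can be illustrated with a minimal sketch. It assumes the teacher returns scored boxes for each unlabeled target image and that only detections above a confidence threshold are kept as training targets for the student. The `Detection` type, the `select_pseudo_labels` helper, the threshold value, and the sample detections are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One teacher prediction on a target-domain image (hypothetical type)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    label: int                              # class index
    score: float                            # teacher confidence in [0, 1]


def select_pseudo_labels(teacher_dets: List[Detection],
                         score_thresh: float = 0.5) -> List[Detection]:
    """Keep only detections confident enough to act as student targets.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    return [d for d in teacher_dets if d.score >= score_thresh]


# Made-up teacher output for a single unlabeled target image.
teacher_output = [
    Detection((10, 20, 110, 220), label=0, score=0.92),
    Detection((50, 60, 90, 100), label=1, score=0.31),
    Detection((200, 40, 260, 120), label=0, score=0.75),
]

pseudo_labels = select_pseudo_labels(teacher_output, score_thresh=0.5)
print(len(pseudo_labels))  # 2: the low-confidence box is discarded
```

In a full pipeline the retained boxes would stand in for ground-truth annotations when computing the student's detection loss on target-domain images. Only the student would then be needed at inference time, which is one plausible reading of why the abstract reports no loss of inference speed.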
Authors

Boyong He, Xiamen University (CV)
Yuxiang Ji, Xiamen University
Zhuoyue Tan, Xiamen University, Institute of Artificial Intelligence, Xiamen, China
Liaoni Wu, Xiamen University, Institute of Artificial Intelligence, School of Aerospace Engineering, Xiamen, China