Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector

📅 2024-10-28

🏛️ ACM Multimedia

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

In cross-domain object detection, distribution shift between the source and target domains severely degrades detector performance. To address this, we propose a diffusion-driven domain adaptation method that requires no fine-tuning of the diffusion model: a detector built on a frozen pre-trained diffusion model is trained on the source domain and then serves as a teacher that generates pseudo-labels on unlabeled target-domain data; these pseudo-labels, together with feature-level and knowledge-level distillation, jointly supervise the student detector's training. This is the first work to leverage a frozen diffusion model as a feature teacher in cross-domain detection, and it improves generalization without compromising inference speed. Our approach improves average mAP by 21.2% over the baseline on six datasets from three cross-domain benchmarks, surpasses state-of-the-art methods by an average of 5.7% mAP, and is compatible with diverse mainstream detector architectures.

📝 Abstract

Object detectors often suffer a decrease in performance due to the large domain gap between the training data (source domain) and real-world data (target domain). Diffusion-based generative models have shown remarkable abilities in generating high-quality and diverse images, suggesting their potential for extracting valuable features from various domains. To effectively leverage the cross-domain feature representation of diffusion models, in this paper we train a detector with a frozen-weight diffusion model on the source domain, then employ it as a teacher model to generate pseudo labels on the unlabeled target domain; these pseudo labels guide the supervised learning of the student model on the target domain. We refer to this approach as Diffusion Domain Teacher (DDT). By employing this straightforward yet potent framework, we significantly improve cross-domain object detection performance without compromising inference speed. Our method achieves an average mAP improvement of 21.2% over the baseline on 6 datasets from three common cross-domain detection benchmarks (Cross-Camera, Syn2Real, Real2Artistic), surpassing the current state-of-the-art (SOTA) methods by an average of 5.7% mAP. Furthermore, extensive experiments demonstrate that our method consistently brings improvements even with more powerful and complex models, highlighting the broadly applicable and effective domain adaptation capability of DDT.
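The teacher-to-student step described above can be illustrated with a minimal sketch. The abstract does not say how the teacher's detections are turned into pseudo labels, so the confidence-threshold filter below is an assumption borrowed from common teacher-student pipelines, not the paper's confirmed procedure. The names `Detection`, `select_pseudo_labels`, and `score_threshold` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One predicted box from the teacher (hypothetical structure)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    label: int                              # class index
    score: float                            # teacher confidence in [0, 1]


def select_pseudo_labels(
    teacher_detections: List[Detection],
    score_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[Detection]:
    """Keep only the teacher detections confident enough to act as
    training targets for the student on an unlabeled target image."""
    return [d for d in teacher_detections if d.score >= score_threshold]


# Toy teacher output for a single unlabeled target-domain image.
teacher_out = [
    Detection((10.0, 20.0, 110.0, 220.0), label=0, score=0.92),
    Detection((50.0, 60.0, 90.0, 100.0), label=1, score=0.31),
    Detection((200.0, 40.0, 300.0, 180.0), label=0, score=0.67),
]

pseudo_labels = select_pseudo_labels(teacher_out, score_threshold=0.5)
for d in pseudo_labels:
    print(d.label, d.score, d.box)
```

In this sketch the low-confidence detection is discarded and the remaining boxes would be passed to the student's usual supervised detection loss. Only the student runs at test time in such a setup, which is consistent with the abstract's claim that inference speed is not compromised.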

Problem

Research questions and friction points this paper is trying to address.

Reducing performance drop in object detectors across domains

Leveraging diffusion models for cross-domain feature extraction

Improving domain adaptation without sacrificing inference speed

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses frozen-weight diffusion model as teacher

Generates pseudo labels for target domain

Improves average mAP by 21.2% over the baseline and by 5.7% over prior SOTA methods
