🤖 AI Summary
In unsupervised domain adaptive object detection (DAOD), large distribution shifts between source domains (e.g., synthetic or normal-weather images) and target domains (e.g., adverse weather or low-light scenes), coupled with the reliance of existing methods on complex adversarial training or auxiliary models, hinder practical deployment. To address this, we propose a lightweight frequency-domain style transfer framework. Our key innovation is a phase-guided amplitude spectrum generation mechanism, introduced here for the first time: in the Fourier domain, it adaptively modulates the source image's amplitude spectrum to align with target-domain statistics while strictly preserving the source's phase structure, thereby maintaining geometric consistency. The method requires only a single learnable frequency-domain preprocessing module and incurs zero inference overhead after training. Extensive experiments on multiple DAOD benchmarks demonstrate significant performance gains, particularly under adverse weather and low-light conditions, with mAP improvements of 3.2–5.8%. These results validate the framework's simplicity, effectiveness, and practicality.
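To make the amplitude/phase split concrete, here is a minimal NumPy sketch of the general Fourier-domain idea the summary describes: borrowing low-frequency amplitude (style) from a target image while keeping the source phase (geometry). Note this uses a fixed interpolation band controlled by a hypothetical `beta` parameter, as a stand-in for the paper's learnable, phase-guided amplitude generation, whose details are not given here.

```python
import numpy as np

def fourier_style_transfer(source, target, beta=0.1):
    """Swap the low-frequency amplitude band of `source` with that of
    `target` while keeping the source's phase spectrum intact.

    source, target: float arrays of shape (H, W).
    beta: fraction of the spectrum side length to swap (illustrative
    fixed hyperparameter, not the paper's learned modulation).
    """
    fft_src = np.fft.fft2(source)
    fft_tgt = np.fft.fft2(target)

    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_tgt = np.abs(fft_tgt)

    # Center the zero frequency so the low-frequency band (which
    # carries most style/illumination information) is a central square.
    amp_src = np.fft.fftshift(amp_src)
    amp_tgt = np.fft.fftshift(amp_tgt)

    h, w = source.shape
    b = int(min(h, w) * beta)
    ch, cw = h // 2, w // 2
    amp_src[ch - b:ch + b, cw - b:cw + b] = \
        amp_tgt[ch - b:ch + b, cw - b:cw + b]

    amp_src = np.fft.ifftshift(amp_src)

    # Recombine the swapped amplitude with the ORIGINAL source phase,
    # preserving object geometry (and hence the detection labels).
    stylized = np.fft.ifft2(amp_src * np.exp(1j * pha_src))
    return np.real(stylized)
```

Because only the amplitude is altered, the source annotations remain valid for the stylized image, which is what makes this family of transforms attractive for detector training.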
📝 Abstract
Unsupervised domain adaptation (UDA) greatly facilitates the deployment of neural networks across diverse environments. However, most state-of-the-art approaches are overly complex, relying on challenging adversarial training strategies or on elaborate architectural designs with auxiliary models for feature distillation and pseudo-label generation. In this work, we present a simple yet effective UDA method that learns to adapt image styles in the frequency domain to reduce the discrepancy between source and target domains. The proposed approach introduces only a lightweight pre-processing module during training and discards it entirely at inference time, thus incurring no additional computational overhead. We validate our method on domain adaptive object detection (DAOD) tasks, where ground-truth annotations are easily accessible in source domains (e.g., normal-weather or synthetic conditions) but challenging to obtain in target domains (e.g., adverse weather or low-light scenes). Extensive experiments demonstrate that our method achieves substantial performance gains on multiple benchmarks, highlighting its practicality and effectiveness.