🤖 AI Summary
Deep learning models often suffer performance degradation under test-time distribution shifts, and while multi-source domain generalization (MDG) mitigates this, it requires costly, labor-intensive multi-domain datasets, making it inapplicable to single-source domain generalization (SDG) settings. Method: We propose pseudo-multi-source domain generalization (PMDG), a framework that synthesizes multiple semantically consistent yet stylistically diverse pseudo-domains from a single source domain via style transfer (e.g., AdaIN) and diversity-aware data augmentation, enabling direct application of existing MDG algorithms (e.g., ERM, CORAL, Mixup-MDG). Contribution/Results: Evaluated on our benchmark PseudoDomainBed, PMDG significantly improves generalization from a single source; with sufficient pseudo-domains, it matches or surpasses real multi-source MDG performance; and MDG and PMDG efficacy are strongly positively correlated. Code and benchmark are open-sourced to standardize SDG evaluation.
📄 Abstract
Deep learning models often struggle to maintain performance when deployed on data distributions different from their training data, particularly in real-world applications where environmental conditions frequently change. While Multi-source Domain Generalization (MDG) has shown promise in addressing this challenge by leveraging multiple source domains during training, its practical application is limited by the significant costs and difficulties associated with creating multi-domain datasets. To address this limitation, we propose Pseudo Multi-source Domain Generalization (PMDG), a novel framework that enables the application of sophisticated MDG algorithms in more practical Single-source Domain Generalization (SDG) settings. PMDG generates multiple pseudo-domains from a single source domain through style transfer and data augmentation techniques, creating a synthetic multi-domain dataset that can be used with existing MDG algorithms. Through extensive experiments with PseudoDomainBed, our modified version of the DomainBed benchmark, we analyze the effectiveness of PMDG across multiple datasets and architectures. Our analysis reveals several key findings, including a positive correlation between MDG and PMDG performance and the potential of pseudo-domains to match or exceed actual multi-domain performance with sufficient data. These comprehensive empirical results provide valuable insights for future research in domain generalization. Our code is available at https://github.com/s-enmt/PseudoDomainBed.
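The core idea of PMDG, replicating a single source dataset into several stylized pseudo-domains so that domain-label-aware MDG algorithms become applicable, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: `make_pseudo_domains`, the toy samples, and the stand-in "style" functions are all hypothetical; in the actual framework each stylization would be a style-transfer model such as AdaIN or a diversity-aware augmentation.

```python
# Hypothetical sketch of PMDG's pseudo-domain construction.
# Names and transforms are illustrative assumptions, not the paper's API.

def make_pseudo_domains(source_samples, style_fns):
    """Expand a single-source dataset into len(style_fns) pseudo-domains.

    source_samples: list of (x, y) pairs from the lone source domain.
    style_fns: one stylization/augmentation callable per pseudo-domain
               (stand-ins for e.g. AdaIN style transfer).
    Returns (x_styled, y, domain_id) triples, i.e. a synthetic
    multi-domain dataset that existing MDG algorithms can consume.
    """
    pseudo = []
    for domain_id, stylize in enumerate(style_fns):
        for x, y in source_samples:
            pseudo.append((stylize(x), y, domain_id))
    return pseudo

# Toy usage: "images" are scalars, "styles" are simple transforms.
source = [(1.0, "cat"), (2.0, "dog")]
styles = [
    lambda x: x,        # pseudo-domain 0: original style (identity)
    lambda x: x * 10,   # pseudo-domain 1: stand-in for one style transfer
    lambda x: -x,       # pseudo-domain 2: another synthetic style
]
dataset = make_pseudo_domains(source, styles)
```

Semantic labels (`y`) are preserved while only the style of `x` varies across pseudo-domains, which is the property that lets MDG algorithms treat the copies as distinct source domains.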