🤖 AI Summary
Unsupervised domain adaptation (UDA) suffers from a critical lack of adversarial robustness, and vanilla adversarial training (VAT) fails under UDA because of unaddressed interactions between domain shift and adversarial perturbations.
Method: We establish the first generalization error upper bound that jointly accounts for domain shift and adversarial perturbations, revealing the theoretical root cause of this failure. Building on this analysis, we propose Unsupervised Robust Domain Adaptation (URDA) as a new paradigm and design Disentangled Adversarial Robustness Training (DART), a lightweight two-step algorithm that combines disentangled knowledge distillation with an instantaneous robustification post-training step, jointly optimizing transferability and robustness without modifying the backbone architecture.
Results: Evaluated on four standard benchmarks, DART substantially improves robustness under adversarial attacks without sacrificing clean accuracy, while maintaining state-of-the-art domain adaptation performance—validating the theoretical soundness and practical efficacy of URDA.
📝 Abstract
Unsupervised domain adaptation (UDA) aims to transfer knowledge from a label-rich source domain to an unlabeled target domain by addressing domain shifts. Most UDA approaches emphasize transferability but often overlook robustness against adversarial attacks. Although vanilla adversarial training (VAT) improves the robustness of deep neural networks, it has little effect under UDA. This paper focuses on answering three key questions: 1) Why does VAT, known for its defensive effectiveness, fail in the UDA paradigm? 2) What is the generalization bound theory under attacks, and how does it evolve from classical UDA theory? 3) How can we implement a robustification training procedure without complex modifications? Specifically, we explore and reveal the inherent entanglement challenge in the general UDA+VAT paradigm, and propose an unsupervised robust domain adaptation (URDA) paradigm. We further derive the generalization bound theory of the URDA paradigm so that it can resist both adversarial noise and domain shift. To the best of our knowledge, this is the first work to establish the URDA paradigm and its theory. We then introduce a simple, novel, yet effective URDA algorithm called Disentangled Adversarial Robustness Training (DART), a two-step training procedure that ensures both transferability and robustness. DART first pre-trains an arbitrary UDA model, and then applies an instantaneous robustification post-training step via disentangled distillation. Experiments on four benchmark datasets, with and without attacks, show that DART effectively enhances robustness while maintaining domain adaptability, validating the URDA paradigm and theory.
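To make the two-step structure concrete, here is a minimal, hedged sketch of the general pattern the abstract describes: a pre-trained model serves as a frozen teacher, and a student is then robustified post hoc by distilling the teacher's clean predictions while the student itself is trained on adversarial (FGSM-style) inputs. Everything here is illustrative—the linear model, FGSM attack, loss, and hyperparameters are stand-ins chosen for brevity, not DART's actual losses or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1 (stand-in): a "pre-trained UDA model". Here it is just a fixed
# linear classifier acting as the frozen teacher; in the paper this would
# be any UDA model already trained for transferability.
X = rng.normal(size=(200, 5))
w_teacher = rng.normal(size=5)
teacher_probs = sigmoid(X @ w_teacher)  # soft targets on clean inputs

def fgsm(X, w, targets, eps=0.05):
    """FGSM-style perturbation of inputs against a linear-logit student.

    For logistic loss, the input gradient is (p - target) * w per sample;
    we step along its sign (illustrative single-step attack).
    """
    p = sigmoid(X @ w)
    grad_x = (p - targets)[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)

# Step 2 (stand-in): post-training robustification by distillation --
# the student fits the teacher's *clean* predictions while being
# evaluated on adversarial versions of the inputs (a loose sketch of
# "instantaneous robustification via disentangled distillation").
w_student = np.zeros(5)
lr = 0.5
for _ in range(300):
    X_adv = fgsm(X, w_student, teacher_probs)
    p = sigmoid(X_adv @ w_student)
    # gradient of the cross-entropy distillation loss w.r.t. w_student
    grad_w = X_adv.T @ (p - teacher_probs) / len(X)
    w_student -= lr * grad_w

# Agreement with the teacher on clean data, despite adversarial training.
agree = np.mean((sigmoid(X @ w_student) > 0.5) == (teacher_probs > 0.5))
print(f"clean agreement with teacher: {agree:.2f}")
```

The point of the sketch is the decoupling: transferability is handled entirely in step 1, and step 2 only transfers the frozen teacher's behavior onto adversarially perturbed inputs, so robustness is added without retraining or modifying the original model.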