🤖 AI Summary
Existing dataset distillation methods neglect model robustness, so models trained on the distilled datasets are highly vulnerable to adversarial attacks. This paper introduces the novel task of *robust dataset distillation*, which aims to synthesize compact datasets that simultaneously achieve high generalization performance and strong adversarial robustness. To this end, we propose *Matching Adversarial Trajectories* (MAT), a method that implicitly integrates adversarial training into trajectory-matching-based distillation by aligning natural and adversarial optimization trajectories, enabling robustness to emerge directly from standard (non-adversarial) training. MAT jointly optimizes adversarial example generation and differentiable data synthesis. On CIFAR-10 and CIFAR-100, using only 0.1% of the original training samples, MAT-synthesized datasets improve the adversarial accuracy of naturally trained models by 12.3% while achieving state-of-the-art standard (clean) accuracy.
📝 Abstract
Dataset distillation synthesizes compact datasets that enable models to achieve performance comparable to training on the original large-scale datasets. However, existing distillation methods overlook model robustness, so models trained on the distilled data are vulnerable to adversarial attacks. To address this limitation, we introduce the task of "robust dataset distillation", a novel paradigm that embeds adversarial robustness into the synthetic datasets during the distillation process. We propose Matching Adversarial Trajectories (MAT), a method that integrates adversarial training into trajectory-based dataset distillation. MAT incorporates adversarial samples during trajectory generation to obtain robust training trajectories, which then guide the distillation process. Experiments show that models trained naturally (without adversarial training) on our distilled datasets achieve enhanced adversarial robustness while maintaining accuracy competitive with existing distillation methods. Our work highlights robust dataset distillation as a new and important research direction and provides a strong baseline for future work bridging the gap between efficient training and adversarial robustness.
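The two-stage pipeline described in the abstract can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the paper's implementation: it uses a linear regression model so gradients are analytic, FGSM-style adversarial training to record a robust "expert" trajectory, an MTT-style normalized matching loss over a short natural-training unroll on the synthetic data, and a finite-difference gradient in place of backpropagating through the unroll (a real implementation would use autodiff, e.g. PyTorch). All names and hyperparameters here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: linear model x·w ≈ y with squared loss.
d, n = 5, 200
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad_w(Xb, yb, w):
    """Gradient of 0.5 * mean((Xb @ w - yb)^2) w.r.t. w."""
    return Xb.T @ (Xb @ w - yb) / len(yb)

# --- Stage 1: record a *robust* expert trajectory via FGSM adversarial training.
eps, lr, T = 0.1, 0.05, 60
w = np.zeros(d)
trajectory = [w.copy()]
for _ in range(T):
    # Per-sample input gradient of the loss is (x·w - y) * w; take its sign (FGSM).
    resid = X @ w - y
    X_adv = X + eps * np.sign(np.outer(resid, w))
    w = w - lr * grad_w(X_adv, y, w)   # train on adversarial examples
    trajectory.append(w.copy())

# --- Stage 2: synthesize data whose *natural* training steps reproduce the
# robust trajectory (MTT-style normalized trajectory-matching loss).
n_syn, m = 10, 5
Xs = rng.normal(size=(n_syn, d))
ys = Xs @ w_true  # synthetic labels held fixed for simplicity

def match_loss(Xs_flat, t):
    Xs_ = Xs_flat.reshape(n_syn, d)
    w_start, w_target = trajectory[t], trajectory[t + m]
    w_stu = w_start.copy()
    for _ in range(m):  # natural (non-adversarial) steps on synthetic data
        w_stu = w_stu - lr * grad_w(Xs_, ys, w_stu)
    denom = np.sum((w_start - w_target) ** 2) + 1e-12
    return np.sum((w_stu - w_target) ** 2) / denom

def num_grad(f, x, h=1e-5):
    """Central differences; real MAT would backprop through the unroll."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

syn_lr = 0.05
for _ in range(100):
    t = int(rng.integers(0, T - m))      # random start point on the trajectory
    flat = Xs.ravel()
    flat = flat - syn_lr * num_grad(lambda z: match_loss(z, t), flat)
    Xs = flat.reshape(n_syn, d)

# Natural training on the distilled data should now track the robust endpoint.
w_fit = np.zeros(d)
for _ in range(T):
    w_fit = w_fit - lr * grad_w(Xs, ys, w_fit)
print(float(np.linalg.norm(w_fit - trajectory[-1])))
```

The key property this sketch mirrors is that robustness lives in the expert trajectory, not in the downstream training loop: stage 2 only ever takes natural gradient steps on the synthetic data.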