🤖 AI Summary
This work addresses the vulnerability of dataset distillation to backdoor attacks, establishing, for the first time, theoretical connections between distillation and kernel methods and thereby exposing defense blind spots inherent in the kernel-induced feature mapping. Leveraging this insight, we propose two optimization-driven trigger generation strategies: (1) exploiting low-rank sensitivity in the kernel-induced feature space to construct stealthy trigger patterns, and (2) applying kernel-gradient regularization to strengthen resilience against detection and purification mechanisms such as neuron activation analysis and spectral signature detection. Experiments demonstrate that our attack achieves success rates above 92% across mainstream distillation algorithms, including DC and KIP, while evading state-of-the-art detection and purification techniques. These results validate the robustness and stealthiness of our theoretically grounded approach and highlight critical security implications for kernel-based distillation frameworks.
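To make the attack pipeline concrete, below is a minimal sketch (not the authors' released code) of optimization-based trigger generation against a kernel ridge regression surrogate of the model trained on the distilled set. An RBF kernel stands in for the neural tangent kernel used by methods such as KIP, the `stealth` term is an illustrative stand-in for the paper's kernel-gradient regularization, and all shapes, names, and hyperparameters are assumptions for illustration.

```python
import torch

def rbf_kernel(a, b, gamma=0.1):
    """RBF Gram matrix between rows of a (n, d) and b (m, d)."""
    return torch.exp(-gamma * torch.cdist(a, b) ** 2)

torch.manual_seed(0)
n_support, n_poison, d = 32, 16, 64
X_s = torch.randn(n_support, d)        # distilled (support) set, fixed here
y_s = torch.randn(n_support, 1)        # its (soft) labels
X_p = torch.randn(n_poison, d)         # clean inputs the attacker will trigger
y_target = torch.ones(n_poison, 1)     # attacker-chosen target output

# Kernel ridge regression surrogate of the model trained on the distilled set
lam = 1e-3
K_ss = rbf_kernel(X_s, X_s) + lam * torch.eye(n_support)
alpha = torch.linalg.solve(K_ss, y_s)  # fitted KRR coefficients

delta = torch.zeros(1, d, requires_grad=True)  # shared additive trigger
opt = torch.optim.Adam([delta], lr=1e-2)
reg_weight = 0.1

for step in range(200):
    opt.zero_grad()
    X_trig = X_p + delta                            # apply the trigger
    pred = rbf_kernel(X_trig, X_s) @ alpha          # surrogate's prediction
    attack_loss = ((pred - y_target) ** 2).mean()   # push outputs to target
    # Illustrative stealth term: penalize how much the trigger shifts the
    # kernel features, a stand-in for kernel-gradient regularization
    stealth = ((rbf_kernel(X_trig, X_s) - rbf_kernel(X_p, X_s)) ** 2).mean()
    loss = attack_loss + reg_weight * stealth
    loss.backward()
    opt.step()
```

Solving for the surrogate's coefficients once outside the loop keeps each trigger update cheap; in a full attack the surrogate would be refit as the distilled set itself changes.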
📝 Abstract
Dataset distillation offers a potential means to enhance data efficiency in deep learning. Recent studies have shown its ability to counteract backdoor risks present in the original training samples. In this study, we delve into the theoretical aspects of backdoor attacks and dataset distillation based on kernel methods. We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation. Through a comprehensive set of analyses and experiments, we show that our optimization-based trigger design framework yields effective backdoor attacks on dataset distillation. Notably, datasets poisoned by our designed triggers prove resilient against conventional backdoor detection and mitigation methods, and our empirical results confirm that these triggers execute resilient backdoor attacks.
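For context on the kernel-method connection, kernel-based distillation approaches such as KIP fit the distilled set by minimizing a kernel ridge regression loss on the real training data; a standard form of that objective, written in notation of our own choosing rather than the paper's, is:

$$
\min_{X_s,\,y_s}\;\sum_{(x_t,\,y_t)\in\mathcal{D}}
\Bigl\| K(x_t, X_s)\bigl(K(X_s, X_s) + \lambda I\bigr)^{-1} y_s - y_t \Bigr\|_2^2
$$

where $\mathcal{D}$ is the real training set, $(X_s, y_s)$ the distilled set, $K$ the (neural tangent) kernel, and $\lambda$ a ridge parameter. Poisoning $\mathcal{D}$ pushes the optimized $(X_s, y_s)$ to encode the trigger-target association, which is the channel the trigger designs above exploit.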