🤖 AI Summary
Existing data-free knowledge distillation (DFKD) methods struggle to synthesize training samples that are both distribution-aligned and high-fidelity, limiting student model performance. To address this, we propose the first systematic integration of diffusion models into the DFKD framework, introducing a teacher-guided latent-space synthesis mechanism: a pre-trained diffusion model is steered by teacher network features to generate high-quality images directly in the latent space, augmented with latent-space CutMix to enhance sample diversity and distribution alignment. Our approach jointly optimizes generation quality, semantic consistency, and computational efficiency. Extensive experiments on standard benchmarks, including CIFAR-10, CIFAR-100, and ImageNet-1K, show that our method outperforms existing DFKD approaches and sets new state-of-the-art results. The source code is publicly available.
📝 Abstract
Recently, Data-Free Knowledge Distillation (DFKD) has garnered attention for its ability to transfer knowledge from a teacher neural network to a student neural network without requiring any access to the original training data. Although diffusion models are adept at synthesizing high-fidelity, photorealistic images across various domains, they cannot be easily applied to DFKD. To bridge that gap, this paper proposes a novel diffusion-based approach, DiffDFKD. Specifically, DiffDFKD involves targeted optimizations in two key areas. Firstly, DiffDFKD utilizes valuable information from the teacher model to guide the pre-trained diffusion model's data synthesis, generating datasets that mirror the training data distribution and effectively bridge domain gaps. Secondly, to reduce computational burdens, DiffDFKD introduces Latent CutMix Augmentation, an efficient technique that enhances the diversity of diffusion-generated images for DFKD while preserving the key attributes needed for effective knowledge transfer. Extensive experiments validate the efficacy of DiffDFKD, yielding state-of-the-art results that exceed existing DFKD approaches. We release our code at https://github.com/xhqi0109/DiffDFKD.
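The abstract does not spell out the exact formulation of Latent CutMix Augmentation, but the standard CutMix recipe applied to latent tensors rather than pixels gives the intuition: paste a random rectangular region of one latent into another, so one diffusion decode yields a mixed sample. The sketch below is a generic NumPy illustration under that assumption; the function name, `alpha` parameter, and tensor layout are hypothetical, not taken from the DiffDFKD code.

```python
import numpy as np

def latent_cutmix(z_a, z_b, alpha=1.0, rng=None):
    """Generic CutMix in latent space (illustrative, not the paper's code).

    z_a, z_b: latent tensors of shape (B, C, H, W).
    Returns the mixed latent and the adjusted mixing ratio lambda,
    i.e. the fraction of z_a that survives the paste.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, _, H, W = z_a.shape
    lam = rng.beta(alpha, alpha)  # sample the target mixing ratio
    # Box dimensions follow the usual CutMix area ratio (1 - lam).
    cut_h = int(H * np.sqrt(1.0 - lam))
    cut_w = int(W * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, H), rng.integers(0, W)  # box center
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    z_mix = z_a.copy()
    z_mix[:, :, y1:y2, x1:x2] = z_b[:, :, y1:y2, x1:x2]
    # Recompute lambda from the actual (clipped) box area.
    lam_adj = 1.0 - ((y2 - y1) * (x2 - x1)) / (H * W)
    return z_mix, lam_adj
```

Because the mixing happens before decoding, each latent pair costs only one pass through the diffusion model's decoder, which is the efficiency argument the abstract alludes to.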