🤖 AI Summary
Existing data-free knowledge distillation (DFKD) methods struggle to synthesize training samples that are both distribution-aligned and high-fidelity, limiting student model performance. To address this, we propose the first systematic integration of diffusion models into the DFKD framework, introducing a teacher-guided latent-space synthesis mechanism: a pre-trained diffusion model is steered by teacher network features to generate high-quality images directly in the latent space, augmented with latent-space CutMix to enhance sample diversity and distribution alignment. Our approach jointly optimizes generation quality, semantic consistency, and computational efficiency. Extensive experiments on standard benchmarks, including CIFAR-10, CIFAR-100, and ImageNet-1K, show that our method outperforms existing DFKD approaches and sets new state-of-the-art results. The source code is publicly available.
📝 Abstract
Recently, Data-Free Knowledge Distillation (DFKD) has garnered attention for its ability to transfer knowledge from a teacher neural network to a student neural network without requiring any access to the original training data. Although diffusion models are adept at synthesizing high-fidelity, photorealistic images across various domains, they cannot be easily applied to DFKD. To bridge that gap, this paper proposes a novel diffusion-based approach, DiffDFKD. Specifically, DiffDFKD involves targeted optimizations in two key areas. Firstly, DiffDFKD utilizes valuable information from the teacher model to guide the pre-trained diffusion model's data synthesis, generating datasets that mirror the training data distribution and effectively bridge domain gaps. Secondly, to reduce computational burdens, DiffDFKD introduces Latent CutMix Augmentation, an efficient technique that enhances the diversity of diffusion-generated images for DFKD while preserving the key attributes needed for effective knowledge transfer. Extensive experiments validate the efficacy of DiffDFKD, yielding state-of-the-art results that exceed existing DFKD approaches. We release our code at https://github.com/xhqi0109/DiffDFKD.
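The abstract does not spell out the exact formulation of Latent CutMix Augmentation, but the standard CutMix recipe applied to latent tensors rather than pixels gives the intuition: paste a random rectangular region of one latent into another, so one diffusion decode yields a mixed sample. The sketch below is a generic NumPy illustration under that assumption; the function name, `alpha` parameter, and tensor layout are hypothetical, not taken from the DiffDFKD code.

```python
import numpy as np

def latent_cutmix(z_a, z_b, alpha=1.0, rng=None):
    """Generic CutMix in latent space (illustrative, not the paper's code).

    z_a, z_b: latent tensors of shape (B, C, H, W).
    Returns the mixed latent and the adjusted mixing ratio lambda,
    i.e. the fraction of z_a that survives the paste.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, _, H, W = z_a.shape
    lam = rng.beta(alpha, alpha)  # sample the target mixing ratio
    # Box dimensions follow the usual CutMix area ratio (1 - lam).
    cut_h = int(H * np.sqrt(1.0 - lam))
    cut_w = int(W * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, H), rng.integers(0, W)  # box center
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    z_mix = z_a.copy()
    z_mix[:, :, y1:y2, x1:x2] = z_b[:, :, y1:y2, x1:x2]
    # Recompute lambda from the actual (clipped) box area.
    lam_adj = 1.0 - ((y2 - y1) * (x2 - x1)) / (H * W)
    return z_mix, lam_adj
```

Because the mixing happens before decoding, each latent pair costs only one pass through the diffusion model's decoder, which is the efficiency argument the abstract alludes to.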