DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

📅 2024-09-05
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
Diffusion models (DMs) require large-scale real data for training, incurring prohibitive data acquisition costs. To address this, we propose the first data-free knowledge distillation paradigm for DMs: leveraging a pre-trained DM as the teacher, we distill its generative capability into a student model of any architecture without accessing any original training data. Methodologically, we design a DKDM objective that enables distillation-based training without data, together with a dynamic iterative distillation method that efficiently extracts knowledge across denoising timesteps without running the teacher's full generative process for every training sample. Experiments demonstrate that, under zero-original-data conditions, our student models achieve competitive generative quality on standard metrics such as FID and, in some cases, match or even surpass baselines trained on the full dataset.

📝 Abstract
Diffusion models (DMs) have demonstrated exceptional generative capabilities across various domains, including image and video generation. A key factor contributing to their effectiveness is the high quantity and quality of data used during training. However, mainstream DMs now consume increasingly large amounts of data. For example, training a Stable Diffusion model requires billions of image-text pairs. This enormous data requirement poses significant challenges for training large DMs due to high data acquisition costs and storage expenses. To alleviate this data burden, we propose a novel scenario: using existing DMs as data sources to train new DMs with any architecture. We refer to this scenario as Data-Free Knowledge Distillation for Diffusion Models (DKDM), where the generative ability of DMs is transferred to new ones in a data-free manner. To tackle this challenge, we make two main contributions. First, we introduce a DKDM objective that enables the training of new DMs via distillation, without requiring access to the data. Second, we develop a dynamic iterative distillation method that efficiently extracts time-domain knowledge from existing DMs, enabling direct retrieval of training data without the need for a prolonged generative process. To the best of our knowledge, we are the first to explore this scenario. Experimental results demonstrate that our data-free approach not only achieves competitive generative performance but also, in some instances, outperforms models trained with the entire dataset.
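The abstract describes a DKDM objective that trains a student DM by matching the teacher's denoising behaviour instead of fitting real data. Below is a minimal sketch of how such an objective could look, assuming an epsilon-prediction (DDPM-style) teacher and student; the function name, the MSE loss, and the use of teacher-generated noisy inputs are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a DKDM-style distillation objective (assumptions noted above).
import torch
import torch.nn.functional as F

def dkdm_objective(student, teacher, x_t, t):
    """Distill the teacher's denoising behaviour into the student without real data.

    student, teacher: networks mapping (x_t, t) -> predicted noise.
    x_t: a batch of noisy samples produced by the teacher's own reverse process
         (no original training images are involved).
    t:   the corresponding diffusion timesteps.
    """
    with torch.no_grad():
        target = teacher(x_t, t)      # teacher's denoising prediction as the target
    pred = student(x_t, t)            # student's prediction on the same noisy input
    return F.mse_loss(pred, target)   # knowledge-distillation loss
```

The key point is that x_t is drawn from the teacher's own generative process, so the original dataset never enters the training loop.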
Problem

Research questions and friction points this paper is trying to address.

High data acquisition and storage costs for training large diffusion models
How to transfer the generative ability of a pre-trained DM to a new model without access to the original training data
How to distill that knowledge efficiently into student models with arbitrary architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-free knowledge distillation for diffusion models
Dynamic iterative distillation method (see the sketch after this list)
Training new models with any architecture, without the original data
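The dynamic iterative distillation listed above aims to supply training inputs at varied denoising timesteps without running the teacher's full generative process for every batch. The sketch below shows one plausible way to realize that idea, assuming a discrete-time teacher; all names (DistillBuffer, teacher_reverse_step, T) are hypothetical and not the authors' API.

```python
# Hedged sketch: keep a rolling batch of partially denoised samples, advance each
# by one teacher reverse step per training iteration, and distill on the fly,
# so (x_t, t) pairs are obtained as a by-product of the ongoing denoising.
import torch

T = 1000  # number of diffusion steps (assumed)

class DistillBuffer:
    def __init__(self, batch_size, shape):
        # start every trajectory from pure noise at the final timestep
        self.x = torch.randn(batch_size, *shape)
        self.t = torch.full((batch_size,), T - 1, dtype=torch.long)

    def step(self, teacher_reverse_step):
        """Return the current (x_t, t) pairs for distillation, then advance the
        buffer one reverse step with the teacher (a user-supplied function)."""
        x_t, t = self.x.clone(), self.t.clone()
        with torch.no_grad():
            self.x = teacher_reverse_step(self.x, self.t)  # one denoising step
        self.t -= 1
        done = self.t < 0                                   # finished trajectories
        if done.any():                                      # restart them from noise
            self.x[done] = torch.randn_like(self.x[done])
            self.t[done] = T - 1
        return x_t, t
```

In use, each training iteration would call buffer.step(...) to obtain (x_t, t), feed them to a distillation objective like the one sketched earlier, and reuse the advanced samples on the next iteration instead of regenerating them from scratch.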
🔎 Similar Papers
No similar papers found.
Qianlong Xiang
Harbin Institute of Technology (Shenzhen)
Miao Zhang
Harbin Institute of Technology (Shenzhen)
Yuzhang Shang
Assistant Professor at University of Central Florida
Efficient/Scalable AI, Deep Learning, ML, CV, NLP
Jianlong Wu
Professor, Harbin Institute of Technology (Shenzhen)
Computer Vision, Multimodal Learning
Yan Yan
Illinois Institute of Technology
Liqiang Nie
Harbin Institute of Technology (Shenzhen)