🤖 AI Summary
This work addresses topological distortion and limited diversity in the synthetic samples used for data-free knowledge distillation. The teacher model is kept frozen, and class-specific PCA subspaces are estimated from as few as two real samples per class. These subspaces act as structural priors in the loss of a class-conditional generator, so that training jointly enforces semantic alignment with the teacher and consistency with each class manifold. The key contribution is the use of low-dimensional manifold priors, extracted via PCA, in the generative stage of data-free distillation, with the aim of preserving intra-class topological consistency and inter-class discriminability. Experiments on MNIST indicate that even this minimal structural prior is enough to generate diverse synthetic data that can bootstrap useful training of a student model.
📝 Abstract
We introduce C2G-KD, a data-free knowledge distillation framework where a class-conditional generator is trained to produce synthetic samples guided by a frozen teacher model and geometric constraints derived from PCA. The generator never observes real training data but instead learns to activate the teacher's output through a combination of semantic and structural losses. By constraining generated samples to lie within class-specific PCA subspaces estimated from as few as two real examples per class, we preserve topological consistency and diversity. Experiments on MNIST show that even minimal class structure is sufficient to bootstrap useful synthetic training pipelines.
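The abstract does not give the exact form of the semantic and structural losses, so the following is only a minimal NumPy sketch of one plausible reading. It assumes the structural term is the squared distance from a generated sample to the affine PCA subspace of its class, and the semantic term is cross-entropy between the frozen teacher's logits and the target class. The function names (`class_subspace`, `structural_loss`, `semantic_loss`) and the weighting factor `lam` are hypothetical and not taken from the paper.

```python
import numpy as np

def class_subspace(samples, tol=1e-8):
    """Estimate a PCA subspace from a few flattened real samples of one class.

    samples: array of shape (n, d) with n >= 2.
    Returns the class mean (d,) and an orthonormal basis (k, d), k <= n - 1.
    """
    X = np.asarray(samples, dtype=np.float64)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    # Drop directions with (numerically) zero variance; two samples give k = 1.
    return mean, vt[s > tol]

def structural_loss(x, mean, basis):
    """Mean squared distance from each row of x (batch, d) to mean + span(basis)."""
    centered = np.asarray(x, dtype=np.float64) - mean
    projected = centered @ basis.T @ basis
    residual = centered - projected
    return float(np.mean(np.sum(residual ** 2, axis=1)))

def semantic_loss(teacher_logits, labels):
    """Cross-entropy between teacher logits (batch, C) and integer labels (batch,)."""
    logits = np.asarray(teacher_logits, dtype=np.float64)
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def total_loss(x, teacher_logits, labels, mean, basis, lam=1.0):
    """Assumed combination: semantic term plus lam times the structural term."""
    return semantic_loss(teacher_logits, labels) + lam * structural_loss(x, mean, basis)
```

In an actual training loop these terms would be written in an automatic-differentiation framework so that gradients reach the generator through the frozen teacher. The NumPy version only illustrates the geometry: with two real samples per class, the subspace is the line through them, and the structural term penalizes generated samples that drift away from that line.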