Canonical Latent Representations in Conditional Diffusion Models

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conditional diffusion models (CDMs) suffer from entanglement between class-discriminative features and irrelevant background cues, leading to poor robustness and limited interpretability of latent representations. To address this, we propose Canonical Latent Representations (CLAReps), the first method to explicitly disentangle and refine class-semantic features in the CDM latent space while suppressing spurious background signals. Building upon CLAReps, we introduce CaDistill—a lightweight diffusion feature distillation framework that achieves effective knowledge transfer using only ~10% of the original training data. CaDistill integrates latent-space optimization, gradient-guided latent code search, and adversarial robustness evaluation. Experiments on CIFAR-10/100 demonstrate that student models trained with CaDistill achieve significant gains in both standard and adversarial accuracy, markedly reduce reliance on non-causal background cues, and validate CLAReps’ compactness, interpretability, and cross-task transferability.

📝 Abstract
Conditional diffusion models (CDMs) have shown impressive performance across a range of generative tasks. Their ability to model the full data distribution has opened new avenues for analysis-by-synthesis in downstream discriminative learning. However, this same modeling capacity causes CDMs to entangle the class-defining features with irrelevant context, posing challenges to extracting robust and interpretable representations. To this end, we identify Canonical LAtent Representations (CLAReps), latent codes whose internal CDM features preserve essential categorical information while discarding non-discriminative signals. When decoded, CLAReps produce representative samples for each class, offering an interpretable and compact summary of the core class semantics with minimal irrelevant details. Exploiting CLAReps, we develop a novel diffusion-based feature-distillation paradigm, CaDistill. While the student has full access to the training set, the CDM as teacher transfers core class knowledge only via CLAReps, which amount to merely 10% of the training data in size. After training, the student achieves strong adversarial robustness and generalization ability, focusing more on the class signals instead of spurious background cues. Our findings suggest that CDMs can serve not just as image generators but also as compact, interpretable teachers that can drive robust representation learning.
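The abstract does not spell out the CaDistill training objective. As a rough illustration only, a generic feature-distillation loss of the kind described (a standard cross-entropy term on the labeled data, plus a feature-matching term pulling student features toward the teacher's features on CLARep samples) could be sketched as below. The function name `cadistill_loss`, the weight `lam`, and the use of mean-squared error for feature matching are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cadistill_loss(student_logits, labels, student_feats, teacher_feats, lam=1.0):
    """Hypothetical combined objective: cross-entropy on ground-truth labels
    plus an MSE feature-matching term toward the teacher's (CLARep-derived)
    features. `lam` balances the two terms."""
    probs = softmax(student_logits)
    n = labels.shape[0]
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    fm = ((student_feats - teacher_feats) ** 2).mean()
    return ce + lam * fm
```

In this sketch, a well-calibrated student whose features already match the teacher's incurs near-zero loss, while mismatched features are penalized in proportion to `lam`; the actual CaDistill objective may differ.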
Problem

Research questions and friction points this paper is trying to address.

CDMs entangle class features with irrelevant context
Need robust interpretable latent representations in CDMs
Distill core class knowledge efficiently from CDMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Canonical LAtent Representations (CLAReps) for discriminative features
Diffusion-based feature-distillation paradigm (CaDistill)
Compact interpretable teacher for robust learning
Yitao Xu
PhD student, EPFL
Artificial Intelligence · Computer Vision · Machine Learning

Tong Zhang
Image and Visual Representation Lab, École polytechnique fédérale de Lausanne, Lausanne, Switzerland

Ehsan Pajouheshgar
PhD Student, EPFL
Computer Vision · Computer Graphics · Artificial Life · Self-Organizing Systems

Sabine Süsstrunk
Image and Visual Representation Lab, École polytechnique fédérale de Lausanne, Lausanne, Switzerland