Restoring Initial Noise Sensitivity in Text-to-Image Distillation via Geometric Alignment

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing text-to-image (T2I) distillation methods: in optimizing for efficiency and fidelity, they often sacrifice sensitivity to initial noise, which undermines downstream control tasks that optimize or manipulate that noise. To remedy this, the authors propose Geometry-Aware Distillation (GAD), a framework that adds alignment of local geometric structure to the distillation objective. GAD aligns the local functional behavior of the teacher and student models by matching their Jacobian-vector products with respect to input noise, so that a few-step student distilled from a multi-step trajectory responds to noise perturbations the way the teacher does. According to the authors' experiments across multiple T2I paradigms and noise-driven control tasks, this restores the student model’s sensitivity to initial noise and improves generation diversity while maintaining high visual fidelity.

📝 Abstract

Generative distillation significantly accelerates text-to-image (T2I) generation by compressing multi-step trajectories into few-step student models while preserving perceptual quality. However, existing methods primarily optimize efficiency and output fidelity, often neglecting critical properties of the original trajectory. In this work, we identify a key missing property: sensitivity to initial noise, whose degradation impairs downstream control methods relying on noise-based optimization and manipulation. We trace this issue to standard distillation objectives that enforce pointwise output alignment, inadvertently flattening the input-output landscape and suppressing the teacher's local geometric structure. To address this, we propose Geometry-Aware Distillation (GAD), a sensitivity-preserving framework that aligns the local functional behavior of teacher and student models. Specifically, GAD matches Jacobian-vector products with respect to input noise, enabling the student to reproduce the teacher's differential response to perturbations. Extensive experiments across multiple T2I paradigms and noise-driven control tasks demonstrate that GAD significantly restores sensitivity and improves diversity while maintaining high visual fidelity. Code is available at https://github.com/Hannah1102/GAD.
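The contrast the abstract draws between pointwise output alignment and Jacobian-vector product (JVP) matching can be illustrated with a toy sketch. This is not the authors' implementation: `teacher`, `flat_student`, `jvp_fd`, and `sensitivity_loss` are hypothetical names, the models are two-dimensional stand-ins for real generators, and the JVP is approximated with central finite differences rather than the automatic differentiation a real training loop would likely use.

```python
import math
import random

def jvp_fd(f, z, v, eps=1e-5):
    """Central finite-difference approximation of J_f(z) @ v (assumed stand-in for autodiff)."""
    z_plus = [zi + eps * vi for zi, vi in zip(z, v)]
    z_minus = [zi - eps * vi for zi, vi in zip(z, v)]
    return [(a - b) / (2 * eps) for a, b in zip(f(z_plus), f(z_minus))]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def teacher(z):
    # Hypothetical toy "teacher": its output changes with the input noise z.
    return [math.sin(z[0]) + z[1], z[0] * z[1]]

def flat_student(z):
    # Hypothetical toy "student" that ignores its noise input entirely.
    return [0.0, 0.0]

def sensitivity_loss(student, teacher_fn, z, num_probes=8, seed=0):
    """Mean squared mismatch between student and teacher JVPs over random probe directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        v = [rng.gauss(0.0, 1.0) for _ in z]
        total += sq_dist(jvp_fd(student, z, v), jvp_fd(teacher_fn, z, v))
    return total / num_probes

z0 = [0.0, 0.0]
# The outputs agree exactly at z0, so a pointwise loss sees no error here.
pointwise = sq_dist(flat_student(z0), teacher(z0))
# The JVP term still detects that the student does not react to perturbations of z0.
jvp_gap = sensitivity_loss(flat_student, teacher, z0)
# Sanity check: a model compared against itself has zero JVP mismatch.
self_gap = sensitivity_loss(teacher, teacher, z0)
```

In this toy setting the flat student matches the teacher's output at `z0` yet has a nonzero JVP mismatch, which is the kind of lost sensitivity a pointwise objective cannot penalize. How GAD weights such a term against its other losses, and which probe directions it uses, is not specified in the abstract.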

Problem

Research questions and friction points this paper is trying to address.

noise sensitivity

text-to-image distillation

generative distillation

initial noise

downstream control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-Aware Distillation

Noise Sensitivity

Text-to-Image Generation