A Unified View of Drifting and Score-Based Models

📅 2026-03-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work establishes a rigorous theoretical connection between drifting models and score-based generative models through the lens of score matching under kernel-smoothed distributions. It demonstrates, for the first time, that the drift field induced by a Gaussian kernel equals the difference between the scores of the smoothed data distribution and the smoothed model distribution, a result extended to general radial kernels. By combining Tweedie's formula, kernel density estimation, and high-dimensional probabilistic analysis, the study further reveals an intrinsic link to the Distribution Matching Distillation (DMD) framework. Moreover, it provides sharp error bounds for the Laplace kernel in low-temperature and high-dimensional regimes, proving that it accurately approximates the score-matching objective under these challenging conditions.
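To make the Gaussian-kernel identity concrete, the following sketch restates it via Tweedie's formula. The notation is assumed for illustration: p and q denote the data and model distributions, p_σ and q_σ their Gaussian-smoothed versions at noise level σ, and the σ² scaling of the mean-shift field V_σ is our reading of the identity up to a constant, not a claim from the paper.

```latex
% Tweedie's formula: the posterior mean under Gaussian smoothing is the
% noisy point x shifted by the score of the smoothed density.
\mathbb{E}_{p}[x_0 \mid x] = x + \sigma^2 \, \nabla_x \log p_\sigma(x)
% Subtracting the same identity written for the model distribution q
% gives a mean-shift field equal to the score difference (up to \sigma^2):
V_\sigma(x) := \mathbb{E}_{p}[x_0 \mid x] - \mathbb{E}_{q}[x_0 \mid x]
  = \sigma^2 \bigl( \nabla_x \log p_\sigma(x) - \nabla_x \log q_\sigma(x) \bigr)
```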

📝 Abstract
Drifting models train one-step generators by optimizing a mean-shift discrepancy induced by a kernel between the data and model distributions, with Laplace kernels used by default in practice. At each point, this discrepancy compares the kernel-weighted displacement toward nearby data samples with the corresponding displacement toward nearby model samples, yielding a transport direction for generated samples. In this paper, we make its relationship to the score-matching principle behind diffusion models precise by showing that drifting admits a score-based formulation on kernel-smoothed distributions. For Gaussian kernels, the population mean-shift field coincides with the score difference between the Gaussian-smoothed data and model distributions. This identity follows from Tweedie's formula, which links the score of a Gaussian-smoothed density to the corresponding conditional mean, and implies that Gaussian-kernel drifting is exactly a score-matching-style objective on smoothed distributions. It also clarifies the connection to Distribution Matching Distillation (DMD): both methods use score-mismatch transport directions, but drifting realizes the score signal nonparametrically from kernel neighborhoods, whereas DMD uses a pretrained diffusion teacher. Beyond Gaussians, we derive an exact decomposition for general radial kernels, and for the Laplace kernel we prove rigorous error bounds showing that drifting remains an accurate proxy for score matching in low-temperature and high-dimensional regimes.
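As a numerical illustration of the mean-shift discrepancy described in the abstract, here is a minimal Python sketch under our own assumptions: the discrepancy at a point x is taken to be the kernel-weighted mean displacement toward nearby data samples minus the corresponding displacement toward nearby model samples, using a Laplace kernel as the abstract's stated default. The names (`laplace_kernel`, `mean_shift_drift`, `bandwidth`) are illustrative, not from the paper.

```python
import numpy as np

def laplace_kernel(x, samples, bandwidth):
    """Laplace kernel k(x, y) = exp(-||x - y|| / h), vectorized over samples."""
    return np.exp(-np.linalg.norm(samples - x, axis=-1) / bandwidth)

def mean_shift_drift(x, data_samples, model_samples, bandwidth=1.0):
    """Empirical mean-shift discrepancy at x (illustrative sketch):
    kernel-weighted mean displacement toward data samples minus the
    corresponding displacement toward model samples."""
    def weighted_displacement(samples):
        w = laplace_kernel(x, samples, bandwidth)        # (n,)
        disp = samples - x                               # (n, d)
        return (w[:, None] * disp).sum(axis=0) / (w.sum() + 1e-12)
    return weighted_displacement(data_samples) - weighted_displacement(model_samples)

# Toy usage: data from N(2, I), model from N(0, I). The drift evaluated
# at the origin should point toward the data mean (positive coordinates),
# i.e., it transports a generated sample toward the data distribution.
rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=(512, 2))
model = rng.normal(0.0, 1.0, size=(512, 2))
print(mean_shift_drift(np.zeros(2), data, model, bandwidth=1.0))
```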
Problem

Research questions and friction points this paper is trying to address.

drifting models
score-based models
kernel smoothing
score matching
mean-shift discrepancy
Innovation

Methods, ideas, or system contributions that make the work stand out.

drifting models
score-based generative models
kernel smoothing
Tweedie's formula
distribution matching