When and how can inexact generative models still sample from the data manifold?

📅 2025-08-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper investigates why generated samples remain confined to the data manifold’s support—even under errors in the score function or drift field—despite the absence of explicit manifold constraints. Method: We identify alignment between the Lyapunov vector field and the tangent space of the data manifold as the key dynamical mechanism ensuring robustness, derive sufficient conditions for its realization, and propose a dynamics-based method to automatically estimate the tangent bundle. Our unified framework integrates stochastic and deterministic generative processes via dynamical systems theory, probability flow analysis, and finite-time linear perturbation theory. Contribution/Results: We theoretically characterize the intrinsic reason score-based and flow-matching models preserve manifold confinement under perturbations. The analysis applies broadly—both when the data distribution strictly adheres to a low-dimensional manifold and when it does not—thereby establishing a new paradigm for stability and interpretability in generative modeling.

📝 Abstract
A curious phenomenon observed in some dynamical generative models is the following: despite learning errors in the score function or the drift vector field, the generated samples appear to shift along the support of the data distribution but not away from it. In this work, we investigate this phenomenon of "robustness of the support" by taking a dynamical systems approach to the generating stochastic/deterministic process. Our perturbation analysis of the probability flow reveals that, for a wide class of generative models, infinitesimal learning errors cause the predicted density to differ from the target density only on the data manifold. Further, what is the dynamical mechanism that leads to the robustness of the support? We show that the alignment of the top Lyapunov vectors (the most sensitive infinitesimal perturbation directions) with the tangent spaces along the boundary of the data manifold leads to robustness, and we prove a sufficient condition on the dynamics of the generating process to achieve this alignment. Moreover, the alignment condition is efficient to compute and, in practice, for robust generative models, automatically yields accurate estimates of the tangent bundle of the data manifold. Using a finite-time linear perturbation analysis on sample paths as well as probability flows, our work complements and extends existing works on obtaining theoretical guarantees for generative models from the stochastic analysis, statistical learning, and uncertainty quantification points of view. Our results apply across different dynamical generative models, such as conditional flow-matching and score-based generative models, and for different target distributions that may or may not satisfy the manifold hypothesis.
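To make the alignment mechanism concrete, here is a minimal toy sketch (not the paper's code; the circular-attractor flow, step sizes, and all function names are illustrative assumptions). It computes a finite-time top Lyapunov vector for a 2D flow whose attractor is the unit circle, then checks that this vector aligns with the tangent direction of the "data manifold" (the circle):

```python
import numpy as np

# Toy flow contracting onto the unit circle while rotating along it
# (a hypothetical stand-in for a generative probability flow, not the
# paper's model).
OMEGA = 1.0

def f(x):
    r2 = x[0]**2 + x[1]**2
    return np.array([
        (1.0 - r2) * x[0] - OMEGA * x[1],
        (1.0 - r2) * x[1] + OMEGA * x[0],
    ])

def jac(x):
    # Analytic Jacobian of f.
    x0, x1 = x
    r2 = x0**2 + x1**2
    return np.array([
        [1.0 - r2 - 2*x0*x0, -2*x0*x1 - OMEGA],
        [-2*x1*x0 + OMEGA,    1.0 - r2 - 2*x1*x1],
    ])

def top_lyapunov_vector(x0, dt=1e-3, n_steps=20000):
    """Finite-time top Lyapunov vector: evolve a tangent vector under
    the linearized dynamics with repeated normalization (power
    iteration on the Jacobian cocycle)."""
    x = np.array(x0, dtype=float)
    v = np.array([1.0, 0.0])
    for _ in range(n_steps):
        v = v + dt * jac(x) @ v   # linearized (tangent) dynamics
        x = x + dt * f(x)         # base trajectory (Euler step)
        v /= np.linalg.norm(v)
    return x, v

x, v = top_lyapunov_vector([0.3, 0.1])
# On the circle, the tangent direction at x is proportional to (-x1, x0).
tangent = np.array([-x[1], x[0]])
tangent /= np.linalg.norm(tangent)
alignment = abs(v @ tangent)
print(f"|x| = {np.linalg.norm(x):.4f}, alignment = {alignment:.4f}")
```

Here the radial direction is contracting and the angular direction neutral, so the top Lyapunov vector converges to the tangent of the circle, illustrating how, per the abstract, computing such vectors can automatically estimate the tangent bundle.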
Problem

Research questions and friction points this paper is trying to address.

Investigating robustness of data manifold support in generative models
Analyzing infinitesimal learning errors' impact on probability flow
Identifying dynamical mechanisms ensuring generated samples remain on manifold
Innovation

Methods, ideas, or system contributions that make the work stand out.

Perturbation analysis of probability flow
Alignment of top Lyapunov vectors
Finite-time linear perturbation analysis
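The robustness claim itself can also be illustrated numerically. The following sketch (again a toy assumption, not the paper's experiment) adds a small constant bias to the drift of the same circular-attractor flow and verifies that samples still land near the unit circle (support preserved) while being displaced along it:

```python
import numpy as np

EPS = 0.05  # hypothetical magnitude of the drift "learning error"

def drift(x, eps=0.0):
    # Contraction onto the unit circle plus rotation, with an O(eps)
    # constant bias modeling an imperfectly learned drift field.
    r2 = (x**2).sum(axis=-1, keepdims=True)
    base = (1.0 - r2) * x + np.stack([-x[..., 1], x[..., 0]], axis=-1)
    return base + eps * np.array([1.0, 0.0])

def integrate(x0, eps, dt=1e-3, n_steps=8000):
    x = x0.copy()
    for _ in range(n_steps):
        x = x + dt * drift(x, eps)
    return x

rng = np.random.default_rng(0)
x0 = rng.normal(size=(256, 2))
clean = integrate(x0, eps=0.0)
perturbed = integrate(x0, eps=EPS)

# Radii stay close to 1 (support is robust); samples nonetheless move
# relative to the unperturbed run (shift *along* the support).
r_err = np.abs(np.linalg.norm(perturbed, axis=1) - 1.0).max()
shift = np.linalg.norm(perturbed - clean, axis=1).mean()
print(f"max radial error: {r_err:.3f}, mean displacement: {shift:.3f}")
```

The radial error stays well below the perturbation size while the displacement along the circle does not, mirroring the paper's observation that learning errors shift samples along, not off, the data manifold.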
Nisha Chandramoorthy
University of Chicago
Dynamical systems, machine learning theory
Adriaan de Clercq
Department of Statistics, Committee on Computational and Applied Mathematics, The University of Chicago, Chicago, IL, 60637