🤖 AI Summary
This work studies score function estimation for score-based generative models (SGMs) given only $n$ i.i.d. $d$-dimensional samples from an $\alpha$-sub-Gaussian true distribution $P_0$. Using deep ReLU neural networks and optimizing under both the score matching loss and the mean squared error, we establish, *for the first time*, near-optimal generalization rates *without* requiring strong assumptions such as Lipschitz continuity of the score or a lower bound on the data density. Specifically, the mean squared error achieves $\tilde{O}(n^{-1})$, while the score matching loss converges at rate $\tilde{O}(n^{-1} t_0^{-d/2})$ for time steps $t_0 \gtrsim \alpha^2 n^{-2/d} \log n$. Our theory justifies early stopping as the mechanism for attaining nearly minimax-optimal rates. Moreover, by characterizing model capacity via Sobolev and Besov space regularity, we reveal an intrinsic interplay between network architecture and the smoothness of the underlying distribution.
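To make the stated threshold concrete, here is a minimal sketch, assuming hypothetical values of $n$, $d$, and $\alpha$ and an unspecified constant `c` hidden in the $\gtrsim$, that computes the early-stopping cutoff $t_0$ and the corresponding rates (up to logarithmic factors):

```python
import math

def earliest_time(n: int, d: int, alpha: float, c: float = 1.0) -> float:
    """Smallest admissible time step t_0 ~ c * alpha^2 * n^{-2/d} * log n.
    The constant c is not specified in the statement; c = 1.0 is illustrative."""
    return c * alpha**2 * n ** (-2.0 / d) * math.log(n)

def score_matching_rate(n: int, t0: float, d: int) -> float:
    """Score matching loss rate ~ n^{-1} * t_0^{-d/2}, up to log factors."""
    return (1.0 / n) * t0 ** (-d / 2.0)

n, d, alpha = 10_000, 8, 1.0          # hypothetical sample size, dimension, sub-Gaussian parameter
t0 = earliest_time(n, d, alpha)
print(f"t0 ≈ {t0:.4g}")
print(f"MSE rate ≈ {1.0 / n:.1e}  (i.e., O(n^-1) up to logs)")
print(f"score matching rate ≈ {score_matching_rate(n, t0, d):.3g}")
```

As the snippet makes visible, stopping earlier (smaller $t_0$) inflates the $t_0^{-d/2}$ factor, which is the trade-off the early-stopping strategy balances.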
Abstract
This paper studies the approximation and generalization abilities of score-based neural network generative models (SGMs) in estimating an unknown distribution $P_0$ from $n$ i.i.d. observations in $d$ dimensions. Assuming merely that $P_0$ is $\alpha$-sub-Gaussian, we prove that for any time step $t \in [t_0, n^{O(1)}]$, where $t_0 \geq O(\alpha^2 n^{-2/d}\log n)$, there exists a deep ReLU neural network with width $\leq O(\log^3 n)$ and depth $\leq O(n^{3/d}\log_2 n)$ that approximates the scores with $\tilde{O}(n^{-1})$ mean squared error and achieves a nearly optimal rate of $\tilde{O}(n^{-1}t_0^{-d/2})$ for score estimation, as measured by the score matching loss. Our framework is universal and can be used to establish convergence rates for SGMs under milder assumptions than in previous work. For example, assuming further that the target density $p_0$ lies in a Sobolev or Besov class, we demonstrate that, with an appropriate early stopping strategy, neural network-based SGMs attain nearly minimax convergence rates up to logarithmic factors. Our analysis removes several crucial assumptions required by earlier results, such as Lipschitz continuity of the score function or a strictly positive lower bound on the target density.
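For intuition about the objects in the abstract, the following is a minimal, purely illustrative sketch of denoising score matching with a deep ReLU network under the Ornstein–Uhlenbeck (variance-preserving) forward process commonly assumed in SGM analyses. The width and depth below are toy values chosen so the snippet runs, not the $O(\log^3 n)$ / $O(n^{3/d}\log_2 n)$ scalings of the existence result, and the Gaussian `x0` is a stand-in for real samples from $P_0$:

```python
import math
import torch
import torch.nn as nn

def relu_mlp(d: int, width: int, depth: int) -> nn.Sequential:
    """Deep ReLU network mapping (x, t) in R^{d+1} to a score estimate in R^d."""
    layers: list[nn.Module] = [nn.Linear(d + 1, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, d))
    return nn.Sequential(*layers)

def dsm_loss(net: nn.Module, x0: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Denoising score matching for the OU forward process
    X_t = e^{-t} X_0 + sqrt(1 - e^{-2t}) Z, Z ~ N(0, I):
    regress the network on the conditional score -Z / sqrt(1 - e^{-2t})."""
    z = torch.randn_like(x0)
    sigma = torch.sqrt(1.0 - torch.exp(-2.0 * t))       # shape (batch, 1)
    xt = torch.exp(-t) * x0 + sigma * z                 # noised sample at time t
    pred = net(torch.cat([xt, t], dim=1))
    target = -z / sigma                                 # conditional score of X_t | X_0
    return ((pred - target) ** 2).sum(dim=1).mean()

n, d, alpha = 10_000, 8, 1.0                            # hypothetical values
t0 = alpha**2 * n ** (-2.0 / d) * math.log(n)           # earliest admissible time step
net = relu_mlp(d, width=64, depth=6)                    # toy size, not the paper's scalings
x0 = torch.randn(256, d)                                # stand-in for samples from P_0
t = t0 + torch.rand(256, 1) * (2.0 - t0)                # train only on t >= t0
print(dsm_loss(net, x0, t).item())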