Efficient Diffusion Training through Parallelization with Truncated Karhunen–Loève Expansion

📅 2025-03-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models suffer from slow training convergence, primarily due to the complexity of modeling the Brownian-motion-driven forward process. This work introduces, for the first time, the Karhunen–Loève (KL) expansion into the diffusion forward process: a truncated KL series approximates Brownian motion, yielding a KL-based diffusion ordinary differential equation (ODE) with stochastic initial conditions. The formulation introduces no additional parameters and inherently enables highly parallelized training. To preserve architectural compatibility, we design a matched denoising loss that requires no modifications to the network architecture or sampling procedure. Experiments demonstrate a 2× speedup in training, significant FID reduction, and seamless plug-and-play integration into existing diffusion frameworks such as DDIM. The core innovation lies in a parameter-free, parallelizable, and theory-grounded reformulation of the forward process—rooted in functional analysis and stochastic process theory.

📝 Abstract
Diffusion denoising models have become a popular approach for image generation, but they often suffer from slow convergence during training. In this paper, we identify that this slow convergence is partly due to the complexity of the Brownian motion driving the forward-time process. To address this, we represent the Brownian motion using the Karhunen–Loève expansion, truncated to a limited number of eigenfunctions. We propose a novel ordinary differential equation with augmented random initial conditions, termed KL diffusion, as a new forward-time process for training and sampling. By developing an appropriate denoising loss function, we enable the integration of our KL diffusion into existing denoising-based models. Using the widely adopted DDIM framework as our baseline ensures a fair comparison, since our modifications affect only the forward process and loss function, leaving the network architecture and sampling methods unchanged. Our method significantly outperforms baseline diffusion models, reaching the baseline's best FID score twice as fast and ultimately achieving much lower FID scores. Notably, our approach allows for highly parallelized computation, requires no additional learnable parameters, and can be flexibly integrated into existing diffusion methods. The code will be made publicly available.
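The core construction can be illustrated with a short sketch. For standard Brownian motion on [0, T], the Karhunen–Loève eigenfunctions are sines, and truncating the series to K terms leaves all the randomness in K i.i.d. Gaussian coefficients — the finite-dimensional "random initials" that make the path a smooth, parallelizable function of t. This is the classical KL expansion of Brownian motion, not the paper's released code; the function name and parameters are illustrative.

```python
import numpy as np

def truncated_kl_brownian(t, z, T=1.0):
    """Approximate standard Brownian motion on [0, T] by a truncated
    Karhunen-Loeve expansion with K = len(z) terms:

        W(t) ~= sum_k z_k * sqrt(lambda_k) * e_k(t),
        e_k(t) = sqrt(2/T) * sin((k - 1/2) * pi * t / T),
        lambda_k = (T / ((k - 1/2) * pi))**2,  z_k ~ i.i.d. N(0, 1).
    """
    k = np.arange(1, len(z) + 1)            # mode indices 1..K
    freq = (k - 0.5) * np.pi / T            # angular frequencies, shape (K,)
    phi = np.sqrt(2.0 / T) * np.sin(np.outer(t, freq))  # (len(t), K)
    lam = 1.0 / freq**2                     # KL eigenvalues
    return phi @ (np.sqrt(lam) * z)         # one path, shape (len(t),)

rng = np.random.default_rng(0)
K = 64                                      # truncation level
t = np.linspace(0.0, 1.0, 101)
z = rng.standard_normal(K)                  # the "random initials"
path = truncated_kl_brownian(t, z)          # smooth approximate Brownian path
```

Because the sampled coefficients `z` fully determine the path, every time point can be evaluated independently (a single matrix product here), which is the property the paper exploits for parallelized training.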
Problem

Research questions and friction points this paper is trying to address.

Accelerate diffusion model training convergence
Simplify Brownian motion complexity in diffusion
Enable parallel computation without extra parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Truncated Karhunen-Loève expansion for Brownian motion
KL diffusion with augmented random initials
Parallelized computation without extra parameters