ART for Diffusion Sampling: A Reinforcement Learning Approach to Timestep Schedule

📅 2026-01-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the suboptimal generation quality of diffusion models sampled under a fixed step budget, where the timestep schedule is typically uniform or hand-crafted. The authors propose Adaptive Reparameterized Time (ART), which controls the “clock speed” of a reparameterized time variable to produce a non-uniform schedule that preserves the terminal time and minimizes the aggregate error of the Euler discretization. A randomized companion problem, ART-RL, casts the time change as a continuous-time reinforcement learning problem with Gaussian policies. The authors prove that solving ART-RL recovers the optimal ART schedule, which justifies learning the schedule from data with actor-critic updates. Built on the official EDM pipeline, ART-RL improves FID on CIFAR-10 over a wide range of step budgets and transfers to AFHQv2, FFHQ, and ImageNet without retraining.
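For context, the kind of hand-crafted schedule the summary contrasts with ART can be sketched as below. The polynomial (rho) spacing of noise levels is the one commonly associated with EDM; the default values (`sigma_min=0.002`, `sigma_max=80`, `rho=7`) and the function names are assumptions made for illustration and are not stated on this page. This is a baseline sketch, not the ART method.

```python
def uniform_schedule(n_steps, sigma_min=0.002, sigma_max=80.0):
    """Evenly spaced noise levels from sigma_max down to sigma_min."""
    step = (sigma_max - sigma_min) / (n_steps - 1)
    return [sigma_max - i * step for i in range(n_steps)]


def rho_schedule(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Hand-crafted polynomial spacing (assumed EDM-style defaults).

    Interpolates linearly in sigma**(1/rho), then raises back to the
    power rho, which concentrates steps at low noise levels.
    """
    inv = 1.0 / rho
    hi, lo = sigma_max ** inv, sigma_min ** inv
    return [(hi + i / (n_steps - 1) * (lo - hi)) ** rho for i in range(n_steps)]


uniform = uniform_schedule(8)
handcrafted = rho_schedule(8)
```

Both grids share the same endpoints and the same number of steps; only the placement of the interior points differs, and that placement is the quantity a learned schedule would optimize.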

📝 Abstract

We consider time discretization for score-based diffusion models to generate samples from a learned reverse-time dynamic on a finite grid. Uniform and hand-crafted grids can be suboptimal given a budget on the number of time steps. We introduce Adaptive Reparameterized Time (ART) that controls the clock speed of a reparameterized time variable, leading to a time change and uneven timesteps along the sampling trajectory while preserving the terminal time. The objective is to minimize the aggregate error arising from the discretized Euler scheme. We derive a randomized control companion, ART-RL, and formulate time change as a continuous-time reinforcement learning (RL) problem with Gaussian policies. We then prove that solving ART-RL recovers the optimal ART schedule, which in turn enables practical actor-critic updates to learn the latter in a data-driven way. Empirically, based on the official EDM pipeline, ART-RL improves Fréchet Inception Distance on CIFAR-10 over a wide range of budgets and transfers to AFHQv2, FFHQ, and ImageNet without the need for retraining.
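The time-change idea in the abstract can be illustrated with a minimal sketch. It assumes a deterministic, user-supplied clock speed on a reparameterized time tau in [0, 1], and a toy scalar drift integrated forward in time. The names `time_changed_grid`, `euler`, `slow_start_clock`, and `toy_drift` are hypothetical, and nothing here reproduces the paper's learned policy, its actor-critic updates, or the EDM sampler.

```python
import math


def time_changed_grid(clock_speed, n_steps, t_end=1.0, sub=200):
    """Map a uniform grid in tau to a (possibly uneven) grid in physical time.

    t(tau) = t_end * (integral of clock_speed on [0, tau])
                   / (integral of clock_speed on [0, 1]),
    so t(0) = 0 and t(1) = t_end for any positive clock speed.
    """
    n_quad = n_steps * sub          # fine cells used for the midpoint rule
    h = 1.0 / n_quad
    cum = [0.0]
    for k in range(n_quad):
        speed = clock_speed((k + 0.5) * h)
        if speed <= 0:
            raise ValueError("clock speed must be positive")
        cum.append(cum[-1] + speed * h)
    total = cum[-1]
    return [t_end * cum[i * sub] / total for i in range(n_steps + 1)]


def euler(drift, x0, grid):
    """Explicit Euler along an arbitrary increasing time grid."""
    x = x0
    for t0, t1 in zip(grid, grid[1:]):
        x += (t1 - t0) * drift(t0, x)
    return x


def slow_start_clock(tau):
    # Assumed clock speed: slow at first, so early steps are short.
    return 0.05 + tau


def constant_clock(tau):
    # A constant clock speed gives back the uniform grid.
    return 1.0


def toy_drift(t, x):
    # Toy dynamics that change fastest near t = 0 (illustrative only).
    return -10.0 * math.exp(-10.0 * t)


x0 = 1.0
exact = x0 - (1.0 - math.exp(-10.0))   # closed-form solution at t = 1

grid_art = time_changed_grid(slow_start_clock, n_steps=4)
grid_uni = time_changed_grid(constant_clock, n_steps=4)
err_art = abs(euler(toy_drift, x0, grid_art) - exact)
err_uni = abs(euler(toy_drift, x0, grid_uni) - exact)
```

With the same four-step budget and the same terminal time, the slow-start clock places shorter steps where the toy drift changes fastest, so `err_art` comes out smaller than `err_uni` on this problem. In the paper the clock speed is learned rather than chosen by hand.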

Problem

Research questions and friction points this paper is trying to address.

diffusion sampling

time discretization

score-based models

timestep schedule

sampling error

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Reparameterized Time

diffusion sampling

reinforcement learning