🤖 AI Summary
Diffusion models suffer from high sampling latency due to sequential denoising; existing acceleration methods degrade image quality sharply at low step counts, primarily because they fail to accurately approximate high-curvature ODE trajectories, leading to accumulating truncation errors. This paper proposes EPD-Solver—the first ODE solver for diffusion sampling that supports parallel multi-gradient evaluation and manifold-geometric approximation. We design a reinforcement learning fine-tuning framework with residual Dirichlet policy to efficiently optimize within a low-dimensional solver space, and introduce a plug-and-play EPD-Plugin mechanism for seamless integration with mainstream ODE samplers. Our method unifies the vector-form mean value theorem, distillation-based initialization, and manifold-constrained solving. On text-to-image generation, EPD-Solver achieves state-of-the-art performance in just 10 sampling steps—improving FID by 37% and reducing sampling latency by 4.2× compared to prior methods.
📝 Abstract
Diffusion models (DMs) have achieved state-of-the-art generative performance but suffer from high sampling latency due to their sequential denoising nature. Existing solver-based acceleration methods often face significant image quality degradation under a low-latency budget, primarily due to accumulated truncation errors arising from the inability to capture high-curvature trajectory segments. In this paper, we propose the Ensemble Parallel Direction solver (dubbed as EPD-Solver), a novel ODE solver that mitigates these errors by incorporating multiple parallel gradient evaluations in each step. Motivated by the geometric insight that sampling trajectories are largely confined to a low-dimensional manifold, EPD-Solver leverages the Mean Value Theorem for vector-valued functions to approximate the integral solution more accurately. Importantly, since the additional gradient computations are independent, they can be fully parallelized, preserving low-latency sampling nature. We introduce a two-stage optimization framework. Initially, EPD-Solver optimizes a small set of learnable parameters via a distillation-based approach. We further propose a parameter-efficient Reinforcement Learning (RL) fine-tuning scheme that reformulates the solver as a stochastic Dirichlet policy. Unlike traditional methods that fine-tune the massive backbone, our RL approach operates strictly within the low-dimensional solver space, effectively mitigating reward hacking while enhancing performance in complex text-to-image (T2I) generation tasks. In addition, our method is flexible and can serve as a plugin (EPD-Plugin) to improve existing ODE samplers.