Denoising as Path Planning: Training-Free Acceleration of Diffusion Models with DPCache

📅 2026-02-26
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the high computational cost of multi-step sampling in diffusion models, where existing caching methods often accumulate errors and introduce visual artifacts because they neglect the global structure of denoising trajectories. To overcome this limitation, the paper formulates sampling acceleration as a global path planning problem and introduces DPCache, a training-free framework that constructs a path-aware cost tensor and leverages dynamic programming to select an optimal sequence of key timesteps. By integrating feature caching with a prediction mechanism, DPCache significantly reduces computation while preserving trajectory fidelity. The method achieves up to 4.87× speedup on DiT, FLUX, and HunyuanVideo, and on FLUX it attains a 3.54× acceleration over the full-step baseline with a +0.028 ImageReward gain, substantially outperforming current state-of-the-art approaches.
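The cache-and-skip idea in the summary can be illustrated with a minimal scheduling loop. This is a sketch under assumed interfaces (`model`, `step`, and the key-step list are placeholders, not the paper's released code): the full model runs only at key timesteps, while skipped steps reuse the most recent cached output for a cheap update.

```python
def sample_with_cache(model, x, timesteps, key_steps, step):
    """Illustrative cached sampling loop (not the DPCache implementation).

    model(x, t)    -> full network forward pass (expensive)
    step(x, out, t)-> one solver update using a (possibly cached) output
    key_steps      -> timesteps where a full forward pass is forced
    Returns the final state and the number of full model calls.
    """
    key_set = set(key_steps)
    cached = None
    calls = 0
    for t in timesteps:
        if t in key_set or cached is None:
            cached = model(x, t)   # full computation at key timesteps
            calls += 1
        # at skipped timesteps, reuse the cached output instead of
        # recomputing; a prediction mechanism could extrapolate it here
        x = step(x, cached, t)
    return x, calls
```

With 10 timesteps and 4 key steps, only 4 full forward passes are issued while all 10 solver updates still run.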

πŸ“ Abstract
Diffusion models have demonstrated remarkable success in image and video generation, yet their practical deployment remains hindered by the substantial computational overhead of multi-step iterative sampling. Among acceleration strategies, caching-based methods offer a training-free and effective solution by reusing or predicting features across timesteps. However, existing approaches rely on fixed or locally adaptive schedules without considering the global structure of the denoising trajectory, often leading to error accumulation and visual artifacts. To overcome this limitation, we propose DPCache, a novel training-free acceleration framework that formulates diffusion sampling acceleration as a global path planning problem. DPCache constructs a Path-Aware Cost Tensor from a small calibration set to quantify the path-dependent error of skipping timesteps conditioned on the preceding key timestep. Leveraging this tensor, DPCache employs dynamic programming to select an optimal sequence of key timesteps that minimizes the total path cost while preserving trajectory fidelity. During inference, the model performs full computations only at these key timesteps, while intermediate outputs are efficiently predicted using cached features. Extensive experiments on DiT, FLUX, and HunyuanVideo demonstrate that DPCache achieves strong acceleration with minimal quality loss, outperforming prior acceleration methods by $+$0.031 ImageReward at 4.87$\times$ speedup and even surpassing the full-step baseline by $+$0.028 ImageReward at 3.54$\times$ speedup on FLUX, validating the effectiveness of our path-aware global scheduling framework. Code will be released at https://github.com/argsss/DPCache.
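The dynamic-programming selection described in the abstract can be sketched as a shortest-path search over a calibrated cost tensor. The function name, the cost convention `cost[i, j]` (error of caching through steps between key steps `i` and `j`), and the fixed key-step budget are illustrative assumptions, not the released DPCache implementation:

```python
import numpy as np

def plan_key_timesteps(cost, num_keys):
    """Pick `num_keys` key timesteps (always starting at step 0) that
    minimize the summed path cost, via dynamic programming.

    cost[i, j] = calibration-measured error of skipping steps i+1..j-1,
    conditioned on the preceding key timestep i (defined for i < j).
    """
    T = cost.shape[0]
    INF = float("inf")
    # dp[k, j]: min cost of a path ending at key step j using k key steps
    dp = np.full((num_keys + 1, T), INF)
    parent = np.full((num_keys + 1, T), -1, dtype=int)
    dp[1, 0] = 0.0                       # the path starts at timestep 0
    for k in range(2, num_keys + 1):
        for j in range(1, T):
            for i in range(j):           # try every preceding key step
                if dp[k - 1, i] + cost[i, j] < dp[k, j]:
                    dp[k, j] = dp[k - 1, i] + cost[i, j]
                    parent[k, j] = i
    # backtrack the optimal path from the final timestep T-1
    path, j, k = [T - 1], T - 1, num_keys
    while k > 1:
        j = int(parent[k, j])
        path.append(j)
        k -= 1
    return path[::-1], dp[num_keys, T - 1]
```

With a toy superlinear skip penalty such as `cost[i, j] = (j - i)**2`, the planner spreads key steps evenly, since long skips are disproportionately expensive; a calibrated tensor would instead concentrate key steps where the denoising trajectory changes fastest.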
Problem

Research questions and friction points this paper is trying to address.

diffusion models
sampling acceleration
denoising trajectory
training-free
computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion acceleration
path planning
training-free
dynamic programming
feature caching
Authors (Alibaba Group): Bowen Cui, Yuanbin Wang, Huajiang Xu, Biaolong Chen, Aixi Zhang, Hao Jiang, Zhengzheng Jin, Xu Liu, Pipei Huang