MagCache: Fast Video Generation with Magnitude-Aware Cache

📅 2025-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video diffusion acceleration methods rely on uniform heuristics or time-embedding variants, requiring extensive prompt-specific calibration that leads to prompt overfitting and output inconsistency. This work identifies a universal monotonic decay law, holding across models and prompts, that governs residual magnitude, and proposes an adaptive magnitude-aware caching mechanism calibrated from only a single sample: it models reconstruction error via residual magnitude ratios to enable dynamic timestep skipping and adaptive feature reuse. Evaluated on Open-Sora and Wan 2.1, the method achieves 2.1× and 2.68× inference speedups, respectively, while consistently outperforming state-of-the-art approaches in LPIPS, SSIM, and PSNR. The core contribution is the first discovery of this universal residual magnitude decay law, enabling a lightweight, highly generalizable, single-sample adaptive acceleration paradigm for video diffusion models.

📝 Abstract
Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features. These approaches typically require extensive calibration with curated prompts and risk inconsistent outputs due to prompt-specific overfitting. In this paper, we introduce a novel and robust discovery: a unified magnitude law observed across different models and prompts. Specifically, the magnitude ratio of successive residual outputs decreases monotonically and steadily across most timesteps, then drops rapidly over the last several steps. Leveraging this insight, we introduce a Magnitude-aware Cache (MagCache) that adaptively skips unimportant timesteps using an error modeling mechanism and an adaptive caching strategy. Unlike existing methods that require dozens of curated samples for calibration, MagCache requires only a single sample. Experimental results show that MagCache achieves 2.1× and 2.68× speedups on Open-Sora and Wan 2.1, respectively, while preserving superior visual fidelity. It significantly outperforms existing methods in LPIPS, SSIM, and PSNR under comparable computational budgets.
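The skipping logic described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the error-accumulation rule (summing the deviation of each magnitude ratio from 1), the threshold value, and the consecutive-skip cap are all assumptions chosen to mirror the described behavior of accumulating a modeled reconstruction error and reusing the cached residual only while that error stays small.

```python
def magcache_schedule(mag_ratios, threshold=0.06, max_consecutive_skips=2):
    """Decide which timesteps may reuse the cached residual (illustrative).

    mag_ratios[t] stands for the magnitude ratio ||r_t|| / ||r_{t-1}|| of
    successive residual outputs, measured once on a single calibration
    sample. All parameter values here are hypothetical.
    """
    skip = [False] * len(mag_ratios)
    accumulated_error = 0.0
    consecutive = 0
    for t in range(1, len(mag_ratios)):
        # The modeled error of reusing the cache grows with how far the
        # ratio drifts from 1, i.e. how much the residual has changed.
        accumulated_error += abs(1.0 - mag_ratios[t])
        if accumulated_error < threshold and consecutive < max_consecutive_skips:
            skip[t] = True           # reuse cached residual at this step
            consecutive += 1
        else:
            accumulated_error = 0.0  # full forward pass refreshes the cache
            consecutive = 0
    return skip

# Toy ratios that decay steadily, then drop fast in the final steps.
ratios = [1.0, 0.99, 0.98, 0.97, 0.95, 0.90, 0.70, 0.40]
print(magcache_schedule(ratios))
# → [False, True, True, False, True, False, False, False]
```

Note how the early, near-unity ratios permit skipping while the sharp decay at the end forces full computation, matching the magnitude law the paper reports.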
Problem

Research questions and friction points this paper is trying to address.

Accelerate video diffusion models without output inconsistency
Reduce calibration samples needed for adaptive caching
Maintain visual fidelity while improving generation speed
Innovation

Methods, ideas, or system contributions that make the work stand out.

Magnitude-aware Cache for adaptive timestep skipping
Error modeling mechanism for robust feature reuse
Single-sample calibration for efficient video generation
Zehong Ma
State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University
Longhui Wei
Senior Researcher, Huawei
Multimodal & Visual Pre-training, VLM, Multimodal Generation
Feng Wang
Huawei Inc.
Shiliang Zhang
Department of Computer Science, School of EECS, Peking University
Multimedia Information Retrieval, Multimedia Systems, Visual Search
Qi Tian
Huawei Inc.