Parallel Jacobi Decoding for Fast Autoregressive Image Generation

📅 2026-06-04

🤖 AI Summary
This work addresses slow inference in autoregressive image generation models. The root cause is sequential token-by-token decoding, and existing Jacobi-style acceleration methods, which operate on one-dimensional token sequences, saturate because error propagation hinders convergence. The study proposes Parallel Jacobi Decoding (PJD), a training-free strategy that extends Jacobi-style decoding into the two-dimensional spatial domain. Exploiting the strong local spatial correlations in images, PJD expands draft tokens spatially and refines them in parallel, and it adjusts the attention mask to mitigate error accumulation and improve convergence stability. The reported result is a 4.8×–6.4× inference speedup across multiple autoregressive models and datasets while maintaining competitive generation quality.
📝 Abstract
Autoregressive (AR) models have demonstrated remarkable performance in generating high-fidelity images. However, their inherently sequential next-token prediction leads to significantly slower inference. Recent studies have introduced Jacobi-style decoding to accelerate autoregressive image generation. Extending the draft sequence initially improves efficiency, yet the acceleration quickly saturates as error propagation in the one-dimensional sequence hinders convergence. Observing that images exhibit strong local spatial correlations, we propose Parallel Jacobi Decoding (PJD), a training-free decoding approach that expands draft tokens in the two-dimensional spatial domain to enable efficient spatially parallel refinement. PJD adjusts the attention mask to mitigate error accumulation and improve convergence stability. Extensive experiments on diverse datasets show that PJD achieves 4.8×–6.4× acceleration across multiple autoregressive image generation models while maintaining competitive generation quality.
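For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of generic one-dimensional Jacobi-style decoding, not the paper's PJD method. It assumes greedy (argmax) decoding, and `toy_next_token` is a hypothetical stand-in for a real autoregressive model. All names here are illustrative. The idea is to guess a whole block of draft tokens, re-predict every position in parallel from the previous guess, and stop when the block no longer changes.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def toy_next_token(seq: List[int]) -> int:
    # Hypothetical stand-in for the argmax of an AR model over a 16-token vocabulary.
    return (sum(seq) * 31 + len(seq)) % 16


def greedy_sequential(next_token: NextToken, prefix: List[int], n_new: int) -> List[int]:
    # Reference decoder: one model call per generated token.
    seq = list(prefix)
    for _ in range(n_new):
        seq.append(next_token(seq))
    return seq


def jacobi_decode(
    next_token: NextToken, prefix: List[int], n_new: int, init_token: int = 0
) -> Tuple[List[int], int]:
    # Fixed-point iteration over a block of n_new draft tokens.
    draft = [init_token] * n_new
    sweeps = 0
    while True:
        sweeps += 1
        context = list(prefix) + draft
        # Re-predict every draft position from the prefix plus the PREVIOUS
        # draft tokens before it. With a real model this would be a single
        # batched forward pass under a causal mask, not a Python loop.
        new_draft = [next_token(context[: len(prefix) + i]) for i in range(n_new)]
        if new_draft == draft:  # fixed point reached
            return list(prefix) + draft, sweeps
        draft = new_draft


if __name__ == "__main__":
    prefix = [3, 1, 4]
    seq, sweeps = jacobi_decode(toy_next_token, prefix, n_new=8)
    print(seq == greedy_sequential(toy_next_token, prefix, 8), sweeps)
```

Each sweep fixes at least one more leading token, so the loop ends after at most `n_new + 1` sweeps and returns the same sequence as greedy sequential decoding. The speedup comes from draft positions that converge early. An error at an early position invalidates every later position in the chain, which is the one-dimensional error propagation the abstract describes.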
Problem

Research questions and friction points this paper is trying to address.

autoregressive image generation
slow inference
error propagation
sequential decoding
acceleration saturation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel Jacobi Decoding
autoregressive image generation
spatially parallel refinement
attention mask adjustment
error propagation mitigation