Parallel Jacobi Decoding for Fast Autoregressive Image Generation

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses slow inference in autoregressive image generation models. Token-by-token decoding is inherently sequential, and existing Jacobi-style acceleration methods operate on a one-dimensional token sequence, where error propagation limits the achievable speedup. The study proposes a training-free parallel decoding strategy that extends Jacobi-style decoding into the two-dimensional spatial domain. By exploiting local spatial correlations in images, the method predicts draft tokens in parallel within two-dimensional neighborhoods and adjusts the attention mask to improve convergence stability. The authors report that this mitigates error accumulation and yields a 4.8–6.4× inference speedup across multiple autoregressive models and datasets while maintaining competitive generation quality.
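To make the contrast between a one-dimensional sequence and a two-dimensional neighborhood concrete, the sketch below maps a token's flat index to its spatial neighbors. It assumes image tokens are laid out in raster-scan order on a `width × height` grid, which this summary does not state; `spatial_neighbors` is a hypothetical helper for illustration and is not the neighborhood definition used by the paper.

```python
from typing import List


def spatial_neighbors(index: int, width: int, height: int) -> List[int]:
    """Flat indices of the up/left/right/down neighbors of a token.

    Assumption: tokens are stored in raster-scan order (row by row),
    so flat index = row * width + col.
    """
    row, col = divmod(index, width)
    candidates = [(row - 1, col), (row, col - 1), (row, col + 1), (row + 1, col)]
    # Keep only the neighbors that fall inside the grid.
    return [r * width + c for r, c in candidates
            if 0 <= r < height and 0 <= c < width]


if __name__ == "__main__":
    # On a 4x4 grid, token 5 sits at row 1, column 1.
    print(spatial_neighbors(5, width=4, height=4))
```

Under this layout the token directly above a position sits `width` steps earlier in the flat sequence, so tokens that are adjacent in the image can be far apart in a purely one-dimensional draft.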

📝 Abstract

Autoregressive (AR) models have demonstrated remarkable performance in generating high-fidelity images. However, their inherently sequential next-token prediction leads to significantly slower inference. Recent studies have introduced Jacobi-style decoding to accelerate autoregressive image generation. Extending the draft sequence initially improves efficiency, yet the acceleration quickly saturates as error propagation in the one-dimensional sequence hinders convergence. Observing that images exhibit strong local spatial correlations, we propose Parallel Jacobi Decoding (PJD), a training-free decoding approach that expands draft tokens in the two-dimensional spatial domain to enable efficient spatially parallel refinement. PJD adjusts the attention mask to mitigate error accumulation and improve convergence stability. Extensive experiments on diverse datasets show that PJD achieves 4.8×–6.4× acceleration across multiple autoregressive image generation models while maintaining competitive generation quality.
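As background for the Jacobi-style decoding the abstract builds on, here is a minimal sketch of the standard one-dimensional scheme, not of PJD itself. It assumes greedy (deterministic) decoding, and `next_token` is a hypothetical toy stand-in for a model: every draft position is recomputed from the previous draft until the draft stops changing, at which point it matches the sequential output.

```python
from typing import List, Tuple


def next_token(prefix: List[int]) -> int:
    """Hypothetical deterministic 'model': greedy next token for a prefix."""
    return (sum(prefix) * 31 + len(prefix) * 7 + 3) % 16


def sequential_decode(prompt: List[int], n: int) -> List[int]:
    """Baseline: generate n tokens one at a time."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq[len(prompt):]


def jacobi_decode(prompt: List[int], n: int, init: int = 0) -> Tuple[List[int], int]:
    """Refine an n-token draft by fixed-point iteration.

    Each iteration recomputes every position from the *previous* draft.
    With a real model this would be one parallel forward pass; the list
    comprehension here only simulates that.
    """
    draft = [init] * n
    iterations = 0
    while True:
        iterations += 1
        new_draft = [next_token(list(prompt) + draft[:i]) for i in range(n)]
        if new_draft == draft:  # fixed point reached: draft is self-consistent
            return draft, iterations
        draft = new_draft


if __name__ == "__main__":
    prompt = [1, 2, 3]
    tokens, iters = jacobi_decode(prompt, n=8)
    print(tokens == sequential_decode(prompt, n=8), iters)
```

After iteration k the first k draft tokens are already correct, so the loop needs at most n + 1 iterations; any speedup comes from several positions settling in the same pass. When an early position is wrong, every later position is computed from a wrong prefix, which is the one-dimensional error propagation the abstract refers to.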

Problem

Research questions and friction points this paper is trying to address.

autoregressive image generation

slow inference

error propagation

sequential decoding

acceleration saturation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel Jacobi Decoding

autoregressive image generation

spatially parallel refinement