Masked Diffusion Generative Recommendation

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key limitations in generative recommendation—namely, the inability of autoregressive decoding to capture global dependencies among multidimensional features, the rigid assumption that all users attend to item attributes in the same fixed order, and low inference efficiency—by proposing MDGR, the first framework to integrate diffusion models into generative recommendation. MDGR introduces a parallel codebook architecture, an adaptive masking strategy across both temporal and sample dimensions, and a two-stage parallel decoding mechanism, which collectively enhance personalized representation while significantly accelerating inference. Extensive experiments demonstrate that MDGR outperforms ten state-of-the-art methods by up to 10.78% on average across multiple public and industrial datasets. Furthermore, its deployment on a live advertising platform yielded a 1.20% increase in revenue.

📝 Abstract
Generative recommendation (GR) typically first quantizes continuous item embeddings into multi-level semantic IDs (SIDs), and then generates the next item via autoregressive decoding. Although existing methods are already competitive in terms of recommendation performance, directly inheriting the autoregressive decoding paradigm from language models still suffers from three key limitations: (1) autoregressive decoding struggles to jointly capture global dependencies among the multi-dimensional features associated with different positions of SID; (2) using a unified, fixed decoding path for the same item implicitly assumes that all users attend to item attributes in the same order; (3) autoregressive decoding is inefficient at inference time and struggles to meet real-time requirements. To tackle these challenges, we propose MDGR, a Masked Diffusion Generative Recommendation framework that reshapes the GR pipeline from three perspectives: codebook, training, and inference. (1) We adopt a parallel codebook to provide a structural foundation for diffusion-based GR. (2) During training, we adaptively construct masking supervision signals along both the temporal and sample dimensions. (3) During inference, we develop a warm-up-based two-stage parallel decoding strategy for efficient generation of SIDs. Extensive experiments on multiple public and industrial-scale datasets show that MDGR outperforms ten state-of-the-art baselines by up to 10.78%. Furthermore, by deploying MDGR on a large-scale online advertising platform, we achieve a 1.20% increase in revenue, demonstrating its practical value.
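The abstract's core decoding idea can be illustrated with a minimal sketch: start with every SID position masked, then over a few steps re-predict all masked positions in parallel and commit only the most confident ones. The toy predictor, the confidence schedule, and all names below are illustrative assumptions, not MDGR's actual model or its warm-up-based two-stage variant.

```python
import math

MASK = -1  # placeholder id for a still-masked SID position


def toy_predictor(tokens):
    """Stand-in for the trained model: returns (token, confidence) per
    position. A real MDGR model would condition on the user's interaction
    history; here position i is simply filled with token i % 256 so the
    sketch stays runnable and deterministic."""
    return [(i % 256, 0.9 if t == MASK else 1.0) for i, t in enumerate(tokens)]


def masked_parallel_decode(num_positions, steps=3):
    """Iterative parallel unmasking, a common masked-diffusion decoding
    schedule: each step predicts all masked positions at once and commits
    the top-k most confident ones, so the whole SID is generated in a few
    parallel steps rather than one token at a time."""
    tokens = [MASK] * num_positions
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = toy_predictor(tokens)
        # commit enough positions this step to finish within `steps`
        k = max(1, math.ceil(len(masked) / (steps - step)))
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:k]:
            tokens[i] = preds[i][0]
    return tokens
```

Contrast with autoregressive decoding: there, a 4-level SID costs 4 sequential model calls in a fixed left-to-right order, whereas this schedule finishes in `steps` parallel calls and imposes no single attribute ordering.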
Problem

Research questions and friction points this paper is trying to address.

generative recommendation
autoregressive decoding
semantic IDs
inference efficiency
global dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Diffusion
Generative Recommendation
Parallel Codebook
Adaptive Masking
Two-stage Parallel Decoding
👥 Authors
Lingyu Mu
Alibaba International Digital Commerce Group, Beijing, China
Hao Deng
Haibo Xing
Alibaba International Digital Commerce Group, Hangzhou, China
Jinxin Hu
Alibaba
Yu Zhang
Alibaba International Digital Commerce Group, Beijing, China
Xiaoyi Zeng
Alibaba International Digital Commerce Group, Hangzhou, China
Jing Zhang
Wuhan University, Wuhan, China