DGenCTR: Towards a Universal Generative Paradigm for Click-Through Rate Prediction via Discrete Diffusion

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative recommendation models primarily focus on sequential item generation and are ill-suited for click-through rate (CTR) prediction, which critically depends on fine-grained user-item interaction features. To alleviate the performance bottleneck of discriminative models in label-sparse scenarios, we propose DGenCTR, the first discrete diffusion-based generative framework tailored for CTR prediction. The approach rests on three key innovations: (1) introducing discrete diffusion into CTR pretraining to establish a sample-level generative paradigm; (2) explicitly modeling user-item cross features to tightly integrate generative capability with discriminative CTR estimation; and (3) adopting a two-stage training strategy of discrete diffusion pretraining followed by supervised fine-tuning. Extensive experiments on multiple public benchmarks and online A/B tests show that DGenCTR consistently achieves significant improvements in both CTR prediction accuracy and generalization performance.

📝 Abstract
Recent advances in generative models have inspired recommender-systems research to explore generative approaches, but most existing work focuses on sequence generation, a paradigm ill-suited to click-through rate (CTR) prediction. CTR models depend on a large number of cross-features between the target item and the user to estimate the probability of a click, and discarding these cross-features significantly impairs model performance. To harness the ability of generative models to capture data distributions, and thereby ease the constraints that traditional discriminative models face when labels are scarce, we depart from the item-generation paradigm of sequence generation methods and propose a novel sample-level generation paradigm designed specifically for the CTR task: a two-stage Discrete Diffusion-Based Generative CTR training framework (DGenCTR). The framework comprises a diffusion-based generative pre-training stage followed by a CTR-targeted supervised fine-tuning stage. Extensive offline experiments and online A/B testing validate the effectiveness of the framework.
Problem

Research questions and friction points this paper is trying to address.

Develops generative CTR prediction without discarding cross-features
Proposes discrete diffusion model for sample-level generation paradigm
Addresses label scarcity through generative pre-training and fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete diffusion generative model
Two-stage training framework
Sample-level generation paradigm
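
As a rough illustration of what sample-level generation could look like, the sketch below corrupts the categorical fields of a single CTR sample (user, item, and cross features). This is the kind of noised input a denoising model would be pre-trained to reconstruct before supervised fine-tuning. The summary does not specify the corruption process, so this assumes absorbing-state (mask) diffusion with a linear schedule; `MASK`, `corrupt_sample`, `masked_fields`, and the field names are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical absorbing "mask" token id; not a value from the paper.
MASK = -1


def corrupt_sample(features, t, num_steps, rng):
    """Forward corruption for an assumed absorbing-state discrete diffusion.

    Each categorical feature id is independently replaced by MASK with
    probability t / num_steps (an assumed linear schedule), so t=0 leaves
    the sample intact and t=num_steps masks every field.
    """
    if not 0 <= t <= num_steps:
        raise ValueError("t must lie in [0, num_steps]")
    mask_prob = t / num_steps
    return {
        name: (MASK if rng.random() < mask_prob else value)
        for name, value in features.items()
    }


def masked_fields(original, corrupted):
    """Names of the fields a denoiser would be asked to reconstruct."""
    return [name for name in original if corrupted[name] == MASK]


if __name__ == "__main__":
    rng = random.Random(0)
    # One sample: user field, item field, and a user-item cross feature.
    sample = {"user_id": 17, "item_id": 42, "user_x_item_category": 5}
    noisy = corrupt_sample(sample, t=5, num_steps=10, rng=rng)
    print(noisy, masked_fields(sample, noisy))
```

Treating the whole sample, cross features included, as the object being corrupted is what separates this from item-sequence generation, where only the next item id is produced. A fine-tuning stage would then reuse the pre-trained network with a click label as the supervised target.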