REG4Rec: Reasoning-Enhanced Generative Model for Large-Scale Recommendation Systems

📅 2025-08-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing reasoning-based generative recommenders represent each item with a single semantic view, which limits the diversity of reasoning paths and makes the reasoning process less reliable. To address this, we propose REG4Rec, a framework that builds multiple dynamic semantic reasoning paths for sequential recommendation. It has three main components: (1) an MoE-based parallel quantization codebook (MPQ) that assigns each item multiple unordered semantic tokens, enlarging the reasoning space; (2) a reasoning-enhancement training stage that combines reward-based preference alignment (PARS) with Multi-Step Reward Augmentation (MSRA), which uses future multi-step actions to improve generalization; and (3) Consistency-Oriented Self-Reflection for Pruning (CORP), which discards inconsistent reasoning paths at inference time. Experiments on real-world datasets and online evaluations show consistent improvements over state-of-the-art methods.

📝 Abstract
Sequential recommendation aims to predict a user's next action in large-scale recommender systems. While traditional methods often suffer from insufficient information interaction, recent generative recommendation models partially address this issue by directly generating item predictions. To better capture user intents, recent studies have introduced a reasoning process into generative recommendation, significantly improving recommendation performance. However, these approaches are constrained by the singularity of item semantic representations, facing challenges such as limited diversity in reasoning pathways and insufficient reliability in the reasoning process. To tackle these issues, we introduce REG4Rec, a reasoning-enhanced generative model that constructs multiple dynamic semantic reasoning paths alongside a self-reflection process, ensuring high-confidence recommendations. Specifically, REG4Rec utilizes an MoE-based parallel quantization codebook (MPQ) to generate multiple unordered semantic tokens for each item, thereby constructing a larger-scale diverse reasoning space. Furthermore, to enhance the reliability of reasoning, we propose a training reasoning enhancement stage, which includes Preference Alignment for Reasoning (PARS) and a Multi-Step Reward Augmentation (MSRA) strategy. PARS uses reward functions tailored for recommendation to enhance reasoning and reflection, while MSRA introduces future multi-step actions to improve overall generalization. During inference, Consistency-Oriented Self-Reflection for Pruning (CORP) is proposed to discard inconsistent reasoning paths, preventing the propagation of erroneous reasoning. Lastly, we develop an efficient offline training strategy for large-scale recommendation. Experiments on real-world datasets and online evaluations show that REG4Rec delivers outstanding performance and substantial practical value.
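
The abstract says MPQ produces multiple unordered semantic tokens per item but does not give the construction. The following is only a toy sketch of the general idea of parallel quantization, not the paper's implementation: each hypothetical "expert" is a plain linear projection paired with its own codebook, and the MoE gating is omitted entirely. All names (`parallel_quantize`, `make_toy_model`, the dimensions, the random weights) are assumptions made for illustration.

```python
import random


def project(vector, matrix):
    """Apply one expert's linear projection (matrix is a list of rows)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def nearest_code(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def parallel_quantize(item_embedding, experts, codebooks):
    """Map one item embedding to one token per (expert, codebook) pair.

    Each token is computed independently of the others, so the result is
    an unordered set of views rather than a coarse-to-fine sequence.
    """
    return [
        nearest_code(project(item_embedding, expert), codebook)
        for expert, codebook in zip(experts, codebooks)
    ]


def make_toy_model(dim=4, num_codebooks=3, codebook_size=8, seed=0):
    """Build random projections and codebooks (illustrative values only)."""
    rng = random.Random(seed)
    experts = [
        [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
        for _ in range(num_codebooks)
    ]
    codebooks = [
        [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(codebook_size)]
        for _ in range(num_codebooks)
    ]
    return experts, codebooks


experts, codebooks = make_toy_model()
tokens = parallel_quantize([0.5, -1.0, 0.25, 2.0], experts, codebooks)
print(tokens)  # one codebook index per parallel codebook
```

Because the codebooks are queried in parallel, adding a codebook adds one more token per item without changing the others. This contrasts with residual-style quantization, where each token depends on the ones before it.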
Problem

Research questions and friction points this paper is trying to address.

Overcoming limited diversity in reasoning pathways for sequential recommendation systems
Addressing insufficient reliability in generative recommendation reasoning processes
Enhancing user intent capture through multiple semantic representations in recommendations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multiple dynamic semantic reasoning paths
MoE-based parallel quantization codebook
Consistency-oriented self-reflection pruning
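
The page does not specify how CORP measures consistency, so the sketch below only illustrates the general pattern of pruning candidate reasoning paths by a consistency score. It assumes, hypothetically, that each path carries the tokens it generated, a second "reflected" re-prediction of those tokens, and a model score. The agreement-fraction metric, the `threshold` parameter, and the dictionary layout are all assumptions rather than the paper's definitions.

```python
def consistency(generated, reflected):
    """Fraction of positions where the re-predicted token matches."""
    if not generated:
        return 0.0
    matches = sum(1 for g, r in zip(generated, reflected) if g == r)
    return matches / len(generated)


def prune_paths(paths, threshold=0.5):
    """Drop paths whose consistency is below `threshold`.

    Each path is a dict with keys 'tokens', 'reflected', and 'score'.
    Surviving paths are returned sorted by score, best first.
    """
    kept = [
        p for p in paths
        if consistency(p["tokens"], p["reflected"]) >= threshold
    ]
    return sorted(kept, key=lambda p: p["score"], reverse=True)


candidates = [
    {"tokens": [1, 2, 3], "reflected": [1, 2, 3], "score": -1.0},
    {"tokens": [4, 5, 6], "reflected": [4, 0, 0], "score": -0.5},
    {"tokens": [7, 8], "reflected": [7, 9], "score": -2.0},
]
survivors = prune_paths(candidates, threshold=0.5)
print([p["tokens"] for p in survivors])  # → [[1, 2, 3], [7, 8]]
```

In this toy example the second candidate has the best score but is pruned, because only one of its three tokens survives reflection. That is the kind of high-scoring but inconsistent path that pruning is meant to stop from propagating.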
Haibo Xing
Alibaba International Digital Commerce Group, Hangzhou, China
Hao Deng
Engineer
recommendation system
Yucheng Mao
UC San Diego
3D Computer Vision
Jinxin Hu
Alibaba
Yi Xu
Alibaba International Digital Commerce Group, Beijing, China
Hao Zhang
Alibaba International Digital Commerce Group, Beijing, China
Jiahao Wang
Alibaba International Digital Commerce Group, Beijing, China
Shizhun Wang
Alibaba International Digital Commerce Group, Beijing, China
Yu Zhang
Alibaba International Digital Commerce Group, Beijing, China
Xiaoyi Zeng
Alibaba International Digital Commerce Group, Hangzhou, China
Jing Zhang
School of Computer Science, Wuhan University, Wuhan, China