Bringing Reasoning to Generative Recommendation Through the Lens of Cascaded Ranking

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Generative recommender models are prone to amplifying biases, which undermines recommendation diversity and degrades user experience. To address this issue, this work proposes the CARE framework, which brings a cascaded reasoning mechanism to generative recommendation. By combining progressive history encoding with query-anchored parallel reasoning, CARE incorporates heterogeneous information and allocates more computation to each token generation step, mitigating bias amplification. Experiments on four datasets with three GR backbones show that CARE improves recommendation accuracy, diversity, and efficiency, and that it scales promisingly.

📝 Abstract

Generative Recommendation (GR) has become a promising end-to-end approach with high FLOPS utilization for resource-efficient recommendation. Despite its effectiveness, we show that current GR models suffer from a critical bias amplification issue, where token-level bias escalates as token generation progresses, ultimately limiting recommendation diversity and hurting the user experience. By comparing against the key factor behind the success of traditional multi-stage pipelines, we reveal two limitations in GR that can amplify the bias: homogeneous reliance on the encoded history, and fixed computational budgets that prevent deeper user preference understanding. To combat bias amplification, it is crucial for GR to 1) incorporate more heterogeneous information, and 2) allocate greater computational resources at each token generation step. To this end, we propose CARE, a simple yet effective cascaded reasoning framework for debiased GR. To incorporate heterogeneous information, we introduce a progressive history encoding mechanism, which incorporates increasingly fine-grained history information as the generation process advances. To allocate more computation, we propose a query-anchored reasoning mechanism, which seeks a deeper understanding of historical information through parallel reasoning steps. We instantiate CARE on three GR backbones. Empirical results on four datasets show the superiority of CARE in recommendation accuracy, diversity, and efficiency, along with promising scalability. The code and datasets are available at https://github.com/Linxyhaha/CARE.
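
One simple way to see why token-level bias matters in autoregressive generation is the chain rule: the ratio between the model's probability of an item's token sequence and an unbiased reference probability is the product of the per-token ratios. The toy sketch below is not the paper's measurement or method. The function name `cumulative_bias_ratio` and the constant per-token ratio of 1.2 are assumptions chosen purely for illustration.

```python
def cumulative_bias_ratio(per_token_ratios):
    """Return the running product of per-token ratios q(token | prefix) / p(token | prefix).

    By the chain rule, the value after step t equals q(prefix_t) / p(prefix_t),
    i.e. how over-represented the prefix is after t generated tokens.
    """
    running = 1.0
    history = []
    for ratio in per_token_ratios:
        running *= ratio
        history.append(running)
    return history


# Hypothetical example: each token of a popular item's identifier is
# over-weighted by 20% relative to the reference distribution.
ratios = cumulative_bias_ratio([1.2, 1.2, 1.2, 1.2])
for step, value in enumerate(ratios, start=1):
    print(f"after token {step}: over-representation x{value:.3f}")
```

Under this assumption, a modest per-token tilt compounds geometrically with the length of the generated identifier, which is consistent with the motivation for intervening at every token generation step rather than only at the end.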

Problem

Research questions and friction points this paper is trying to address.

bias amplification

generative recommendation

recommendation diversity

token-level bias

user experience

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Recommendation

Bias Amplification

Cascaded Reasoning