🤖 AI Summary
Generative recommender systems face two key challenges: (i) token-item granularity mismatch, where uniform token-level modeling overlooks the item-level structure that collaborative signals depend on; and (ii) semantic-collaborative signal coupling, where semantic and collaborative signals follow different distributions yet share a single embedding space, causing optimization conflicts. To address these, the authors propose DiscRec, a decoupled framework with three core components: (i) item-level positional encoding, assigned by each token's index within its semantic ID, to make item boundaries explicit in the token sequence; (ii) a dual-branch module at the embedding layer that models semantic and collaborative signals separately, with the collaborative branch using localized attention restricted to tokens within the same item; and (iii) a gating mechanism that flexibly fuses the semantic and collaborative representations. Experiments on four real-world datasets show that DiscRec consistently outperforms state-of-the-art baselines, supporting the value of signal decoupling for generative recommendation.
📝 Abstract
Generative recommendation is emerging as a powerful paradigm that directly generates item predictions, moving beyond traditional matching-based approaches. However, current methods face two key challenges: token-item misalignment, where uniform token-level modeling ignores the item-level granularity that is critical for collaborative signal learning, and semantic-collaborative signal entanglement, where collaborative and semantic signals exhibit distinct distributions yet are fused in a unified embedding space, leading to conflicting optimization objectives that limit recommendation performance. To address these issues, we propose DiscRec, a novel framework that enables Disentangled Semantic-Collaborative signal modeling with flexible fusion for generative Recommendation. First, DiscRec introduces item-level position embeddings, assigned based on indices within each semantic ID, enabling explicit modeling of item structure in input token sequences. Second, DiscRec employs a dual-branch module to disentangle the two signals at the embedding layer: a semantic branch encodes semantic signals using the original token embeddings, while a collaborative branch applies localized attention restricted to tokens within the same item to effectively capture collaborative signals. A gating mechanism subsequently fuses both branches while preserving the model's ability to model sequential dependencies. Extensive experiments on four real-world datasets demonstrate that DiscRec effectively decouples these signals and consistently outperforms state-of-the-art baselines. Our code is available at https://github.com/Ten-Mao/DiscRec.
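
The three ideas in the abstract (item-level position indices, attention restricted to tokens of the same item, and gated fusion of two branches) can be illustrated with a small NumPy sketch. This is not the authors' implementation; see the linked repository for that. It assumes every item has a fixed-length semantic ID (`codes_per_item` tokens), uses a single attention head with identity projections, adds the position embeddings only in the collaborative branch, and uses an element-wise sigmoid gate. All function names, shapes, and parameters here are hypothetical choices made for illustration.

```python
import numpy as np

def item_position_ids(num_items, codes_per_item):
    """Index of each token inside its item's semantic ID: 0..codes_per_item-1, repeated per item."""
    return np.tile(np.arange(codes_per_item), num_items)

def local_attention_mask(num_items, codes_per_item):
    """Boolean mask that is True only where query and key tokens belong to the same item."""
    item_ids = np.repeat(np.arange(num_items), codes_per_item)
    return item_ids[:, None] == item_ids[None, :]

def masked_self_attention(x, mask):
    """Single-head self-attention with identity projections (a simplification)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # block attention across items
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def gated_fusion(semantic, collaborative, w_gate, b_gate):
    """Element-wise sigmoid gate computed from both branches (assumed gate form)."""
    gate_in = np.concatenate([semantic, collaborative], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate + b_gate)))
    return gate * semantic + (1.0 - gate) * collaborative

# Toy setup: 3 items, each encoded as 4 semantic-ID tokens of dimension 8.
rng = np.random.default_rng(0)
num_items, codes_per_item, dim = 3, 4, 8
seq_len = num_items * codes_per_item
token_emb = rng.normal(size=(seq_len, dim))        # stand-in for learned token embeddings
pos_table = rng.normal(size=(codes_per_item, dim)) # stand-in for item-level position embeddings

pos_ids = item_position_ids(num_items, codes_per_item)
mask = local_attention_mask(num_items, codes_per_item)

# Semantic branch: original token embeddings, unchanged.
semantic = token_emb
# Collaborative branch: position-aware embeddings mixed only within each item.
collaborative = masked_self_attention(token_emb + pos_table[pos_ids], mask)

w_gate = rng.normal(size=(2 * dim, dim)) * 0.1
b_gate = np.zeros(dim)
fused = gated_fusion(semantic, collaborative, w_gate, b_gate)  # shape (seq_len, dim)
```

In this sketch the mask is block-diagonal, so each token's collaborative representation depends only on tokens of its own item. Cross-item sequential dependencies would be left to whatever sequence model consumes `fused` downstream.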