The Layout Is the Model: On Action-Item Coupling in Generative Recommendation

📅 2025-10-19
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses the challenge of coupled item-action modeling in generative recommendation (GR). We identify three essential principles for token layout design: (i) maximizing discriminative signals for items and actions, (ii) preserving the conditional dependency structure wherein actions are conditioned on preceding items, and (iii) preventing information leakage. To satisfy these principles, we propose Lagged Action Conditioning (LAC), a novel non-interleaved layout that explicitly models the lagged dependency of actions on prior items, thereby relaxing the implicit assumption of strict sequential consistency inherent in conventional interleaved layouts. LAC retains the autoregressive paradigm while significantly reducing computational cost (up to 42% fewer FLOPs). Empirically, it matches or surpasses interleaved baselines across multiple benchmarks, achieving 1.8–3.2% absolute gains in Recall@10. To our knowledge, LAC is the first approach enabling efficient, leakage-free, and conditionally sound coupled sequence modeling for GR.

📝 Abstract
Generative Recommendation (GR) models treat a user's interaction history as a sequence to be autoregressively predicted. When both items and actions (e.g., watch time, purchase, comment) are modeled, the layout, meaning the ordering and visibility of item/action tokens, critically determines what information the model can use and how it generalizes. We present a unified study of token layouts for GR grounded in first principles: (P1) maximize item/action signal in both the input and output space, (P2) preserve the conditioning relationship "action given item", and (P3) prevent information leakage. While the interleaved layout (where item and action occupy separate tokens) naturally satisfies these principles, it also inflates sequence length and therefore training and inference cost. On the non-interleaved front, we design a novel and effective approach, Lagged Action Conditioning (LAC), which appears strange on the surface but aligns well with the design principles and yields strong accuracy. Comprehensive experiments on public datasets and large-scale production logs evaluate different layout options and empirically verify the design principles. Our proposed non-interleaved method, LAC, achieves competitive or superior quality at substantially lower FLOPs than interleaving. Our findings offer actionable guidance for assembling GR systems that are both accurate and efficient.
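
To make the sequence-length trade-off concrete, the following minimal Python sketch builds an interleaved layout and one possible non-interleaved layout from the same toy history. The function names, the toy history, and in particular the `lagged_non_interleaved` construction (fusing each item with the previous step's action) are illustrative assumptions, not the paper's definition of LAC, which the abstract does not spell out.

```python
from typing import List, Tuple

Interaction = Tuple[str, str]  # (item_id, action), hypothetical toy format
Token = Tuple[str, str]        # (token_kind, token_value)


def interleaved(history: List[Interaction]) -> List[Token]:
    """Item and action occupy separate tokens: 2n tokens for n interactions."""
    tokens: List[Token] = []
    for item, action in history:
        tokens.append(("item", item))
        tokens.append(("action", action))
    return tokens


def lagged_non_interleaved(history: List[Interaction]) -> List[Token]:
    """ASSUMPTION: one token per interaction, fusing item t with action t-1.

    The action for item t is never part of the token at position t, so a
    model reading position t cannot see the action it is asked to predict.
    """
    tokens: List[Token] = []
    prev_action = "<none>"  # placeholder for the first step (assumption)
    for item, action in history:
        tokens.append(("item+prev_action", f"{item}|{prev_action}"))
        prev_action = action
    return tokens


history = [("i1", "click"), ("i2", "purchase"), ("i3", "skip")]
inter = interleaved(history)
fused = lagged_non_interleaved(history)
print(len(inter), len(fused))  # prints "6 3"
```

On this toy history the interleaved layout produces 6 tokens and the fused layout 3, which is the kind of length gap the abstract attributes to interleaving. How LAC actually arranges and masks its tokens should be taken from the paper itself.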
Problem

Research questions and friction points this paper is trying to address.

Optimizing token layout design for generative recommendation models
Balancing model accuracy with computational efficiency in sequences
Preventing information leakage while preserving item-action relationships
Innovation

Methods, ideas, or system contributions that make the work stand out.

LAC conditions actions on prior items with a lag
Non-interleaved layout reduces sequence length significantly
Design principles maximize signal while preventing leakage
Xiaokai Wei
Roblox, San Mateo, California, USA
Jiajun Wu
Roblox, San Mateo, California, USA
Daiyao Yi
Roblox, San Mateo, California, USA
Reza Shirkavand
University of Maryland
Efficient Deep Learning
Michelle Gong
Roblox, San Mateo, California, USA