🤖 AI Summary
In generative recommendation, jointly modeling the discrete positional indices and continuous wall-clock timestamps of user behaviors remains challenging. This paper proposes Time-and-Order RoPE (TO-RoPE), a family of Rotary Position Embedding (RoPE) designs that use both timestamps and positional indices as sources of rotation angles. It presents three variants (early fusion, split-by-dim, and split-by-head) that shape the query-key geometry directly with temporal and sequential information. TO-RoPE integrates into autoregressive Transformers without introducing extra parameters or post-processing. Experiments on public datasets and a proprietary industrial dataset show that the TO-RoPE variants consistently improve accuracy over existing schemes for encoding time and index, while remaining deployment-friendly.
📝 Abstract
Generative recommenders, typically transformer-based autoregressive models, predict the next item or action from a user's interaction history. Their effectiveness depends on how the model represents where an interaction event occurs in the sequence (discrete index) and when it occurred in wall-clock time. Prevailing approaches inject time via learned embeddings or relative attention biases. In this paper, we argue that RoPE-based approaches, if designed properly, can be a stronger alternative for jointly modeling temporal and sequential information in user behavior sequences. While vanilla RoPE in LLMs considers only token order, generative recommendation requires incorporating both event time and token index. To address this, we propose Time-and-Order RoPE (TO-RoPE), a family of rotary position embedding designs that treat index and time as angle sources shaping the query-key geometry directly. We present three instantiations: early fusion, split-by-dim, and split-by-head. Extensive experiments on both publicly available datasets and a proprietary industrial dataset show that TO-RoPE variants consistently improve accuracy over existing methods for encoding time and index. These results position rotary embeddings as a simple, principled, and deployment-friendly foundation for generative recommendation.
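To make the three instantiations concrete, the following NumPy sketch shows one plausible way index and time could each serve as an angle source. The abstract gives no formulas, so everything here is an assumption made for illustration: the function names, the standard RoPE frequency schedule, the additive form of early fusion, the even split of planes and heads, and the `time_scale` parameter are all hypothetical and may differ from the paper's actual design.

```python
import numpy as np

def inv_freq(n_planes, base=10000.0):
    # Standard RoPE geometric schedule: one frequency per 2-D rotation plane.
    return base ** (-np.arange(n_planes) / n_planes)

def rotate(x, angles):
    # x: (heads, seq, d_head) with d_head even; angles: (heads, seq, d_head // 2).
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def angles_early_fusion(idx, t, n_heads, d_head, time_scale=1.0):
    # Assumed form: index and time angles are summed on every plane of every head.
    f = inv_freq(d_head // 2)
    a = idx[:, None] * f + time_scale * t[:, None] * f
    return np.broadcast_to(a, (n_heads,) + a.shape)

def angles_split_by_dim(idx, t, n_heads, d_head, time_scale=1.0):
    # Assumed form: the first half of the planes rotate by index, the rest by time.
    n = d_head // 2
    n_idx = n // 2
    a = np.concatenate(
        [idx[:, None] * inv_freq(n_idx),
         time_scale * t[:, None] * inv_freq(n - n_idx)],
        axis=-1,
    )
    return np.broadcast_to(a, (n_heads,) + a.shape)

def angles_split_by_head(idx, t, n_heads, d_head, time_scale=1.0):
    # Assumed form: the first half of the heads rotate by index, the rest by time.
    f = inv_freq(d_head // 2)
    a_idx = idx[:, None] * f
    a_time = time_scale * t[:, None] * f
    n_idx_heads = n_heads // 2
    return np.stack([a_idx if h < n_idx_heads else a_time for h in range(n_heads)])

# Example: 4 heads, 5 events, head dimension 8; timestamps in arbitrary units.
rng = np.random.default_rng(0)
H, S, D = 4, 5, 8
q = rng.normal(size=(H, S, D))
k = rng.normal(size=(H, S, D))
idx = np.arange(S)
t = np.array([0.0, 0.5, 3.0, 27.0, 27.25])

angles = angles_split_by_dim(idx, t, H, D)
logits = rotate(q, angles) @ np.swapaxes(rotate(k, angles), -1, -2)  # (H, S, S)
```

Because every angle in this sketch is linear in the index and the timestamp, the attention logits depend only on index differences and time gaps between events, which is the relative-position property that RoPE provides for token order alone.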