🤖 AI Summary
Addressing the dual challenges of dynamic user preference modeling and efficient long-sequence inference in sequential recommendation, this paper systematically investigates State Space Models (SSMs), reference-free preference optimization (ORPO), and adaptive training strategies. We apply SSMs, specifically Mamba-based architectures, to sequential recommendation, demonstrating a 40% reduction in inference memory and a 35% decrease in latency compared to Transformer-based baselines. We propose a single-stage, LLM-driven ORPO framework that optimizes recommendation relevance without requiring a fixed reference model, yielding a 2.1% improvement in NDCG@10. Additionally, we design an adaptive batch-size and learning-rate scheduling algorithm that reduces total training time by 28%. Collectively, these three contributions yield a sequential recommendation approach that combines high accuracy, low computational overhead, and rapid convergence.
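The reference-free property of ORPO mentioned above can be sketched with the odds-ratio loss from the original ORPO formulation (Hong et al., 2024). This is a generic single-pair illustration, not the paper's exact implementation; the argument names and the `lam` weight are assumptions for the sketch.

```python
import math

def orpo_loss(logp_chosen, logp_rejected, lam=0.1):
    """Reference-free ORPO loss on one preference pair.

    logp_chosen / logp_rejected: average token log-probabilities of the
    preferred and dispreferred outputs under the CURRENT policy only --
    no frozen reference model is needed, unlike DPO.
    lam: weight of the odds-ratio term (illustrative value).
    """
    def log_odds(logp):
        # log( p / (1 - p) ) computed from log p; valid for logp < 0.
        return logp - math.log(1.0 - math.exp(logp))

    # Supervised NLL on the preferred output ...
    nll = -logp_chosen
    # ... plus a penalty when the rejected output's odds approach the chosen one's.
    margin = log_odds(logp_chosen) - log_odds(logp_rejected)
    odds_ratio_term = -math.log(1.0 / (1.0 + math.exp(-margin)))
    return nll + lam * odds_ratio_term
```

Because both terms depend only on the policy being trained, the whole objective fits in a single training stage, which is what makes the framework "monolithic".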
📝 Abstract
Recommender systems aim to model dynamically changing user preferences and the sequential dependencies between historical user behaviour and item metadata. Although Transformer-based models have proven effective in sequential recommendation, their state grows in proportion to the length of the processed sequence, which makes them expensive in terms of memory and inference cost. Our research focuses on three promising directions in sequential recommendation: enhancing speed with State Space Models (SSMs), which can achieve SOTA results in the sequential recommendation domain with lower latency, memory, and inference costs, as proposed in arXiv:2403.03900; improving recommendation quality with Large Language Models (LLMs) via Monolithic Preference Optimization without Reference Model (ORPO); and implementing adaptive batch- and step-size algorithms to reduce costs and accelerate training.
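The memory contrast with Transformers can be made concrete with a minimal sketch of a linear SSM recurrence. This is a generic time-invariant SSM, not Mamba's selective variant (where A, B, C are input-dependent); all sizes and names here are illustrative.

```python
import numpy as np

def ssm_step(h, x, A, B, C):
    """One step of a linear state space model:
        h_t = A h_{t-1} + B x_t,   y_t = C h_t
    The hidden state h has a FIXED size d_state, so per-step memory and
    compute stay constant no matter how long the interaction history is,
    unlike a Transformer, whose key/value cache grows with every item.
    """
    h = A @ h + B @ x
    y = C @ h
    return h, y

# Toy run over a long sequence of item embeddings (illustrative sizes).
d_model, d_state = 4, 8
rng = np.random.default_rng(0)
A = 0.9 * np.eye(d_state)            # stable, decaying dynamics
B = 0.1 * rng.normal(size=(d_state, d_model))
C = rng.normal(size=(d_model, d_state))

h = np.zeros(d_state)
for t in range(1000):                # 1000-item user history ...
    x = rng.normal(size=d_model)     # item embedding at step t
    h, y = ssm_step(h, x, A, B, C)
# ... yet the recurrent state is still just a (d_state,) vector.
```

A Transformer processing the same 1000-item history would keep keys and values for all 1000 positions at inference time; the SSM carries only `h`, which is the source of the latency and memory savings claimed above.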