🤖 AI Summary
To address severe data sparsity and cold-start challenges in long-tail item recommendation within e-commerce scenarios, this paper proposes an LLM-driven framework integrating semantic understanding and user intent modeling. We introduce a Semantic Visor to derive fine-grained semantic representations from item textual descriptions and design an attention-driven user intent encoder to capture long-tail interest patterns. Furthermore, we propose a tri-path fusion ranking mechanism—combining semantic, collaborative, and generative signals—that dynamically weights and integrates heterogeneous information sources. Evaluated on real-world e-commerce datasets, our method achieves significant improvements: +12% in recall, +9% in hit rate, and +15% in user coverage, substantially enhancing long-tail item exposure and conversion. The framework establishes a scalable, semantics-enhanced paradigm for recommender systems operating under extreme sparsity.
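The tri-path fusion described above can be sketched as a weighted combination of per-item scores. This is an illustrative assumption, not the paper's exact formula: the function name, the min-max normalization, and the fixed weights are all hypothetical.

```python
import numpy as np

def tri_path_fusion(semantic, collaborative, generative, weights=(0.4, 0.4, 0.2)):
    """Fuse three heterogeneous relevance signals into one ranking score.

    Each argument is a vector of per-item scores. The fixed weighting and
    min-max normalization are illustrative assumptions; the paper describes
    the weighting as dynamic.
    """
    def minmax(x):
        # Rescale each signal to [0, 1] so the three scales are comparable.
        x = np.asarray(x, dtype=float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    w_s, w_c, w_g = weights
    return w_s * minmax(semantic) + w_c * minmax(collaborative) + w_g * minmax(generative)

# Rank three candidate items by fused score (highest first).
scores = tri_path_fusion([0.9, 0.2, 0.5], [0.1, 0.8, 0.3], [1.0, 0.0, 0.5])
ranking = np.argsort(scores)[::-1]
```

A long-tail item that is weak on the collaborative path can still rank well if its semantic and generative scores are strong, which is the intuition behind fusing the three sources.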
📝 Abstract
As e-commerce platforms expand their product catalogs, accurately recommending long-tail items becomes increasingly important for both user experience and platform revenue. A key obstacle is the long-tail problem, where extreme data sparsity and cold-start issues limit the performance of traditional recommendation methods. To address this, we propose a novel long-tail product recommendation mechanism that integrates product text descriptions and user behavior sequences using a large language model (LLM). First, we introduce a Semantic Visor, which leverages a pre-trained LLM to convert heterogeneous textual content, such as product titles, descriptions, and user reviews, into semantically meaningful item embeddings. We then employ an attention-based user intent encoder that captures users' latent interests, especially toward long-tail items, by modeling collaborative behavior patterns. These components feed into a hybrid ranking model that fuses semantic similarity scores, collaborative filtering outputs, and LLM-generated recommendation candidates. Extensive experiments on a real-world e-commerce dataset show that our method outperforms baseline models in recall (+12%), hit rate (+9%), and user coverage (+15%), leading to greater exposure and higher purchase rates for long-tail products. Our work highlights the potential of LLMs for interpreting product content and user intent, offering a promising direction for future e-commerce recommendation systems.
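The attention-based user intent encoder can be sketched as attention pooling over the user's item-embedding history, conditioned on a candidate item. All names and shapes below are assumptions for illustration; the paper's encoder is learned, whereas this sketch uses fixed dot-product attention.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def intent_vector(history_embs, candidate_emb):
    """Attention-pool a user's item-embedding history into one intent vector.

    history_embs: (T, d) embeddings of recently interacted items.
    candidate_emb: (d,) embedding of the candidate (e.g. long-tail) item.
    Attention weights favor history items semantically close to the candidate,
    so niche interests are not drowned out by popular-item interactions.
    """
    logits = history_embs @ candidate_emb   # (T,) relevance of each past item
    weights = softmax(logits)               # attention distribution over history
    return weights @ history_embs           # (d,) weighted sum = user intent

# Toy usage with random embeddings standing in for LLM-derived ones.
rng = np.random.default_rng(0)
history = rng.normal(size=(5, 8))           # 5 past items, 8-dim embeddings
candidate = rng.normal(size=8)
intent = intent_vector(history, candidate)
score = float(intent @ candidate)           # semantic match score for ranking
```

In the full system this score would be one of the three signals entering the hybrid ranking model, alongside collaborative filtering outputs and LLM-generated candidates.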