DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address insufficient exploration of the item space in recommender systems, this paper proposes a deep exploration framework based on autonomous, multi-round collaboration between large language models (LLMs) and traditional recommendation models (TRMs). The LLM infers fine-grained user preferences from interaction history, while the TRM efficiently retrieves candidate items; their closed-loop, iterative interaction balances exploration and exploitation. Key contributions include: (1) the first LLM-TRM autonomous multi-round collaborative paradigm; (2) a hierarchical reward function explicitly designed for the recommendation process; and (3) a two-stage reinforcement learning strategy integrating preference-aware data rollback with performance optimization. Extensive experiments on multiple public benchmarks demonstrate significant improvements over state-of-the-art traditional and LLM-based recommenders, validating both the effectiveness and generalizability of deep item-space exploration.

📝 Abstract
Recently, large language models (LLMs) have been introduced into recommender systems (RSs), either to enhance traditional recommendation models (TRMs) or to serve as recommendation backbones. However, existing LLM-based RSs often fail to exploit the complementary advantages of LLMs (e.g., world knowledge and reasoning) and TRMs (e.g., recommendation-specific knowledge and efficiency) for thorough exploration of the item space. To address this, we propose DeepRec, a novel LLM-based RS that enables autonomous multi-turn interactions between LLMs and TRMs for deep exploration of the item space. In each interaction turn, LLMs reason over user preferences and interact with TRMs to retrieve candidate items. After multi-turn interactions, LLMs rank the retrieved items to generate the final recommendations. We adopt reinforcement learning (RL) based optimization and propose novel designs from three aspects: recommendation model based data rollout, recommendation-oriented hierarchical rewards, and a two-stage RL training strategy. For data rollout, we introduce a preference-aware TRM, with which LLMs interact to construct trajectory data. For rewards, we design a hierarchical reward function that involves both process-level and outcome-level rewards to optimize the interaction process and recommendation performance, respectively. For RL training, we develop a two-stage training strategy, where the first stage aims to guide LLMs to interact with TRMs and the second stage focuses on performance improvement. Experiments on public datasets demonstrate that DeepRec significantly outperforms both traditional and LLM-based baselines, offering a new paradigm for deep exploration in recommender systems.
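The closed-loop interaction described in the abstract can be sketched roughly as follows. All function names and the toy genre-matching logic are illustrative stand-ins, not the paper's actual components: a real system would call an LLM to refine preferences and a trained TRM to score the catalog.

```python
def llm_refine_preference(history, retrieved):
    # Stub: a real LLM would reason over the interaction history and the
    # items retrieved so far to produce a refined preference "query".
    return {
        "liked_genres": {genre for _, genre in history},
        "seen": set(retrieved) | {item_id for item_id, _ in history},
    }

def trm_retrieve(preference, catalog, k=3):
    # Stub: a real TRM would score the full catalog; here we just prefer
    # unseen items whose genre matches the inferred preference.
    unseen = [it for it in catalog if it[0] not in preference["seen"]]
    ranked = sorted(unseen,
                    key=lambda it: it[1] in preference["liked_genres"],
                    reverse=True)
    return [item_id for item_id, _ in ranked[:k]]

def deep_rec_loop(history, catalog, turns=3, k=3):
    # Multi-turn LLM-TRM interaction: refine preferences, retrieve
    # candidates, and pool them across turns.
    retrieved = []
    for _ in range(turns):
        preference = llm_refine_preference(history, retrieved)
        candidates = trm_retrieve(preference, catalog, k)
        if not candidates:
            break
        retrieved.extend(candidates)
    # Final step: the LLM would rank the pooled candidates (identity here).
    return retrieved

history = [("m1", "sci-fi"), ("m2", "sci-fi")]
catalog = [("m1", "sci-fi"), ("m3", "sci-fi"), ("m4", "drama"),
           ("m5", "sci-fi"), ("m6", "comedy"), ("m7", "sci-fi")]
recommendations = deep_rec_loop(history, catalog)
```

Note how later turns retrieve items the first turn missed, which is the "deep exploration" the paper aims for: each turn widens the explored region of the item space instead of stopping at one retrieval pass.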
Problem

Research questions and friction points this paper is trying to address.

Combining LLMs and TRMs for deeper item space exploration
Enhancing recommendation via multi-turn LLM-TRM interactions
Optimizing interaction process with hierarchical RL rewards
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-turn LLM-TRM interactions for deep item exploration
Reinforcement learning with hierarchical reward design
Two-stage RL training for optimized recommendation performance
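A minimal sketch of how a hierarchical reward of this kind could combine per-turn process signals with a final outcome metric. The recall-based outcome term and the 0.3/0.7 weighting are assumptions for illustration, not the paper's actual reward design.

```python
def process_reward(turn_retrievals, ground_truth):
    # Process-level term: reward each interaction turn that surfaces at
    # least one relevant item, encouraging productive LLM-TRM turns.
    return sum(1.0 for turn in turn_retrievals if set(turn) & ground_truth)

def outcome_reward(final_ranking, ground_truth, k=5):
    # Outcome-level term: recall@k on the final recommendation list.
    top_k = set(final_ranking[:k])
    return len(top_k & ground_truth) / len(ground_truth)

def hierarchical_reward(turn_retrievals, final_ranking, ground_truth,
                        process_weight=0.3, outcome_weight=0.7):
    # Weighted sum of process- and outcome-level rewards (weights assumed).
    return (process_weight * process_reward(turn_retrievals, ground_truth)
            + outcome_weight * outcome_reward(final_ranking, ground_truth))
```

Splitting the reward this way matches the paper's two-stage training idea: a first stage can lean on the process-level term to teach the LLM to interact with the TRM at all, while a second stage emphasizes the outcome-level term to push recommendation quality.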
Bowen Zheng
Renmin University of China, Beijing, China
Xiaolei Wang
Renmin University of China, Beijing, China
Enze Liu
Renmin University of China
Recommender Systems, Large Language Models
Xi Wang
Beijing Institute of Technology, Beijing, China
Hongyu Lu
WeChat, Tencent, Guangzhou, China
Yu Chen
WeChat, Tencent, Beijing, China
Wayne Xin Zhao
Professor, Renmin University of China
Recommender System, Natural Language Processing, Large Language Model
Ji-Rong Wen
Gaoling School of Artificial Intelligence, Renmin University of China
Large Language Model, Web Search, Information Retrieval, Machine Learning