Item-Language Model for Conversational Recommendation

📅 2024-06-05
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
To address three key challenges in applying large language models (LLMs) directly to recommendation — recommendation data that is scarce, largely non-textual, and often not publicly available; a mismatch between user-interaction signals and the linguistic modality; and the risk of catastrophic forgetting of the LLM's general capabilities — this paper proposes the Item-Language Model (ILM). ILM introduces a language-aligned item encoder that maps user-interaction signals into semantic representations an LLM can consume, thereby decoupling interaction modeling from language understanding. Training is parameter-efficient, combining a frozen LLM backbone with contrastive learning and instruction tuning, so no LLM parameters are updated and the pretrained knowledge is preserved. On conversational recommendation tasks, ILM outperforms both end-to-end fine-tuned LLMs and conventional recommendation models. Empirical results validate the dual necessity of language-aligned representation learning and of injecting recommendation-specific interaction knowledge into the item encoder.
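The summary's "language-aligned item encoder" is trained to pull each item's interaction-based embedding toward the text embedding of that item's description. A standard way to do this is a symmetric InfoNCE contrastive loss; the sketch below is illustrative only (the function name, temperature value, and numpy implementation are assumptions, not the paper's code):

```python
import numpy as np

def info_nce_loss(item_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning item and text embeddings.

    item_emb, text_emb: (batch, dim) arrays; row i of each is a positive pair,
    all other rows in the batch serve as in-batch negatives.
    """
    # L2-normalise so the dot product is cosine similarity.
    item_emb = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = item_emb @ text_emb.T / temperature      # (batch, batch)
    labels = np.arange(len(logits))                   # positives on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the item-to-text and text-to-item directions.
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Minimising this loss drives each item representation toward its own textual description and away from the others in the batch, which is what makes the resulting item embeddings "comprehensible" to a text-trained LLM.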

📝 Abstract
Large-language Models (LLMs) have been extremely successful at tasks like complex dialogue understanding, reasoning and coding due to their emergent abilities. These emergent abilities have been extended with multi-modality to include image, audio, and video capabilities. Recommender systems, on the other hand, have been critical for information seeking and item discovery needs. Recently, there have been attempts to apply LLMs for recommendations. One difficulty of current attempts is that the underlying LLM is usually not trained on the recommender system data, which largely contains user interaction signals and is often not publicly available. Another difficulty is user interaction signals often have a different pattern from natural language text, and it is currently unclear if the LLM training setup can learn more non-trivial knowledge from interaction signals compared with traditional recommender system methods. Finally, it is difficult to train multiple LLMs for different use-cases, and to retain the original language and reasoning abilities when learning from recommender system data. To address these three limitations, we propose an Item-Language Model (ILM), which is composed of an item encoder to produce text-aligned item representations that encode user interaction signals, and a frozen LLM that can understand those item representations with preserved pretrained knowledge. We conduct extensive experiments which demonstrate both the importance of the language-alignment and of user interaction knowledge in the item encoder.
Problem

Research questions and friction points this paper is trying to address.

LLMs lack training on private recommender system data
User interaction signals differ from natural language patterns
Challenges in training multiple LLMs while preserving original abilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Item-Language Model combines item encoder and frozen LLM
Text-aligned item representations encode user interaction signals
Preserves LLM's pretrained knowledge while learning from interactions
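The innovation bullets above can be sketched concretely: a trainable item encoder projects interaction-signal embeddings into the frozen LLM's token-embedding space, and the resulting "soft tokens" are interleaved with ordinary text tokens. All names and dimensions below are hypothetical, and the single linear projection stands in for the paper's learned encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: collaborative-filtering item embeddings (64-d)
# are projected into the frozen LLM's token-embedding space (256-d).
CF_DIM, LLM_DIM, VOCAB = 64, 256, 1000

# Frozen LLM piece: the token-embedding table stays fixed during ILM training.
llm_token_table = rng.normal(size=(VOCAB, LLM_DIM))
llm_token_table.setflags(write=False)  # "frozen" backbone

# Trainable item encoder: a single linear projection for illustration only.
W_item = rng.normal(size=(CF_DIM, LLM_DIM)) * 0.02

def encode_items(item_cf_embs):
    """Map interaction-signal embeddings into the LLM's embedding space."""
    return item_cf_embs @ W_item

def build_prompt(text_token_ids, item_cf_embs):
    """Interleave item 'soft tokens' with ordinary text-token embeddings."""
    text_part = llm_token_table[text_token_ids]       # (T, LLM_DIM)
    item_part = encode_items(item_cf_embs)            # (K, LLM_DIM)
    return np.concatenate([item_part, text_part], axis=0)

# Example: 3 history items followed by a 5-token question.
prompt = build_prompt(np.array([1, 5, 9, 2, 7]), rng.normal(size=(3, CF_DIM)))
print(prompt.shape)  # (8, 256)
```

Because gradients only ever flow into `W_item`, the LLM's pretrained language and reasoning abilities are untouched, which is the mechanism behind the third bullet.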
👥 Authors
Li Yang (Google Research), Anushya Subbiah (Google Research), Hardik Patel (Google), Judith Yue Li (Google Research), Yanwei Song (Google), Reza Mirghaderi (Google), Vikram Aggarwal (Google Research)