🤖 AI Summary
This work addresses the challenges of performance, scalability, and accessibility in the deep integration of large language models (LLMs) with multimodal recommender systems (MRS). To this end, we propose the first structured taxonomy designed specifically for LLM-MRS fusion, establishing three paradigms: semantic reasoning, in-context learning, and dynamic input handling. Methodologically, we unify key techniques, including prompt engineering, parameter-efficient fine-tuning (e.g., LoRA), multimodal alignment modelling, and cross-domain knowledge transfer. Furthermore, we categorise 12 mainstream approaches, 7 benchmark datasets, and 5 core evaluation metrics, and introduce a standardised evaluation framework. Our contributions provide a clear technical roadmap, reusable methodological principles, and a systematic research guide for advancing LLM-powered multimodal recommendation.
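To make the parameter-efficient fine-tuning technique mentioned above concrete, the following is a minimal NumPy sketch of the core LoRA idea: the frozen base weight `W` is augmented by a low-rank update `(alpha / r) * B @ A`, so only the small matrices `A` and `B` are trained. All names and shapes here are illustrative assumptions, not from the survey itself.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Linear layer with a LoRA adapter.

    x : (n, d_in) input batch
    W : (d_out, d_in) frozen base weight
    A : (r, d_in) trainable down-projection (r << d_in)
    B : (d_out, r) trainable up-projection, zero-initialised
    Output: x W^T + (alpha / r) * x A^T B^T
    """
    r = A.shape[0]  # adapter rank
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))        # frozen pretrained weight
A = rng.normal(size=(4, 16))        # rank-4 adapter, randomly initialised
B = np.zeros((8, 4))                # zero-init: adapter starts as identity update
x = rng.normal(size=(2, 16))

y = lora_forward(x, W, A, B)
# With B = 0 the adapted layer reproduces the base layer exactly,
# which is why LoRA training starts from the pretrained behaviour.
```

In practice this is handled by libraries such as Hugging Face PEFT; the sketch only shows why the adapter adds `r * (d_in + d_out)` trainable parameters instead of `d_in * d_out`.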
📝 Abstract
Multimodal recommender systems (MRS) integrate heterogeneous user and item data, such as text, images, and structured information, to enhance recommendation performance. The emergence of large language models (LLMs) introduces new opportunities for MRS by enabling semantic reasoning, in-context learning, and dynamic input handling. Compared to earlier pre-trained language models (PLMs), LLMs offer greater flexibility and generalisation capabilities but also introduce challenges related to scalability and model accessibility. This survey presents a comprehensive review of recent work at the intersection of LLMs and MRS, focusing on prompting strategies, fine-tuning methods, and data adaptation techniques. We propose a novel taxonomy to characterise integration patterns, identify transferable techniques from related recommendation domains, provide an overview of evaluation metrics and datasets, and point to possible future directions. We aim to clarify the emerging role of LLMs in multimodal recommendation and support future research in this rapidly evolving field.
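As a concrete illustration of the prompting strategies the survey reviews, the sketch below assembles a zero-shot ranking prompt from a user's interaction history and a candidate list. The template and function name are hypothetical, for illustration only; surveyed methods differ in exactly how history, item metadata, and instructions are serialised.

```python
def build_rec_prompt(history, candidates):
    """Build a zero-shot ranking prompt for an LLM-based recommender.

    history    : list of item titles the user has interacted with
    candidates : list of item titles to be ranked
    Returns a single prompt string asking the LLM to rank candidates.
    (Illustrative template; not a specific method from the survey.)
    """
    lines = ["You are a recommender system. A user interacted with these items:"]
    lines += [f"- {title}" for title in history]
    lines.append("Rank the following candidate items from most to least relevant:")
    lines += [f"{i + 1}. {title}" for i, title in enumerate(candidates)]
    lines.append("Answer with the candidate numbers in order, most relevant first.")
    return "\n".join(lines)

prompt = build_rec_prompt(
    ["Wireless earbuds", "Phone stand"],
    ["USB-C charger", "Yoga mat", "Bluetooth speaker"],
)
# `prompt` would then be sent to an LLM; the reply is parsed into a ranking.
```

Multimodal variants extend this pattern by serialising image captions or visual-encoder outputs into the same prompt, which is one of the integration patterns the taxonomy characterises.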