MARC: Multimodal and Multi-Task Agentic Retrieval-Augmented Generation for Cold-Start Recommender System

Date: 2025-11-11

AI Summary
To address data sparsity, modality heterogeneity, and insufficient knowledge utilization in cold-start recommendation for the food and beverage domain, this paper proposes MARC, a multimodal and multi-task cocktail recommendation framework. MARC combines graph-based Retrieval-Augmented Generation (RAG) with a Large Language Model (LLM) agent architecture. It builds a structured cocktail knowledge graph in a graph database, uses a task-recognition router to dispatch each query to the appropriate multimodal understanding and reasoning subtask, and adds a reflection step intended to improve the contextual consistency and interpretability of the generated recommendations. In an evaluation on 200 manually crafted questions, answers generated with the graph database were rated higher in quality than those from a simple vector database, under both LLM-as-a-judge and human evaluation. The results suggest better knowledge generalization and more robust reasoning under cold-start conditions.

πŸ“ Abstract
Recommender systems (RS) are currently being studied to mitigate limitations during cold-start conditions by leveraging modality information or introducing Agent concepts based on the exceptional reasoning capabilities of Large Language Models (LLMs). Meanwhile, food and beverage recommender systems have traditionally used knowledge graph and ontology concepts due to the domain's unique data attributes and relationship characteristics. On this background, we propose MARC, a multimodal and multi-task cocktail recommender system based on Agentic Retrieval-Augmented Generation (RAG) utilizing graph database under cold-start conditions. The proposed system generates high-quality, contextually appropriate answers through two core processes: a task recognition router and a reflection process. The graph database was constructed by processing cocktail data from Kaggle, and its effectiveness was evaluated using 200 manually crafted questions. The evaluation used both LLM-as-a-judge and human evaluation to demonstrate that answers generated via the graph database outperformed those from a simple vector database in terms of quality. The code is available at https://github.com/diddbwls/cocktail_rec_agentrag
Problem

Research questions and friction points this paper is trying to address.

Addresses cold-start limitations in recommender systems
Leverages multimodal data and agentic RAG architecture
Improves recommendation quality using graph databases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multimodal agentic RAG with graph database
Implements task recognition router and reflection process
Leverages graph database to enhance answer quality
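
The router and reflection steps named above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the in-memory `GRAPH` dictionary stands in for a graph database, `route_task` is a keyword rule standing in for an LLM-based router, and `reflect` is a simple count check standing in for an LLM-based reflection step. All function names and data are hypothetical.

```python
from __future__ import annotations

# Toy stand-in for a graph database: cocktail node -> ingredient nodes.
# Hypothetical data, not taken from the paper's Kaggle dataset.
GRAPH = {
    "Mojito": {"rum", "mint", "lime", "sugar", "soda"},
    "Daiquiri": {"rum", "lime", "sugar"},
    "Margarita": {"tequila", "lime", "triple sec"},
}


def route_task(query: str) -> str:
    """Hypothetical task router: choose a retrieval task from the query text."""
    if "similar to" in query.lower():
        return "similar_cocktail"
    return "by_ingredient"


def retrieve(task: str, query: str, k: int) -> list[str]:
    """Rank cocktails by the number of ingredient nodes they share with the query."""
    q = query.lower()
    if task == "similar_cocktail":
        # Use the cocktail named in the query as the seed node.
        seed = next((name for name in GRAPH if name.lower() in q), None)
        if seed is None:
            return []
        wanted = GRAPH[seed]
        candidates = [name for name in GRAPH if name != seed]
    else:
        # Collect every known ingredient mentioned in the query (naive substring match).
        wanted = {i for ings in GRAPH.values() for i in ings if i in q}
        candidates = list(GRAPH)
    scored = [(len(GRAPH[name] & wanted), name) for name in candidates]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:k]]


def reflect(candidates: list[str], min_results: int) -> bool:
    """Stand-in for reflection: accept only if enough candidates were retrieved."""
    return len(candidates) >= min_results


def recommend(query: str, max_rounds: int = 3) -> tuple[str, list[str]]:
    """Route the query, retrieve, and widen retrieval until reflection accepts."""
    task = route_task(query)
    k = 1
    candidates: list[str] = []
    for _ in range(max_rounds):
        candidates = retrieve(task, query, k)
        if reflect(candidates, min_results=2):
            break
        k += 1  # reflection rejected the result: retrieve more candidates and retry
    return task, candidates


if __name__ == "__main__":
    print(recommend("Something similar to Mojito"))
    print(recommend("I want a drink with rum and lime"))
```

The sketch only shows the control flow: the router selects a task, retrieval follows shared-ingredient edges, and a rejected result triggers another retrieval round with a larger `k`. In a real system the router and the reflection check would presumably be LLM calls, and retrieval would be a query against the graph database.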

Authors

Seung-Hwan Cho
Department of Industrial Data Engineering, Hanyang University, Republic of Korea
Yujin Yang
Korea Astronomy and Space Science Institute
Astronomy & Astrophysics
Danik Baeck
Department of Industrial Data Engineering, Hanyang University, Republic of Korea
Minjoo Kim
Department of Industrial Data Engineering, Hanyang University, Republic of Korea
Young-Min Kim
Associate Professor, Hanyang University
Machine Learning, Information Extraction, Probabilistic Models, Natural Language Processing
Heejung Lee
Department of Industrial Data Engineering, Hanyang University, Republic of Korea; School of Interdisciplinary Industrial Studies, Hanyang University, Republic of Korea
Sangjin Park
Department of Industrial Data Engineering, Hanyang University, Republic of Korea; School of Interdisciplinary Industrial Studies, Hanyang University, Republic of Korea