MARC: Multimodal and Multi-Task Agentic Retrieval-Augmented Generation for Cold-Start Recommender System

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address data sparsity, modality heterogeneity, and insufficient knowledge utilization in cold-start recommendation for food and beverage domains, this paper proposes MARC, a multimodal and multi-task cocktail recommender system built on agentic Retrieval-Augmented Generation (RAG). MARC combines graph-based retrieval with a Large Language Model (LLM) agent architecture: it stores processed cocktail data in a graph database, uses a task recognition router to dispatch each query to the appropriate multimodal understanding or reasoning subtask, and applies a reflection process to improve the contextual consistency of generated recommendations. On 200 manually crafted questions, answers generated with the graph database were rated higher in quality than those from a simple vector database, under both LLM-as-a-judge and human evaluation.

📝 Abstract

Recommender systems (RS) are currently being studied to mitigate limitations during cold-start conditions by leveraging modality information or introducing agent concepts based on the exceptional reasoning capabilities of Large Language Models (LLMs). Meanwhile, food and beverage recommender systems have traditionally used knowledge graph and ontology concepts due to the domain's unique data attributes and relationship characteristics. Against this background, we propose MARC, a multimodal and multi-task cocktail recommender system based on Agentic Retrieval-Augmented Generation (RAG) that utilizes a graph database under cold-start conditions. The proposed system generates high-quality, contextually appropriate answers through two core processes: a task recognition router and a reflection process. The graph database was constructed by processing cocktail data from Kaggle, and its effectiveness was evaluated using 200 manually crafted questions. The evaluation used both LLM-as-a-judge and human evaluation to demonstrate that answers generated via the graph database outperformed those from a simple vector database in terms of quality. The code is available at https://github.com/diddbwls/cocktail_rec_agentrag
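The two core processes named in the abstract, a task recognition router and a reflection process, can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the task labels, the keyword rules standing in for an LLM-based router, the `retrieve`/`generate`/`critique` callables, the score threshold, and the "widen retrieval on a weak draft" policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


def route(query: str) -> str:
    """Toy stand-in for a task recognition router.

    Assumption: keyword rules replace the LLM call, and the task labels
    are invented for illustration; the source does not enumerate them.
    """
    q = query.lower()
    if "similar" in q:
        return "similar_cocktail"
    if "contain" in q or "ingredient" in q:
        return "ingredient_lookup"
    return "general"


@dataclass
class Draft:
    answer: str
    score: float  # critique score in [0, 1]; higher is better


def answer_with_reflection(
    query: str,
    retrieve: Callable[[str, str, int], List[str]],
    generate: Callable[[str, List[str]], str],
    critique: Callable[[str, List[str], str], float],
    threshold: float = 0.8,  # hypothetical acceptance threshold
    max_rounds: int = 3,     # hypothetical retry budget
) -> Draft:
    """Route the query, then retrieve, generate, and critique in a loop.

    If the critique score is below the threshold, retrieval is widened
    and the answer is regenerated. The best-scoring draft is returned.
    """
    task = route(query)
    top_k = 3
    best = None
    for _ in range(max_rounds):
        context = retrieve(task, query, top_k)
        answer = generate(query, context)
        score = critique(query, context, answer)
        if best is None or score > best.score:
            best = Draft(answer, score)
        if score >= threshold:
            break
        top_k += 2  # reflection step: ask for more context next round
    return best
```

In a real system, `retrieve` would query the graph database and `generate` and `critique` would be LLM calls; here they are plain callables so that the control flow can be exercised with simple stubs.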

Problem

Research questions and friction points this paper is trying to address.

Addresses cold-start limitations in recommender systems

Leverages multimodal data and agentic RAG architecture

Improves recommendation quality using graph databases

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multimodal agentic RAG with a graph database

Implements a task recognition router and a reflection process

Leverages the graph database to enhance answer quality
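To make the graph-based retrieval idea concrete, the sketch below shows one way an item with no interaction history could still be linked to known cocktails through shared ingredient nodes. It is a hedged illustration only: the in-memory dictionaries stand in for a real graph database, the cocktail-to-ingredient schema is an assumption, and the recipes are toy data rather than the Kaggle dataset used in the paper.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Toy data for illustration only; not taken from the paper's dataset.
RECIPES: Dict[str, Set[str]] = {
    "Mojito": {"white rum", "lime", "mint", "sugar", "soda water"},
    "Daiquiri": {"white rum", "lime", "sugar"},
    "Margarita": {"tequila", "lime", "triple sec"},
    "Old Fashioned": {"bourbon", "sugar", "bitters"},
}


def build_ingredient_index(recipes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert cocktail -> ingredients into ingredient -> cocktails.

    Together the two mappings form a bipartite graph (hypothetical schema).
    """
    index: Dict[str, Set[str]] = {}
    for cocktail, ingredients in recipes.items():
        for ingredient in ingredients:
            index.setdefault(ingredient, set()).add(cocktail)
    return index


def neighbors_by_ingredients(
    index: Dict[str, Set[str]], ingredients: Set[str], k: int = 3
) -> List[Tuple[str, int]]:
    """Rank known cocktails by how many ingredients they share with the input.

    Walks ingredient -> cocktail edges, so it works for a brand-new item
    that has attributes but no user interactions.
    """
    shared: Counter = Counter()
    for ingredient in ingredients:
        for cocktail in index.get(ingredient, ()):  # unknown ingredients are skipped
            shared[cocktail] += 1
    # Sort by shared count (descending), then by name for a stable order.
    return sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


index = build_ingredient_index(RECIPES)
# A hypothetical new cocktail that no user has rated yet.
print(neighbors_by_ingredients(index, {"gin", "lime", "sugar"}))
# → [('Daiquiri', 2), ('Mojito', 2), ('Margarita', 1)]
```

A flat vector search would compare the new item's embedding against every stored text chunk, while this traversal follows explicit relationships, which is the kind of structure the summary credits for the graph database's advantage.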