Evaluation of LLMs in retrieving food and nutritional context for RAG systems

📅 2026-03-10
🤖 AI Summary
This study aims to lower the technical barrier that food and nutrition experts face when working with complex data systems by using large language models (LLMs) for efficient and accurate data retrieval. Within a retrieval-augmented generation (RAG) system, it evaluates the ability of four LLMs to translate natural language queries into structured metadata filters, which are then used for retrieval from a Chroma vector database. The results show high accuracy with minimal human intervention for simple to moderately complex queries, indicating that LLM-driven retrieval is feasible when constraints can be expressed in the metadata format. Reliability remains a challenge, however, for complex queries involving non-expressible constraints. The work provides an empirical foundation for domain-specific retrieval systems.

📝 Abstract
In this article, we evaluate four Large Language Models (LLMs) and their effectiveness at retrieving data within a specialized Retrieval-Augmented Generation (RAG) system, using a comprehensive food composition database. Our method focuses on the LLMs' ability to translate natural language queries into structured metadata filters, enabling efficient retrieval via a Chroma vector database. By achieving high accuracy in this critical retrieval step, we demonstrate that LLMs can serve as an accessible, high-performance tool that drastically reduces the manual effort and technical expertise previously required for domain experts, such as food compilers and nutritionists, to leverage complex food and nutrition data. However, despite the high performance on easy and moderately complex queries, our analysis of difficult questions reveals that reliable retrieval remains challenging when queries involve non-expressible constraints. These findings demonstrate that LLM-driven metadata filtering excels when constraints can be explicitly expressed, but struggles when queries exceed the representational scope of the metadata format.
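To make the retrieval step concrete, the following is a minimal sketch of how an LLM's reply might be parsed into a Chroma-style `where` filter and checked before use. It is not the authors' implementation: the field names (`food_group`, `protein_g`, `energy_kcal`), the sample LLM reply, and the toy records are all hypothetical, and the filter is applied to an in-memory list rather than a real vector database. Only the operator names (`$eq`, `$gt`, `$and`, and so on) follow Chroma's metadata filter syntax.

```python
import json

# Comparison operators named after Chroma's `where` filter syntax.
COMPARATORS = {
    "$eq": lambda v, t: v == t,
    "$ne": lambda v, t: v != t,
    "$gt": lambda v, t: v > t,
    "$gte": lambda v, t: v >= t,
    "$lt": lambda v, t: v < t,
    "$lte": lambda v, t: v <= t,
    "$in": lambda v, t: v in t,
    "$nin": lambda v, t: v not in t,
}

# Hypothetical metadata schema (assumption; the real schema is not given here).
ALLOWED_FIELDS = {"food_group", "protein_g", "energy_kcal"}


def validate(where):
    """Raise ValueError if the filter uses an unknown field or operator."""
    if not isinstance(where, dict) or not where:
        raise ValueError("filter must be a non-empty JSON object")
    for key, cond in where.items():
        if key in ("$and", "$or"):
            if not isinstance(cond, list) or len(cond) < 2:
                raise ValueError(f"{key} needs a list of at least two clauses")
            for clause in cond:
                validate(clause)
        elif key in ALLOWED_FIELDS:
            if isinstance(cond, dict):
                unknown = set(cond) - set(COMPARATORS)
                if unknown:
                    raise ValueError(f"unknown operators: {sorted(unknown)}")
        else:
            raise ValueError(f"constraint not expressible in schema: {key}")


def parse_filter(llm_output):
    """Parse the LLM's JSON reply into a validated filter dictionary."""
    where = json.loads(llm_output)
    validate(where)
    return where


def matches(metadata, where):
    """Evaluate a validated filter against one record's metadata."""
    results = []
    for key, cond in where.items():
        if key == "$and":
            results.append(all(matches(metadata, c) for c in cond))
        elif key == "$or":
            results.append(any(matches(metadata, c) for c in cond))
        elif key not in metadata:
            # In this sketch a record lacking the field never matches.
            results.append(False)
        else:
            # A bare value is shorthand for an equality test.
            ops = cond if isinstance(cond, dict) else {"$eq": cond}
            results.append(
                all(COMPARATORS[op](metadata[key], t) for op, t in ops.items())
            )
    return all(results)


# Hypothetical LLM reply for: "dairy foods with more than 10 g of protein".
llm_reply = (
    '{"$and": [{"food_group": {"$eq": "dairy"}}, {"protein_g": {"$gt": 10}}]}'
)
where = parse_filter(llm_reply)

# Toy records with made-up values, for illustration only.
records = [
    {"name": "cheddar", "food_group": "dairy", "protein_g": 25.0},
    {"name": "whole milk", "food_group": "dairy", "protein_g": 3.0},
    {"name": "lentils", "food_group": "legumes", "protein_g": 9.0},
]
hits = [r["name"] for r in records if matches(r, where)]
print(hits)
```

With the `chromadb` client, a validated dictionary of this shape could be passed as the `where` argument of `collection.query(...)`. The validation step also illustrates the failure mode described in the abstract: a constraint that cannot be mapped to a known field or operator has no representation in the filter, so the sketch rejects it rather than guessing.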
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Retrieval-Augmented Generation
food composition
metadata filtering
nutritional data retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Retrieval-Augmented Generation
metadata filtering
food composition database
natural language to structured query
Maks Požarnik Vavken
Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Matevž Ogrinc
Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia
Tome Eftimov
Computer Systems Department, Jožef Stefan Institute
Statistics, Stochastic Optimization Algorithms, Machine learning, Natural Language Processing
Barbara Koroušić Seljak
Computer Systems Department, Jožef Stefan Institute, Ljubljana, Slovenia