CGU-ILALab at FoodBench-QA 2026: Comparing Traditional and LLM-based Approaches for Recipe Nutrient Estimation

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

Accurately estimating nutritional content from unstructured recipe text is challenging because ingredient terminology is ambiguous and quantity expressions vary widely. This study systematically evaluates a spectrum of approaches: lexical matching (TF-IDF with ridge regression), deep semantic encoding (DeBERTa-v3), and generative reasoning with large language models (LLMs). It also proposes a hybrid LLM refinement pipeline that combines TF-IDF-based retrieval with few-shot inference via Gemini 2.5 Flash. The LLM's pre-trained world knowledge likely helps resolve ambiguous terms and normalize non-standard units. Evaluated under the tolerance criteria of EU Regulation 1169/2011, few-shot LLM inference and the hybrid pipeline achieve the highest validation accuracy across all nutrient categories, ahead of the TF-IDF baseline and the DeBERTa-v3 encoder, but at the cost of substantially higher inference latency.
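As a rough illustration of how TF-IDF retrieval might feed few-shot LLM inference, the sketch below ranks training recipes by TF-IDF cosine similarity to a query and assembles a prompt from the top matches. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the function names (`retrieve`, `build_prompt`), and the prompt layout are all assumptions made for illustration, and the call to the LLM itself is deliberately left out.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def vectorize(tokens, idf):
    # Term frequency times IDF; tokens unseen in the corpus are ignored.
    counts = Counter(tokens)
    return {tok: cnt * idf[tok] for tok, cnt in counts.items() if tok in idf}


def build_tfidf(docs):
    # Smoothed IDF: log((1 + n) / (1 + df)) + 1.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    return idf, [vectorize(toks, idf) for toks in tokenized]


def cosine(a, b):
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=2):
    # Return indices of the k documents most similar to the query.
    idf, vectors = build_tfidf(docs)
    q = vectorize(tokenize(query), idf)
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q, vectors[i]),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, examples):
    # Hypothetical prompt layout: labeled examples first, then the query.
    lines = ["Estimate the nutrients for the final recipe."]
    for text, nutrients in examples:
        lines.append(f"Recipe: {text}")
        lines.append(f"Nutrients: {nutrients}")
    lines.append(f"Recipe: {query}")
    lines.append("Nutrients:")
    return "\n".join(lines)
```

In this sketch the retrieved indices would be used to look up labeled training recipes, which `build_prompt` then places ahead of the query; the resulting string would be sent to whichever LLM is used.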

📝 Abstract

Accurate nutrient estimation from unstructured recipe text is an important yet challenging problem in dietary monitoring, due to ambiguous ingredient terminology and highly variable quantity expressions. We systematically evaluate models spanning a wide range of representational capacity, from lexical matching methods (TF-IDF with Ridge Regression), to deep semantic encoders (DeBERTa-v3), to generative reasoning with large language models (LLMs). Under the strict tolerance criteria defined by EU Regulation 1169/2011, our empirical results reveal a clear trade-off between predictive accuracy and computational efficiency. The TF-IDF baseline achieves moderate nutrient estimation performance with near-instantaneous inference, whereas the DeBERTa-v3 encoder performs poorly under task-specific data scarcity. In contrast, few-shot LLM inference (e.g., Gemini 2.5 Flash) and a hybrid LLM refinement pipeline (TF-IDF combined with Gemini 2.5 Flash) deliver the highest validation accuracy across all nutrient categories. These improvements likely arise from the ability of LLMs to leverage pre-trained world knowledge to resolve ambiguous terminology and normalize non-standard units, which remain difficult for purely lexical approaches. However, these gains come at the cost of substantially higher inference latency, highlighting a practical deployment trade-off between real-time efficiency and nutritional precision in dietary monitoring systems.
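Tolerance-based accuracy of the kind described here can be sketched as follows: a prediction counts as correct when it falls inside an allowed band around the reference value. The abstract does not give the exact thresholds, so `abs_tol` and `rel_tol` below are placeholder values chosen for illustration, not the figures from EU Regulation 1169/2011 or the scoring rule used in the paper.

```python
def within_tolerance(pred, true, abs_tol, rel_tol):
    # Correct when the error is inside the wider of an absolute band
    # and a relative band around the reference value.
    return abs(pred - true) <= max(abs_tol, rel_tol * abs(true))


def tolerance_accuracy(preds, truths, abs_tol=2.0, rel_tol=0.2):
    # abs_tol and rel_tol are placeholder values (assumptions),
    # not the regulation's actual tolerances.
    if len(preds) != len(truths):
        raise ValueError("preds and truths must have the same length")
    if not preds:
        return 0.0
    hits = sum(within_tolerance(p, t, abs_tol, rel_tol)
               for p, t in zip(preds, truths))
    return hits / len(preds)
```

A metric of this shape is computed separately for each nutrient category, which is why a model can score well on one nutrient and poorly on another.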

Problem

Research questions and friction points this paper is trying to address.

nutrient estimation

recipe text

ambiguous terminology

quantity expressions

dietary monitoring

Innovation

Methods, ideas, or system contributions that make the work stand out.

nutrient estimation

large language models

hybrid pipeline