SemEval-2025 Task 9: The Food Hazard Detection Challenge

📅 2025-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses long-tail class distributions in text-based food hazard detection. The challenge comprises two subtasks: a coarse-grained one (deciding which of ten food-hazard categories a web text implies and identifying the associated food category) and a fine-grained one (assigning a specific label to both the hazard and the product). To support the task, the organizers gradually released, under a CC BY-NC-SA 4.0 license, a novel dataset of 6,644 manually labeled food-incident reports. The findings indicate that large language model-generated synthetic data can be highly effective for oversampling long-tail classes, and that fine-tuned encoder-only, encoder-decoder, and decoder-only systems reach comparable maximum performance on both subtasks.

📝 Abstract

In this challenge, we explored text-based food hazard prediction with long-tail distributed classes. The task was divided into two subtasks: (1) predicting whether a web text implies one of ten food-hazard categories and identifying the associated food category, and (2) providing a more fine-grained classification by assigning a specific label to both the hazard and the product. Our findings highlight that large language model-generated synthetic data can be highly effective for oversampling long-tail distributions. Furthermore, we find that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks. During this challenge, we gradually released (under CC BY-NC-SA 4.0) a novel set of 6,644 manually labeled food-incident reports.
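The oversampling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how participants generated or mixed in synthetic data, so everything here is an assumption: `synthesize` is a hypothetical stand-in for an LLM call, `oversample_tail` and its `min_count` threshold are illustrative names, and the example texts and labels are made up.

```python
import random
from collections import Counter


def synthesize(label, seed_texts, rng):
    """Hypothetical stand-in for an LLM call (assumption, not from the paper).

    A real system would prompt a language model with the label and a few
    seed examples; here we only tag a randomly chosen seed text.
    """
    base = rng.choice(seed_texts)
    return f"[synthetic:{label}] {base}"


def oversample_tail(samples, min_count, rng=None):
    """Add synthetic examples until every class has at least `min_count` items.

    `samples` is a list of (text, label) pairs. Original samples are kept
    unchanged; synthetic ones are appended after them.
    """
    rng = rng or random.Random(0)
    counts = Counter(label for _, label in samples)

    # Group the original texts by label so they can serve as seeds.
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(text)

    augmented = list(samples)
    for label, count in counts.items():
        for _ in range(max(0, min_count - count)):
            augmented.append((synthesize(label, by_label[label], rng), label))
    return augmented


# Illustrative data: one frequent class and one rare class.
samples = [("Salmonella found in chicken", "biological")] * 5
samples += [("Glass fragments in jam", "foreign bodies")]

balanced = oversample_tail(samples, min_count=3)
label_counts = Counter(label for _, label in balanced)
```

In this toy run the frequent class is left alone and the rare class is topped up to the threshold. Whether to oversample to a fixed floor, to the majority-class size, or by some other schedule is a design choice the abstract does not settle.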

Problem

Research questions and friction points this paper is trying to address.

Predict food hazard categories from web texts

Classify hazards and products with fine-grained labels

Use synthetic data for long-tail distribution oversampling
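The relationship between the two subtasks can be sketched as a small data structure: each report carries a coarse and a fine label for both the hazard and the product. The class name, field names, and example values below are assumptions chosen for illustration, not the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentReport:
    """One labeled report (field names are hypothetical)."""
    text: str
    hazard_category: str   # coarse hazard label (one of ten categories)
    product_category: str  # coarse food category
    hazard: str            # fine-grained hazard label
    product: str           # fine-grained product label


def subtask1_target(report):
    """Coarse-grained target: hazard category and food category."""
    return (report.hazard_category, report.product_category)


def subtask2_target(report):
    """Fine-grained target: specific hazard and specific product."""
    return (report.hazard, report.product)


# Invented example values, for illustration only.
report = IncidentReport(
    text="Recall of frozen chicken nuggets due to possible salmonella",
    hazard_category="biological",
    product_category="meat products",
    hazard="salmonella",
    product="chicken nuggets",
)

coarse = subtask1_target(report)
fine = subtask2_target(report)
```

Keeping both label levels on the same record makes it easy to train one model per subtask or a shared model with two output heads; the abstract does not say which setup participants used.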

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-generated synthetic data for oversampling

Comparison of fine-tuned encoder-only, encoder-decoder, and decoder-only systems

Manually labeled food-incident reports dataset
