FungiTastic: A multi-modal dataset and benchmark for image categorization

📅 2024-08-24
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the lack of reliable, multimodal benchmarks for fine-grained fungal species recognition. The authors introduce FungiTastic, a large-scale benchmark built from expert-labelled and curated fungal records collected continuously over twenty years, comprising about 350k multimodal observations of 6k fine-grained categories (species). Each observation pairs photographs with additional data such as meteorological and climatic records, satellite images, and body-part segmentation masks. Notably, the test set carries DNA-sequenced ground-truth labels, giving unusually high label reliability. FungiTastic supports closed-set classification, open-set classification, multimodal classification, few-shot learning, domain-shift evaluation, and more. The authors release tailored baselines, ready-to-use pretrained models on Hugging Face, and a model-training framework, aiming to improve standardization and reproducibility in AI-driven biodiversity research.

📝 Abstract
We introduce a new, challenging benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labelled and curated by experts and consists of about 350k multimodal observations of 6k fine-grained categories (species). The fungi observations include photographs and additional data, e.g., meteorological and climatic data, satellite images, and body part segmentation masks. FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability. The benchmark is designed to support (i) standard closed-set classification, (ii) open-set classification, (iii) multi-modal classification, (iv) few-shot learning, (v) domain shift, and many more. We provide tailored baselines for many use cases, a multitude of ready-to-use pre-trained models on https://huggingface.co/collections/BVRA/fungitastic-66a227ce0520be533dc6403b, and a framework for model training. The documentation and the baselines are available at https://github.com/BohemianVRA/FungiTastic/ and https://www.kaggle.com/datasets/picekl/fungitastic.
Problem

Research questions and friction points this paper is trying to address.

Classifying fine-grained fungal species using multimodal data
Addressing open-set and few-shot learning challenges in fungi categorization
Providing DNA-sequenced ground truth for reliable fungal image classification
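The open-set scenario above requires a model to flag species outside the training set. A common baseline for this (a generic illustration, not the FungiTastic evaluation code) is to threshold the maximum softmax probability and reject low-confidence predictions as "unknown":

```python
# Sketch: maximum-softmax-probability (MSP) open-set baseline.
# A query is assigned its argmax class only if the top softmax
# probability exceeds a threshold; otherwise it is rejected as unknown.
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, threshold=0.5, unknown=-1):
    """Return the argmax class index, or `unknown` if confidence is too low."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return unknown
    return probs.index(confidence)
```

For example, a confidently peaked score vector is accepted (`open_set_predict([5.0, 0.1, 0.2])` returns class 0), while a near-uniform one is rejected as unknown. The threshold would be tuned on a validation split containing held-out species.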
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset with 350k fungal observations
Includes DNA-sequenced ground truth test set
Pre-trained models and training framework provided
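For the few-shot track, a standard baseline (again a generic sketch, not the released FungiTastic code) is a nearest-class-mean classifier over frozen embeddings; in practice the embeddings would come from one of the released pretrained models, while here they are plain lists for illustration:

```python
# Sketch: few-shot classification via nearest class mean ("prototypes").
# Each rare class is represented by the mean of its few support embeddings;
# a query is assigned to the class with the closest prototype.


def class_prototypes(support):
    """support: {label: [embedding, ...]} -> {label: mean embedding}."""
    protos = {}
    for label, embs in support.items():
        dim = len(embs[0])
        protos[label] = [sum(e[d] for e in embs) / len(embs) for d in range(dim)]
    return protos


def predict(query, protos):
    """Return the label whose prototype is nearest (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sq_dist(query, protos[label]))
```

The species names and 2-D embeddings in any usage example are hypothetical; real embeddings would be high-dimensional backbone features.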
👥 Authors

Lukáš Picek
INRIA & University of West Bohemia
Klára Janoušková
CTU in Prague
Milan Šulc
Second Foundation
Jiří Matas
CTU in Prague