Overview of GeoLifeCLEF 2023: Species Composition Prediction with High Spatial Resolution at Continental Scale Using Remote Sensing

📅 2025-09-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper gives an overview of GeoLifeCLEF 2023, an open machine learning challenge on continental-scale prediction of plant species composition. Participants trained models on 5 million single-positive-label plant observations across Europe, paired with high-resolution Sentinel-2 satellite imagery and multi-source environmental covariates (land cover, climate, soil properties, elevation, and human footprint), and were evaluated on predicting the multi-label species assemblages of 22,000 standardized European vegetation plots. The overview synthesizes the participating teams' approaches and analyzes the main results. It highlights the systematic bias that arises when models fitted to single positive labels are evaluated in a multi-label setting, and reports a new and effective training strategy that combines single-label and multi-label data.

📝 Abstract

Understanding the spatio-temporal distribution of species is a cornerstone of ecology and conservation. By pairing species observations with geographic and environmental predictors, researchers can model the relationship between an environment and the species that may be found there. To advance the state of the art in this area with deep learning models and remote sensing data, we organized an open machine learning challenge called GeoLifeCLEF 2023. The training dataset comprised 5 million plant species observations (a single positive label per sample) distributed across Europe and covering most of its flora, together with high-resolution rasters (remote sensing imagery, land cover, and elevation) and coarse-resolution data (climate, soil, and human footprint variables). In this multi-label classification task, we evaluated models' ability to predict the species composition of 22 thousand small plots based on standardized surveys. This paper presents an overview of the competition, synthesizes the approaches used by the participating teams, and analyzes the main results. In particular, we highlight the biases faced by methods fitted to single positive labels when they are evaluated in a multi-label setting, and a new and effective learning strategy that combines single-label and multi-label data in training.
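To make the mismatch between single-positive training and multi-label evaluation concrete, here is a minimal sketch in Python. It scores a predicted species set against a surveyed plot with a per-plot F1. The F1 function is a generic set-based score used for illustration and is not claimed to be the challenge's official metric. The species ids, scores, and helper names (`f1_per_plot`, `top_k_species`) are hypothetical.

```python
def f1_per_plot(predicted: set, observed: set) -> float:
    """F1 score between a predicted species set and the surveyed species set."""
    if not predicted and not observed:
        return 1.0
    true_positives = len(predicted & observed)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(observed)
    return 2 * precision * recall / (precision + recall)


def top_k_species(scores: dict, k: int) -> set:
    """Return the ids of the k highest-scoring species."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Hypothetical plot: four species were recorded in the standardized survey.
observed = {"sp_a", "sp_b", "sp_c", "sp_d"}

# Hypothetical scores from a model fitted to single positive labels.
scores = {"sp_a": 0.55, "sp_b": 0.20, "sp_c": 0.10, "sp_e": 0.08, "sp_d": 0.07}

# Predicting only the single most likely species misses most of the plot.
f1_top1 = f1_per_plot(top_k_species(scores, 1), observed)

# Predicting a set of four species recovers three of the four observed ones.
f1_top4 = f1_per_plot(top_k_species(scores, 4), observed)

print(f"top-1 F1: {f1_top1:.2f}, top-4 F1: {f1_top4:.2f}")
```

In this toy case the top-1 prediction is correct but has low recall, which is one simple way a model trained on one label per sample can be penalized when whole assemblages are evaluated.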

Problem

Research questions and friction points this paper is trying to address.

Predicting species composition using remote sensing data

Modeling plant distribution across Europe with machine learning

Addressing single-label bias in multi-label classification tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using remote sensing data for species prediction

Multi-label classification with single positive labels

Combining single and multi-label data in training
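One way to picture combining single-label and multi-label data in training is a weighted sum of two loss terms. The sketch below is an illustrative assumption, not the participants' actual method: single-positive samples contribute a loss on the one known species only (other species are treated as unknown), while surveyed plots contribute a full binary cross-entropy over all species. All function names and the `weight` parameter are hypothetical.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def positive_only_loss(logits: list, positive_index: int) -> float:
    """Loss on the single known positive; unobserved species are ignored."""
    # -log(sigmoid(z)) equals softplus(-z).
    return softplus(-logits[positive_index])


def multi_label_loss(logits: list, labels: list) -> float:
    """Mean binary cross-entropy over all species of a surveyed plot."""
    total = 0.0
    for z, y in zip(logits, labels):
        total += y * softplus(-z) + (1 - y) * softplus(z)
    return total / len(logits)


def mean(values: list) -> float:
    return sum(values) / len(values) if values else 0.0


def combined_loss(single_batch: list, plot_batch: list, weight: float = 0.5) -> float:
    """Weighted sum of the single-positive term and the multi-label term.

    single_batch: list of (logits, positive_index) pairs.
    plot_batch:   list of (logits, labels) pairs with 0/1 labels.
    weight:       hypothetical mixing coefficient in [0, 1].
    """
    single_term = mean([positive_only_loss(z, i) for z, i in single_batch])
    plot_term = mean([multi_label_loss(z, y) for z, y in plot_batch])
    return weight * single_term + (1 - weight) * plot_term


# Toy example with three species (all values are made up).
single_batch = [([2.0, -1.0, 0.5], 0), ([0.1, 0.3, -0.2], 1)]
plot_batch = [([1.5, 0.8, -2.0], [1, 1, 0])]
loss = combined_loss(single_batch, plot_batch, weight=0.5)
print(f"combined loss: {loss:.4f}")
```

The positive-only term alone could be minimized by predicting every species everywhere; in this sketch the multi-label term, which also penalizes false positives, is what counteracts that tendency.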
