HyBiomass: Global Hyperspectral Imagery Benchmark Dataset for Evaluating Geospatial Foundation Models in Forest Aboveground Biomass Estimation

📅 2025-06-12

🤖 AI Summary

Existing remote sensing benchmarks focus mostly on classification and segmentation tasks in specific geographic regions, and lack standardized datasets for pixel-level regression tasks such as forest aboveground biomass (AGB) estimation. To address this gap, we introduce a globally distributed regression benchmark, covering seven continental regions, that pairs co-located EnMAP hyperspectral imagery with AGB density estimates derived from GEDI lidar. The benchmark addresses gaps in both geographic coverage and task type (regression). We compare geospatial foundation models (Geo-FMs) against a baseline U-Net. The evaluated Geo-FMs match or, in some cases, surpass the U-Net, especially when the encoder is fine-tuned. The performance gap between the two depends on the dataset size in each region, and the token patch size of the Vision Transformer backbone proves important for accurate pixel-level predictions. The dataset and code will be publicly released to support generalization analysis and fair benchmarking of Geo-FMs.

📝 Abstract

Comprehensive evaluation of geospatial foundation models (Geo-FMs) requires benchmarking across diverse tasks, sensors, and geographic regions. However, most existing benchmark datasets are limited to segmentation or classification tasks and focus on specific geographic areas. To address this gap, we introduce a globally distributed dataset for forest aboveground biomass (AGB) estimation, a pixel-wise regression task. This benchmark dataset combines co-located hyperspectral imagery (HSI) from the Environmental Mapping and Analysis Program (EnMAP) satellite with predictions of AGB density derived from the Global Ecosystem Dynamics Investigation (GEDI) lidars, covering seven continental regions. Our experimental results on this dataset demonstrate that the evaluated Geo-FMs can match or, in some cases, surpass the performance of a baseline U-Net, especially when fine-tuning the encoder. We also find that the performance difference between the U-Net and Geo-FMs depends on the dataset size for each region, and we highlight the importance of the token patch size in the Vision Transformer backbone for accurate predictions in pixel-wise regression tasks. By releasing this globally distributed hyperspectral benchmark dataset, we aim to facilitate the development and evaluation of Geo-FMs for HSI applications. The dataset additionally enables research into the geographic bias and generalization capacity of Geo-FMs. The dataset and source code will be made publicly available.
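To make the point about token patch size concrete, here is a minimal sketch of the arithmetic behind a Vision Transformer patch embedding. The helper name `token_grid`, the 128 x 128 tile size, and the patch sizes 4, 8, and 16 are illustrative assumptions, not values taken from the paper. Each token summarizes a `patch_size` x `patch_size` block of pixels, so a larger patch leaves fewer tokens from which a decoder must recover a per-pixel AGB value.

```python
def token_grid(height, width, patch_size):
    """Return (rows, cols, pixels_per_token) for a ViT-style patch embedding.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by the patch size")
    rows = height // patch_size
    cols = width // patch_size
    # Each token has to stand in for this many output pixels.
    return rows, cols, patch_size * patch_size


if __name__ == "__main__":
    # Assumed tile size and patch sizes, chosen only to show the trend.
    for patch_size in (4, 8, 16):
        rows, cols, pixels = token_grid(128, 128, patch_size)
        print(f"patch {patch_size:>2}: {rows}x{cols} tokens, "
              f"{pixels} pixels per token")
```

Under these assumptions, moving from a patch size of 4 to 16 cuts the token grid from 32 x 32 to 8 x 8, so each token must account for 256 pixels instead of 16. This is one plausible reason a coarse patch size can hurt dense regression more than it hurts image-level classification.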

Problem

Research questions and friction points this paper is trying to address.

Evaluate geospatial foundation models for global forest biomass estimation

Address lack of diverse benchmark datasets for pixel-wise regression tasks

Assess geographic bias and generalization in hyperspectral imagery applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Globally distributed hyperspectral benchmark dataset

Combines EnMAP HSI and GEDI lidar AGB

Evaluates Geo-FMs with pixel-wise regression
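As a rough illustration of what pixel-wise regression evaluation involves, the sketch below computes a root-mean-square error over only those pixels that carry a reference AGB value. This summary does not state which metric the paper reports or whether its labels contain gaps; RMSE and the validity mask are assumptions here, and `masked_rmse` is a hypothetical helper.

```python
import math


def masked_rmse(pred, target, valid):
    """RMSE over pixels where `valid` is True.

    `pred`, `target`, and `valid` are equally shaped 2-D lists
    (rows of pixels). Hypothetical helper for illustration only.
    """
    squared_error = 0.0
    count = 0
    for pred_row, target_row, valid_row in zip(pred, target, valid):
        for p, t, v in zip(pred_row, target_row, valid_row):
            if v:  # skip pixels without a reference value
                squared_error += (p - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("no valid pixels to evaluate")
    return math.sqrt(squared_error / count)


if __name__ == "__main__":
    # Toy 2x2 maps; the numbers are made up and carry no physical meaning.
    pred = [[1.0, 2.0], [3.0, 4.0]]
    target = [[1.0, 4.0], [3.0, 0.0]]
    valid = [[True, True], [True, False]]
    print(masked_rmse(pred, target, valid))
```

Unlike a classification score, this error is measured in the units of the target variable, which is why per-pixel spatial detail in the model output matters for the result.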
