Biomazon: A Multimodal Dataset for 3D Forest Structure and Biomass Modeling in the Amazon Basin

📅 2026-06-03

🤖 AI Summary
This study addresses a limitation of existing forest remote sensing approaches: they typically predict canopy height or aboveground biomass independently, neglecting holistic modeling of vertical structure and physical consistency. We present Biomazon, a 20 m multimodal benchmark dataset for the Amazon Basin that integrates GEDI vertical profiles and aboveground biomass labels with diverse remote sensing data (Sentinel-1/2, ALOS-2 PALSAR-2, Copernicus DEM, Dynamic World land use/land cover, and AlphaEarth embeddings) under a unified spatial partitioning and evaluation protocol. The baseline uses a shared encoder-decoder architecture with task-specific heads for joint prediction of the GEDI RH profile and biomass. Systematic ablation studies quantify the contributions of model scale, individual modalities, and auxiliary embeddings in standalone and fusion settings, and baseline results are contextualized through regionally aligned comparisons against existing gridded products.
📝 Abstract
Accurate, spatially explicit characterization of tropical forest structure is essential for carbon accounting and ecosystem monitoring, yet most ML pipelines predict canopy-top height proxies (e.g., RH95/RH98) or aboveground biomass density (AGBD) as separate scalar targets, rather than learning the forest vertical structure as an ordered profile. The community lacks an ML-ready multimodal benchmark for predicting the entire GEDI RH profile jointly with AGBD, or for evaluating methods that enforce physically consistent ordering across RH percentiles. We address this with Biomazon, a 20 m multimodal benchmark dataset over the Amazon Basin that pairs GEDI RH and AGBD targets with multi-sensor predictors (Sentinel-1/2, ALOS-2 PALSAR-2, Copernicus DEM, Dynamic World LULC, and AlphaEarth embeddings) under standardized spatial splits and evaluation protocols. Using a shared encoder-decoder with task-specific heads as a baseline framework, we conduct a comprehensive ablation study of (i) backbone/model scale, (ii) modality contributions, and (iii) the use of auxiliary embeddings under standalone and fusion settings, and we report both single-target and joint-target results to quantify tradeoffs under a unified training protocol. Finally, we contextualize baseline performance through regionally aligned comparisons against existing gridded products, including GEDI L4D RH10-RH98 and AGBD, at matching temporal scale. Biomazon, together with the accompanying protocols and baseline results, establishes a reference benchmark for future work on structurally consistent RH-profile prediction and structure-biomass modeling in tropical forests.
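To make "physically consistent ordering across RH percentiles" concrete, the sketch below shows one common way such a constraint can be built into a prediction head, plus a simple check for ordering violations. It is an illustration only: the abstract does not say how (or whether) the Biomazon baselines enforce ordering, and the function names `monotone_rh_profile` and `ordering_violation_rate` are hypothetical. The idea is to treat the first output as the lowest percentile and pass the remaining outputs through a softplus so they act as positive increments.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); always strictly positive.
    return np.logaddexp(0.0, x)


def monotone_rh_profile(raw):
    """Map unconstrained outputs of shape (..., K) to a non-decreasing profile.

    Assumption for illustration: raw[..., 0] is used directly as the lowest
    RH percentile, and raw[..., 1:] are turned into positive increments.
    """
    raw = np.asarray(raw, dtype=float)
    base = raw[..., :1]
    steps = softplus(raw[..., 1:])
    return np.concatenate([base, base + np.cumsum(steps, axis=-1)], axis=-1)


def ordering_violation_rate(profiles, tol=0.0):
    """Fraction of profiles in which some higher percentile is below a lower one."""
    profiles = np.asarray(profiles, dtype=float)
    diffs = np.diff(profiles, axis=-1)
    return float(np.mean(np.any(diffs < -tol, axis=-1)))


if __name__ == "__main__":
    # Two hypothetical samples with four percentile outputs each.
    raw = np.array([[0.5, -2.0, 0.0, 3.0], [-1.0, 1.0, 1.0, 1.0]])
    profile = monotone_rh_profile(raw)
    print(profile)
    print(ordering_violation_rate(profile))
```

A head built this way cannot produce a crossed profile, whereas independently regressed percentiles can; the violation rate gives a way to measure how often the latter happens.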
Problem

Research questions and friction points this paper is trying to address.

forest vertical structure
biomass modeling
GEDI RH profile
multimodal dataset
tropical forests
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal benchmark
vertical forest structure
joint RH-AGBD modeling
physically consistent ordering
Amazon Basin
Sayan Mandal
Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52428 Jülich, Germany and School of Engineering and Natural Sciences (SENS), University of Iceland, 102 Reykjavík, Iceland
Rocco Sedona
Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52428 Jülich, Germany
Simon Besnard
Global Land Monitoring Group, GFZ Helmholtz Centre for Geosciences, Potsdam, Germany
Mikhail Urbazaev
Global Land Monitoring Group, GFZ Helmholtz Centre for Geosciences, Potsdam, Germany
Morris Riedel
Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52428 Jülich, Germany and School of Engineering and Natural Sciences (SENS), University of Iceland, 102 Reykjavík, Iceland
Ehsan Zandi
Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52428 Jülich, Germany
Gabriele Cavallaro
Forschungszentrum Jülich and University of Iceland
Remote Sensing, Machine Learning, High Performance Computing, Quantum Computing