🤖 AI Summary
Current retrieval of vegetation biophysical parameters (e.g., LAI, CCC) from Sentinel-2 imagery relies heavily on scarce in-situ measurements and suffers from poor generalizability. To address this, we propose a physics-guided Transformer-VAE framework that embeds a differentiable PROSAIL radiative transfer model as the decoder within a deep neural network, enabling end-to-end differentiable inversion. The method is fully self-supervised—trained exclusively on synthetic data without requiring ground-truth labels or image calibration. By imposing physical constraints on the latent variable space, it ensures retrieved parameters retain explicit biophysical interpretability. Evaluated on real-world FRM4Veg and BelSAR datasets, our approach achieves accuracy comparable to state-of-the-art supervised methods. This work provides the first empirical validation of purely simulation-driven inversion in remote sensing, establishing a scalable, calibration-free paradigm for global vegetation monitoring.
📝 Abstract
Accurate retrieval of vegetation biophysical variables from satellite imagery is crucial for ecosystem monitoring and agricultural management. In this work, we propose a physics-informed Transformer-VAE architecture to invert the PROSAIL radiative transfer model for simultaneous estimation of key canopy parameters from Sentinel-2 data. Unlike previous hybrid approaches that require real satellite images for self-supervised training, our model is trained exclusively on simulated data, yet achieves performance on par with state-of-the-art methods that utilize real imagery. The Transformer-VAE incorporates the PROSAIL model as a differentiable physical decoder, ensuring that inferred latent variables correspond to physically plausible leaf and canopy properties. We demonstrate retrieval of leaf area index (LAI) and canopy chlorophyll content (CCC) on real-world field datasets (FRM4Veg and BelSAR) with accuracy comparable to models trained with real Sentinel-2 data. Our method requires no in-situ labels or calibration on real images, offering a cost-effective and self-supervised solution for global vegetation monitoring. The proposed approach illustrates how integrating physical models with advanced deep networks can improve the inversion of RTMs, opening new prospects for large-scale, physically constrained remote sensing of vegetation traits.
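The core idea of the abstract — a learned encoder trained purely on simulated data, with a frozen differentiable physical model as the decoder — can be illustrated with a minimal toy sketch. The snippet below is an assumption-laden stand-in, not the paper's method: `toy_rtm` is a hypothetical Beer-Lambert-style function replacing PROSAIL, the encoder is linear on log-reflectance rather than a Transformer-VAE, and all band coefficients are invented for illustration. It shows the self-supervised loop: sample latents, simulate spectra, discard the latents, and train the encoder to minimize reconstruction error propagated through the fixed physical decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-band extinction coefficients for the toy radiative transfer model
# (hypothetical values standing in for PROSAIL's physics).
K = np.array([[0.3, 0.5, 0.7, 0.9]])

def toy_rtm(lai):
    """Differentiable physical decoder: latent LAI (n,1) -> reflectance (n,4)."""
    return np.exp(-K * lai)

# Self-supervised training data: simulate spectra, never expose the latents.
lai_true = rng.uniform(0.5, 6.0, size=(512, 1))
spectra = toy_rtm(lai_true)

# Linear encoder on log-reflectance features (stand-in for the Transformer-VAE).
features = -np.log(spectra)           # (n, 4)
W = np.zeros((4, 1))
b = np.zeros((1, 1))

def mse(W, b):
    z = features @ W + b
    return float(np.mean((toy_rtm(z) - spectra) ** 2))

mse_init = mse(W, b)
lr = 0.05
for step in range(2000):
    z = features @ W + b              # encode: spectrum -> latent LAI
    recon = toy_rtm(z)                # decode through the frozen physical model
    err = recon - spectra
    # Chain rule through the decoder: d recon_j / d z = -K_j * exp(-K_j * z)
    dz = np.sum(err * (-K) * np.exp(-K * z), axis=1, keepdims=True)
    W -= lr * features.T @ dz / len(features)
    b -= lr * dz.mean(axis=0, keepdims=True)

mse_final = mse(W, b)
z_final = features @ W + b
```

Because the loss is computed only on reconstructed spectra, no ground-truth LAI labels are ever used, which is the calibration-free property the abstract emphasizes; the physical decoder constrains the latent to remain a biophysically meaningful quantity.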