🤖 AI Summary
Spatial non-stationarity in Earth sciences makes feature–target relationships vary strongly from place to place, so conventional methods struggle to capture global patterns and local dynamics at the same time. To address this, we propose a dual-branch implicit modeling framework whose main novelty is the use of implicit conditional vectors to represent spatiotemporal heterogeneity. The framework combines a self-attention-based shared encoder with graph convolutional network and long short-term memory (GCN+LSTM) modules to generate interpretable, location-specific weights and to track the continuous evolution of spatial heterogeneity. Evaluated on global vegetation gross primary productivity (GPP) prediction (2001–2020), our method achieves an RMSE of 0.836, outperforming LightGBM and TabNet. Visualization reveals how the dominant drivers of GPP differ across space and time. By combining accuracy, interpretability, and dynamic adaptability, our approach offers a new way to model spatial non-stationarity in Earth system science.
📝 Abstract
In Earth sciences, unobserved factors exhibit non-stationary spatial distributions, causing the relationships between features and targets to display spatial heterogeneity. In geographic machine learning tasks, conventional statistical learning methods often struggle to capture this heterogeneity, leading to unsatisfactory prediction accuracy and unreliable interpretability. Approaches such as Geographically Weighted Regression (GWR) capture local variations, but they fall short of uncovering global patterns and tracking the continuous evolution of spatial heterogeneity. Motivated by this limitation, we propose a novel perspective: using deep neural networks to model the features common to all locations together with the differences between locations. The proposed method is a dual-branch neural network with an encoder-decoder structure. In the encoding stage, one branch aggregates node information in a spatiotemporal conditional graph using GCN and LSTM, encoding location-specific spatiotemporal heterogeneity as an implicit conditional vector. The other branch uses a self-attention-based encoder to extract location-invariant common features from the data. In the decoding stage, a conditional generation strategy predicts the response variable and interpretative weights from the data features under the given spatiotemporal conditions. The approach is validated by predicting vegetation gross primary productivity (GPP) using global climate and land cover data from 2001 to 2020. Trained on 50 million samples and tested on 2.8 million, the proposed model achieves an RMSE of 0.836, outperforming LightGBM (1.063) and TabNet (0.944). Visualization analyses indicate that our method can reveal how the dominant factors of GPP differ across times and locations.
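The data flow described above can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation, and several parts are simplifying assumptions: the GCN is reduced to mean aggregation over neighbors with self-loops, the LSTM is replaced by a plain average over time steps, the self-attention encoder is replaced by a single tanh layer, and the decoder is assumed to be a linear map whose output weights multiply the input features. All function names, dimensions, and random parameters are hypothetical.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gcn_aggregate(node_feats, adj):
    """Mean-aggregate each node with its neighbors and itself.
    Assumption: a simplified GCN step with no learned weights."""
    out = []
    for i, row in enumerate(adj):
        nbrs = [j for j, a in enumerate(row) if a or j == i]
        dim = len(node_feats[i])
        out.append([sum(node_feats[j][k] for j in nbrs) / len(nbrs)
                    for k in range(dim)])
    return out

def condition_vector(seq, adj, node):
    """Condition branch: aggregate the graph at every time step, then
    average over time (a stand-in for the LSTM, not an LSTM)."""
    steps = [gcn_aggregate(feats, adj)[node] for feats in seq]
    return [sum(col) / len(steps) for col in zip(*steps)]

def shared_encode(x, W_enc):
    """Shared branch: one tanh layer standing in for self-attention."""
    return [math.tanh(v) for v in matvec(W_enc, x)]

def decode(x, shared, cond, W_dec, b_dec):
    """Conditional decoding (assumed form): map [shared; cond] to one
    weight per input feature and a bias, then predict y = w . x + b."""
    z = shared + cond  # list concatenation
    weights = matvec(W_dec, z)
    bias = sum(b * zi for b, zi in zip(b_dec, z))
    y = sum(w * xi for w, xi in zip(weights, x)) + bias
    return y, weights

# Hypothetical toy setup: 3 locations on a chain graph, 4 time steps.
rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
seq = [[[rng.gauss(0, 1) for _ in range(2)] for _ in range(3)]
       for _ in range(4)]
x = [0.5, -1.0, 2.0]  # the same feature vector evaluated at each location
W_enc = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2)]
W_dec = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
b_dec = [rng.gauss(0, 1) for _ in range(4)]

shared = shared_encode(x, W_enc)
for node in range(3):
    cond = condition_vector(seq, adj, node)
    y, weights = decode(x, shared, cond, W_dec, b_dec)
    print(node, round(y, 3), [round(w, 3) for w in weights])
```

Because the shared encoding is identical for all three locations, any difference in the printed weights comes only from the condition vector. This mirrors the idea that the same features can carry different importance at different places and times.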