🤖 AI Summary
This study investigates how strategies for encoding geolocation affect the performance and cross-regional generalizability of deep learning models that retrieve PM₂.₅ from remote sensing data in dynamic spatiotemporal settings. Focusing on daily, high-resolution surface PM₂.₅ estimation, we systematically compare three approaches: no geographic information, raw coordinates, and pretrained location encoders (e.g., GeoCLIP). We show empirically that GeoCLIP improves both within-region and out-of-region accuracy (R² gains of 0.03–0.07), while raw coordinates, though beneficial for within-region fitting, can hinder generalization. We build on a recently published deep learning model that estimates PM₂.₅ from remotely sensed and ground-level data, and evaluate each approach under both within-region and out-of-region scenarios. Qualitative analysis reveals localized artifacts caused by high-degree basis functions and sparse upstream samples, and ablations show that performance varies among location encoders, underscoring both the necessity and the challenges of incorporating robust geographic priors into deep spatiotemporal modeling.
📝 Abstract
Deep learning models have demonstrated success in geospatial applications, yet quantifying the role of geolocation information in enhancing model performance and geographic generalizability remains underexplored. A new generation of location encoders has emerged with the goal of capturing attributes present at any given location for downstream use in predictive modeling. Because this is a nascent area of research, their evaluation has remained largely limited to static tasks such as species distributions or average temperature mapping. In this paper, we discuss and quantify the impact of incorporating geolocation into deep learning for a real-world application domain that is characteristically dynamic (with fast temporal change) and spatially heterogeneous at high resolutions: estimating surface-level daily PM2.5 levels using remotely sensed and ground-level data. We build on a recently published deep learning-based PM2.5 estimation model that achieves state-of-the-art performance on data observed in the contiguous United States. We examine three approaches for incorporating geolocation: excluding geolocation as a baseline, using raw geographic coordinates, and leveraging pretrained location encoders. We evaluate each approach under within-region (WR) and out-of-region (OoR) evaluation scenarios. Aggregate performance metrics indicate that while naïve incorporation of raw geographic coordinates improves within-region performance by retaining the interpolative value of geographic location, it can hinder generalizability across regions. In contrast, pretrained location encoders like GeoCLIP enhance predictive performance and geographic generalizability for both WR and OoR scenarios. However, qualitative analysis reveals artifact patterns caused by high-degree basis functions and sparse upstream samples in certain areas, and ablation results indicate varying performance among location encoders...
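
The three geolocation strategies and the WR/OoR evaluation scenarios described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout (`lat`, `lon`, `region`, `features`), the function names, the random within-region split, and the `encoder` callable are all assumptions made for illustration, and the callable merely stands in for a pretrained location encoder such as GeoCLIP without reproducing its actual API.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical sample record: {"lat", "lon", "region", "features"}.
Sample = Dict[str, object]
Encoder = Callable[[float, float], Sequence[float]]


def build_inputs(sample: Sample, mode: str,
                 encoder: Optional[Encoder] = None) -> List[float]:
    """Assemble a model input under one of three geolocation strategies."""
    base = list(sample["features"])  # non-geographic predictors
    if mode == "none":       # baseline: geolocation excluded
        return base
    if mode == "raw":        # raw geographic coordinates appended
        return base + [sample["lat"], sample["lon"]]
    if mode == "encoder":    # embedding from a (stand-in) location encoder
        if encoder is None:
            raise ValueError("mode='encoder' requires an encoder callable")
        return base + list(encoder(sample["lat"], sample["lon"]))
    raise ValueError(f"unknown mode: {mode!r}")


def split_wr_oor(samples: Sequence[Sample], held_out_region: str,
                 test_fraction: float = 0.2, seed: int = 0
                 ) -> Tuple[List[Sample], List[Sample], List[Sample]]:
    """Return (train, within-region test, out-of-region test).

    The held-out region never appears in training; the within-region test
    set is a random subset of the remaining regions (an assumed protocol).
    """
    rng = random.Random(seed)
    in_region = [s for s in samples if s["region"] != held_out_region]
    oor_test = [s for s in samples if s["region"] == held_out_region]
    rng.shuffle(in_region)
    n_test = int(round(len(in_region) * test_fraction))
    return in_region[n_test:], in_region[:n_test], oor_test
```

Training one model per `mode` on the `train` split and scoring it on both test splits gives the kind of comparison the abstract describes: a strategy that helps on the within-region split but hurts on the out-of-region split is fitting location rather than learning transferable structure.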