🤖 AI Summary
This study addresses the difficulty of quantifying street-level urban heat variability, which has been hindered by the lack of consistent, aligned multi-source remote sensing data. It harmonizes Landsat 8/9, Sentinel-1, and GOES-R observations, together with a microwave land surface temperature product, on a common grid to build an AI-ready, FAIR-oriented dataset covering 90 × 90 km regions for 48 cities in the Western Hemisphere from 2022 to 2023. Variables are reprojected, resampled, and spatiotemporally aligned, and are delivered as ready-to-use data cubes with documented variables and metadata. A technical assessment, based on inter-variable analyses and autoencoder reconstruction-error summaries across pixel classes such as water and cloud, characterizes data quality, and the paper discusses potential use cases and limitations. The harmonized cubes reduce the preprocessing burden for downstream machine learning.
📝 Abstract
Urban heat is amplified by impermeable surfaces and heterogeneous built environments, yet street-level variability remains difficult to quantify because multi-sensor observations are rarely available in consistent, analysis-ready form at the necessary spatiotemporal scales. We present "Urban Heat MiniCubes," a publicly available, FAIR-oriented dataset designed for machine learning applications in urban heat research. The dataset provides harmonized 90 x 90 km gridded data cubes for 48 cities in the Western Hemisphere spanning 2022-2023, with variables reprojected and collocated to a common grid to reduce preprocessing (e.g., reprojection, resampling, and spatiotemporal alignment). Urban Heat MiniCubes includes two complementary modalities: (i) higher-spatial-resolution, lower-frequency observations from Landsat 8/9 (e.g., surface reflectances) and Sentinel-1 (e.g., synthetic aperture radar backscatter), and (ii) higher-temporal-frequency, coarser observations from GOES-R (e.g., longwave infrared brightness temperatures) and a microwave land surface temperature product. We document variables and metadata and provide technical assessment using inter-variable analyses and autoencoder-based reconstruction-error summaries across pixel classes (e.g., water and cloud). Potential use cases and limitations are also discussed.
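To make the idea of reconstruction-error summaries across pixel classes concrete, the following is a minimal sketch rather than the authors' method. It assumes a data cube of shape (bands, height, width), a reconstruction of the same shape from some autoencoder, and an integer class mask. The function name `per_class_error`, the class codes, and the synthetic data are all hypothetical and are not taken from the dataset.

```python
import numpy as np

def per_class_error(original, reconstructed, class_mask):
    """Mean squared reconstruction error for each pixel class.

    original, reconstructed: arrays of shape (bands, height, width)
    class_mask: integer array of shape (height, width)
    """
    # Average the squared error over the band axis: one value per pixel.
    pixel_mse = np.mean((original - reconstructed) ** 2, axis=0)
    # Average the per-pixel error within each class present in the mask.
    return {
        int(c): float(pixel_mse[class_mask == c].mean())
        for c in np.unique(class_mask)
    }

# Synthetic stand-in data (assumption: not values from Urban Heat MiniCubes).
rng = np.random.default_rng(0)
cube = rng.random((4, 8, 8))
recon = cube.copy()
mask = np.zeros((8, 8), dtype=int)  # 0 = hypothetical "land" class
mask[:, 4:] = 1                     # 1 = hypothetical "water" class
recon[:, :, 4:] += 0.1              # simulate a worse reconstruction on class 1

errors = per_class_error(cube, recon, mask)
```

In this toy setup the error is zero for class 0 and close to 0.01 for class 1, which is the kind of per-class contrast such a summary is meant to surface.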