🤖 AI Summary
Existing crop type mapping products do not deliver accurate maps before harvest, which limits near-real-time emergency response to climate-related crop threats. This study addresses the gap by combining Harmonized Landsat-Sentinel surface reflectance time series with crop rotation history and comparing thousands of model configurations across ten machine learning algorithms for in-season crop mapping. The comparison uses year-wise cross-validation, so the algorithms are evaluated in a way that accounts for interannual variability. Support Vector Machines were the most successful algorithm overall, with mean F1 scores across five unseen validation years of 0.74 for almonds in California and 0.59 for corn in Iowa, at 30 m resolution by early June. Interannual variation, linked to phenology and crop distribution, was a large source of uncertainty, and the observed patterns suggest that ensemble approaches or ancillary data could further improve performance.
📝 Abstract
In-season crop type mapping is critical for food security in the face of increasingly extreme climate-related threats to crops. Currently, the USDA Cropland Data Layer provides crop type labels at 30 m resolution but is not available until the February after harvest, and no product maps crop types before harvest with accuracy sufficient for emergency managers to respond to crop threats in near real time. Furthermore, until this study, the relative advantages of a wide range of algorithms had not been evaluated in a way that accounts for interannual variability. Here, Harmonized Landsat-Sentinel surface reflectance time series and crop rotation history are combined to map corn in Iowa and almonds in California at 30 m resolution accurately by early June in unseen years, with robust quantification of uncertainty due to phenology and crop distribution. Thousands of model configurations across ten machine learning algorithms were compared using year-wise cross-validation and a suite of metrics. Hyperparameter search revealed Support Vector Machines to be the most successful algorithm overall, with mean F1 scores across five unseen validation years of 0.74 for almonds in California and 0.59 for corn in Iowa, both by early June. Interannual variation was a large source of uncertainty, but patterns showed the potential to further improve performance with ensemble approaches or ancillary data. Future work may extend these methods to multiclass maps of all crop types, CONUS-wide maps, and in-season crop yield forecasting.
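To make the evaluation scheme concrete, the following is a minimal sketch of year-wise (leave-one-year-out) cross-validation with a mean F1 score across held-out years. It is not the study's pipeline: the single feature, the toy values, and the `fit_threshold` stand-in model are all hypothetical, chosen only so that the example runs with the Python standard library. In the study, the models are algorithms such as Support Vector Machines trained on HLS time series and crop rotation history.

```python
from collections import defaultdict
from statistics import mean


def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices), one fold per distinct year."""
    by_year = defaultdict(list)
    for i, y in enumerate(years):
        by_year[y].append(i)
    for held_out in sorted(by_year):
        test_idx = by_year[held_out]
        train_idx = [i for y, idx in by_year.items() if y != held_out for i in idx]
        yield held_out, train_idx, test_idx


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive (target crop) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fit_threshold(features, labels):
    """Hypothetical stand-in model: midpoint between the two class means."""
    pos = [x for x, l in zip(features, labels) if l == 1]
    neg = [x for x, l in zip(features, labels) if l == 0]
    return (mean(pos) + mean(neg)) / 2


def yearwise_f1(features, labels, years):
    """Train on all years but one, score on the held-out year, repeat per year."""
    scores = {}
    for year, train_idx, test_idx in leave_one_year_out(years):
        thr = fit_threshold([features[i] for i in train_idx],
                            [labels[i] for i in train_idx])
        preds = [1 if features[i] > thr else 0 for i in test_idx]
        scores[year] = f1_score([labels[i] for i in test_idx], preds)
    return scores, mean(scores.values())


# Toy data (assumed): one early-season feature per pixel, label 1 = target crop.
# The 2020 target-crop values are shifted lower to mimic a phenology difference.
years = [2018] * 4 + [2019] * 4 + [2020] * 4
features = [0.8, 0.7, 0.2, 0.3, 0.9, 0.6, 0.1, 0.4, 0.55, 0.45, 0.2, 0.3]
labels = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0]

scores, mean_f1 = yearwise_f1(features, labels, years)
for year, score in scores.items():
    print(year, round(score, 3))
print("mean F1:", round(mean_f1, 3))
```

Because every fold holds out a whole year, no pixel from the test year influences training, so the per-year spread of scores reflects interannual variability rather than within-year sampling noise. In this toy case the shifted year scores lower than the other two. With scikit-learn available, `LeaveOneGroupOut` with years passed as `groups` should yield the same kind of split.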