Retrieval-Augmented Multi-scale Framework for County-Level Crop Yield Prediction Across Large Regions

📅 2026-03-07
🤖 AI Summary
Existing crop yield prediction methods often degrade when applied across large regions and long time spans: they struggle to capture short- and long-term temporal dynamics jointly and are constrained by spatial heterogeneity. To address these challenges, this work proposes a framework that combines multi-scale temporal modeling with a retrieval-augmented mechanism. A new backbone architecture jointly captures daily-scale crop growth dynamics and dependencies across years, and a debiased retrieval-and-refinement pipeline adjusts retrieved samples for cross-year bias not explained by the input features, improving generalization across regions. Evaluated on county-level corn yield prediction across 630 U.S. counties, the method consistently outperforms several types of baselines, and the retrieval-based augmentation improves robustness under spatial heterogeneity.

📝 Abstract
This paper proposes a new method for crop yield prediction, which is essential for developing management strategies, informing insurance assessments, and ensuring long-term food security. Although existing data-driven approaches have shown promise in this domain, their performance often degrades when applied across large geographic regions and long time periods. This limitation arises from two key challenges: (1) difficulty in jointly capturing short-term and long-term temporal patterns, and (2) inability to effectively accommodate spatial data variability in agricultural systems. Ignoring these issues often leads to unreliable predictions for specific regions or years, which ultimately affects policy decisions and resource allocation. In this paper, we propose a new predictive framework to address these challenges. First, we introduce a new backbone model architecture that captures both short-term daily-scale crop growth dynamics and long-term dependencies across years. To further improve generalization across diverse spatial regions, we augment this model with a retrieval-based adaptation strategy. Recognizing the substantial yield variation across years, we design a novel retrieval-and-refinement pipeline that adjusts retrieved samples by removing cross-year bias not explained by input features. Our experiments on real-world county-level corn yield data over 630 counties in the US demonstrate that our method consistently outperforms different types of baselines. The results also verify the effectiveness of the retrieval-based augmentation method in improving model robustness under spatial heterogeneity.
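The retrieval-and-refinement idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract gives no algorithmic details, so everything here is an assumption. The function names `year_bias` and `retrieve_debiased` are hypothetical, a plain least-squares fit stands in for whatever model estimates the feature-explained part of yield, and Euclidean nearest neighbors stand in for the retrieval step. The sketch estimates a per-year bias as the mean residual left after regressing yield on the input features, then subtracts that bias from the yields of the retrieved samples.

```python
import numpy as np

def year_bias(X, y, years):
    """Per-year mean residual of a least-squares fit y ~ X + intercept.

    Assumption: a linear fit approximates the part of yield that the
    input features explain; the leftover per-year mean is the bias.
    """
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return {int(t): float(resid[years == t].mean()) for t in np.unique(years)}

def retrieve_debiased(query, X, y, years, k=3):
    """Return the k nearest samples and their yields with year bias removed."""
    bias = year_bias(X, y, years)
    dist = np.linalg.norm(X - query, axis=1)   # Euclidean distance in feature space
    idx = np.argsort(dist)[:k]                 # indices of the k closest samples
    adjusted = y[idx] - np.array([bias[int(t)] for t in years[idx]])
    return idx, adjusted

# Toy data: same feature values in two years, with a +10 yield shift in 2021.
X = np.array([[0.0], [1.0], [2.0], [3.0], [0.0], [1.0], [2.0], [3.0]])
years = np.array([2020] * 4 + [2021] * 4)
y = 2.0 * X[:, 0] + np.where(years == 2021, 10.0, 0.0)

idx, adjusted = retrieve_debiased(np.array([1.0]), X, y, years, k=2)
```

In this toy setup the two retrieved neighbors come from different years, and after the adjustment their yields agree, which is the effect the abstract attributes to removing cross-year bias before using retrieved samples.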
Problem

Research questions and friction points this paper is trying to address.

crop yield prediction
spatial heterogeneity
temporal patterns
large-scale prediction
agricultural systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-augmented
multi-scale modeling
spatial heterogeneity
crop yield prediction
temporal dynamics