Kriging prior Regression: A Case for Kriging-Based Spatial Features with TabPFN in Soil Mapping

📅 2025-09-11
🤖 AI Summary
Machine learning methods for soil property prediction lack an explicit model of spatial structure, while geostatistical approaches struggle to capture complex relationships with environmental covariates. To address both limitations, this paper proposes kriging prior regression (KpR). KpR uses ordinary kriging to generate spatial lag features that are passed to a machine learning model (here TabPFN) as additional inputs. This inverts the logic of regression kriging, in which kriging is instead applied to the residuals of a regression model. Evaluated on six field-scale datasets from LimeSoDa, KpR with TabPFN gave reliable uncertainty estimates and improved the average R² by around 30% compared with machine learning algorithms without spatial context. It also outperformed other spatial techniques (e.g., regression/residual kriging with TabPFN) and established non-spatial algorithms (e.g., random forest). The gains are attributed to TabPFN's strength on small samples and to the ability of KpR features to compensate for weak relationships between sensing features and soil properties. The authors conclude that KpR with TabPFN is a robust and versatile framework for digital soil mapping in precision agriculture.

📝 Abstract
Machine learning and geostatistics are two fundamentally different frameworks for predicting and spatially mapping soil properties. Geostatistics leverages the spatial structure of soil properties, while machine learning captures the relationship between available environmental features and soil properties. We propose a hybrid framework that enriches ML with spatial context through engineering of 'spatial lag' features from ordinary kriging. We call this approach 'kriging prior regression' (KpR), as it follows the inverse logic of regression kriging. To evaluate this approach, we assessed both the point and probabilistic prediction performance of KpR, using the TabPFN model across six field-scale datasets from LimeSoDa. These datasets included soil organic carbon, clay content, and pH, along with features derived from remote sensing and in-situ proximal soil sensing. KpR with TabPFN demonstrated reliable uncertainty estimates and more accurate predictions in comparison to several other spatial techniques (e.g., regression/residual kriging with TabPFN), as well as to established non-spatial machine learning algorithms (e.g., random forest). Most notably, it significantly improved the average R² by around 30% compared to machine learning algorithms without spatial context. This improvement was due to the strong prediction performance of the TabPFN algorithm itself and the complementary spatial information provided by KpR features. TabPFN is particularly effective for prediction tasks with small sample sizes, common in precision agriculture, whereas KpR can compensate for weak relationships between sensing features and soil properties when proximal soil sensing data are limited. Hence, we conclude that KpR with TabPFN is a very robust and versatile modelling framework for digital soil mapping in precision agriculture.
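The "inverse logic" mentioned in the abstract can be written schematically. The notation here is an assumption made for illustration and is not taken from the paper: s is a location, x(s) the environmental features at s, f a fitted machine learning model, and a subscript OK marks an ordinary kriging prediction built from the training observations.

```latex
\begin{align}
\text{Regression kriging:}\quad
  \hat{y}_{\mathrm{RK}}(s) &= f\big(x(s)\big) + \hat{r}_{\mathrm{OK}}(s),
  \qquad r_i = y_i - f\big(x(s_i)\big), \\
\text{KpR:}\quad
  \hat{y}_{\mathrm{KpR}}(s) &= f\big(x(s),\, \hat{z}_{\mathrm{OK}}(s)\big).
\end{align}
```

In regression kriging the model is fitted first and kriging interpolates its residuals r_i. In KpR the kriging step comes first, and its prediction of the soil property enters the model as one more feature.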
Problem

Research questions and friction points this paper is trying to address.

Combining machine learning and geostatistics for soil mapping
Improving prediction accuracy with spatial lag features
Addressing small sample sizes in precision agriculture
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework combining geostatistics and machine learning
Integration of spatial lag features from ordinary kriging
TabPFN model for prediction with small sample sizes
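The spatial lag features listed above could be built roughly as in the following minimal sketch. It depends only on NumPy, and several choices in it are assumptions rather than the authors' procedure: the exponential variogram with fixed, unfitted parameters, the leave-one-out scheme used for the training rows, the synthetic data, and the helper names `exp_variogram`, `ordinary_kriging`, and `kpr_features`.

```python
import numpy as np


def exp_variogram(h, nugget=0.0, psill=1.0, range_=30.0):
    """Exponential semivariogram. Parameter values are illustrative, not fitted."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + psill * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition


def ordinary_kriging(train_xy, train_z, query_xy, variogram=exp_variogram):
    """Ordinary kriging predictions at query_xy from (train_xy, train_z)."""
    n = len(train_xy)
    d_train = np.linalg.norm(train_xy[:, None, :] - train_xy[None, :, :], axis=-1)
    d_query = np.linalg.norm(train_xy[:, None, :] - query_xy[None, :, :], axis=-1)
    # Kriging system [[Gamma, 1], [1^T, 0]] [w; mu] = [gamma_0; 1].
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(d_train)
    lhs[n, n] = 0.0
    rhs = np.ones((n + 1, len(query_xy)))
    rhs[:n, :] = variogram(d_query)
    weights = np.linalg.solve(lhs, rhs)[:n]  # drop the Lagrange multiplier row
    return weights.T @ train_z


def kpr_features(train_xy, train_z, query_xy, variogram=exp_variogram):
    """Kriging-based feature for training rows (leave-one-out) and query rows."""
    n = len(train_xy)
    train_feat = np.empty(n)
    for i in range(n):
        # Assumption: hold out point i so its own target does not leak into its feature.
        keep = np.arange(n) != i
        train_feat[i] = ordinary_kriging(
            train_xy[keep], train_z[keep], train_xy[i:i + 1], variogram
        )[0]
    query_feat = ordinary_kriging(train_xy, train_z, query_xy, variogram)
    return train_feat, query_feat


# Synthetic stand-in data (hypothetical, not from LimeSoDa).
rng = np.random.default_rng(0)
n_train, n_query = 40, 5
train_xy = rng.uniform(0, 100, size=(n_train, 2))
query_xy = rng.uniform(0, 100, size=(n_query, 2))
train_z = np.sin(train_xy[:, 0] / 20) + np.cos(train_xy[:, 1] / 25)
X_train = rng.normal(size=(n_train, 3))  # placeholder sensing features
X_query = rng.normal(size=(n_query, 3))

train_feat, query_feat = kpr_features(train_xy, train_z, query_xy)
X_train_kpr = np.column_stack([X_train, train_feat])
X_query_kpr = np.column_stack([X_query, query_feat])
print(X_train_kpr.shape, X_query_kpr.shape)
```

A regressor such as TabPFN or a random forest would then be fitted on `X_train_kpr` and `train_z` and applied to `X_query_kpr`. In practice the variogram would be estimated from the data rather than fixed as it is here.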