On Ignorability of Preferential Sampling in Geostatistics

📅 2025-11-05

🤖 AI Summary

This paper addresses estimation bias induced by preferential sampling in geostatistics and investigates the conditions under which the sampling mechanism can be ignored within the framework of Diggle et al. (2010). Likelihood-based approaches correct this bias but are computationally intensive and sensitive to model misspecification. As an alternative, we use theoretical analysis and simulation studies to derive sufficient conditions under which non-likelihood estimators (e.g., weighted least squares and method-of-moments estimators) remain unbiased and consistent: specifically, when covariates are orthogonal to spatial residuals and satisfy certain mixing conditions. Under these conditions, the sampling mechanism does not need to be modeled explicitly. In simulations, the proposed approach substantially reduces computational cost and improves confidence interval coverage, and it is applied to tropical forest carbon stock data. The result is a lightweight route to estimation under preferential sampling that does not require a parametric model of the sampling mechanism.

📝 Abstract

Preferential sampling has attracted considerable attention in geostatistics since the pioneering work of Diggle et al. (2010). A variety of likelihood-based approaches have been developed to correct estimation bias by explicitly modelling the sampling mechanism. While effective in many applications, these methods are often computationally expensive and can be susceptible to model misspecification. In this paper, we present a surprising finding: some existing non-likelihood-based methods that ignore preferential sampling can still produce unbiased and consistent estimators under the widely used framework of Diggle et al. (2010) and its extensions. We investigate the conditions under which preferential sampling can be ignored and develop relevant estimators for both regression and covariance parameters without specifying the sampling mechanism parametrically. Simulation studies demonstrate clear advantages of our approach, including reduced estimation error, improved confidence interval coverage, and substantially lower computational cost. To show the practical utility, we further apply it to a tropical forest data set.
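To make the notion of preferential sampling concrete, the following is a minimal simulation sketch, not the paper's method or estimators. It assumes a one-dimensional zero-mean Gaussian field with exponential covariance on a grid, and selects observation sites with probability proportional to exp(beta * S(x)), which loosely mimics the sampling intensity in the Diggle et al. (2010) framework. The function names, the grid, the range parameter, and the value of beta are all illustrative assumptions. The sketch only shows the well-known effect that a naive sample mean drifts upward when high-valued sites are favoured (beta > 0), which is the bias that motivates the question of when the sampling mechanism can be ignored.

```python
import numpy as np


def exponential_cholesky(n, range_param):
    """Cholesky factor of an exponential covariance on a regular grid over [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    dist = np.abs(x[:, None] - x[None, :])
    cov = np.exp(-dist / range_param)
    # A small jitter keeps the factorisation numerically stable.
    return np.linalg.cholesky(cov + 1e-10 * np.eye(n))


def sample_sites(field, m, beta, rng):
    """Pick m distinct grid sites with probability proportional to exp(beta * field)."""
    weights = np.exp(beta * field)
    probs = weights / weights.sum()
    return rng.choice(field.size, size=m, replace=False, p=probs)


def naive_mean_bias(beta, n_rep=200, n=200, m=30, range_param=0.2, seed=0):
    """Average (sample mean - grid mean) of the field over repeated simulations."""
    rng = np.random.default_rng(seed)
    chol = exponential_cholesky(n, range_param)
    errors = []
    for _ in range(n_rep):
        field = chol @ rng.standard_normal(n)  # one realisation of S
        idx = sample_sites(field, m, beta, rng)
        errors.append(field[idx].mean() - field.mean())
    return float(np.mean(errors))


if __name__ == "__main__":
    # beta = 0 gives uniform (non-preferential) sampling.
    print("uniform sampling bias:     ", naive_mean_bias(beta=0.0))
    # beta > 0 favours sites where the field is large.
    print("preferential sampling bias:", naive_mean_bias(beta=2.0))
```

Under these assumptions the uniform design should give an average error close to zero, while the preferential design gives a clearly positive one. The sketch does not include covariates or covariance-parameter estimation, which are the settings the paper analyses.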

Problem

Research questions and friction points this paper is trying to address.

Investigates when preferential sampling can be ignored in geostatistics

Develops unbiased estimators without parametric sampling mechanism specification

Addresses computational cost and model misspecification limitations of existing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ignoring preferential sampling under certain conditions

Developing unbiased estimators without parametric sampling models

Reducing computational cost while maintaining estimation accuracy
