Evaluating Multilevel Regression and Poststratification with Spatial Priors with a Big Data Behavioural Survey

📅 2025-03-07
🤖 AI Summary
This study evaluates multilevel regression and poststratification (MRP) with a BYM2 spatial prior for estimating county-level first-dose COVID-19 vaccination rates in California (up to June 2021) from a large, unrepresentative behavioral survey. It is presented as the first empirical assessment of spatial MRP on real-world, large-scale behavioral data, using the U.S. Census Bureau's American Community Survey (ACS) for poststratification and reported first-dose vaccination counts from the CDC as the benchmark. Neither classic nor spatial MRP performed well. Some demographic groups in the survey are over-sampled and over-aggregated, which leads the models to over-smooth. The main contribution is to mark out where spatial MRP is likely to apply: it appears best suited to richer data contexts with more representative samples. The authors also call for survey producers to share user-representative metrics so that estimates can be benchmarked more reliably.

📝 Abstract
Multilevel regression and poststratification (MRP) is a computationally efficient indirect estimation method that can quickly produce improved population-adjusted estimates with limited data. Recent computational advancements allow efficient, relatively simple, and quick approximate Bayesian estimation for MRP. As population health outcomes of interest including vaccination uptake are known to have spatial structure, precision may be gained by including space in the model. We test a recently proposed spatial MRP method that includes a BYM2 spatial term that smooths across demographics and geographic areas using a large, unrepresentative survey. We produce California county-level estimates of first-dose COVID-19 vaccination up to June 2021 using classic and spatial MRP models, and poststratify using data from the American Community Survey (US Census Bureau). We assess validity using reported first-dose vaccination counts from the Centers for Disease Control (CDC). Neither classic nor spatial MRP models performed well, highlighting: 1. spatial MRP may be most appropriate for richer data contexts, 2. some demographics in the survey data are over-sampled and -aggregated, producing model over-smoothing, and 3. a need for survey producers to share user-representative metrics to better benchmark estimates.
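
For readers new to the method, the two ingredients named in the abstract are often written as below. This is a generic sketch in assumed notation, not the paper's exact specification: $c$ indexes counties, $j$ indexes demographic cells, $N_{cj}$ is the cell population from the poststratification table, $\hat{p}_{cj}$ is the model-predicted vaccination probability, and $b_c$ is a BYM2 county effect with total standard deviation $\sigma$ and mixing parameter $\phi \in [0, 1]$.

```latex
% Poststratification: population-weighted average of cell-level predictions
\hat{\theta}_c = \frac{\sum_{j} N_{cj}\,\hat{p}_{cj}}{\sum_{j} N_{cj}}

% BYM2 county effect: mixture of unstructured and spatially structured parts
b_c = \sigma\left(\sqrt{1-\phi}\;v_c + \sqrt{\phi}\;u^{*}_c\right),
\qquad v_c \sim \mathcal{N}(0, 1)
```

Here $u^{*}$ is an intrinsic CAR (ICAR) field scaled to have typical marginal variance one, so $\phi$ can be read as the share of county-level variance attributed to spatial structure. Values of $\phi$ near 1 pull neighbouring counties toward each other, which is where over-smoothing can arise when the data are sparse.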
Problem

Research questions and friction points this paper is trying to address.

Evaluating spatial MRP for COVID-19 vaccination estimates
Assessing validity of classic vs spatial MRP models
Identifying limitations in survey data representativeness
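
As a rough illustration of what assessing validity against a benchmark can involve, the sketch below compares county estimates with reference values using bias, mean absolute error, and root mean squared error. The function name `validation_metrics` and all numbers are hypothetical; the metrics the authors actually report are not stated in this summary.

```python
def validation_metrics(estimates, benchmark):
    """Compare area-level estimates to benchmark values.

    Both arguments map an area name to a proportion in [0, 1].
    Only areas present in both mappings are compared.
    """
    areas = sorted(set(estimates) & set(benchmark))
    if not areas:
        raise ValueError("no areas in common between estimates and benchmark")
    errors = [estimates[a] - benchmark[a] for a in areas]
    n = len(errors)
    return {
        "bias": sum(errors) / n,                       # mean signed error
        "mae": sum(abs(e) for e in errors) / n,        # mean absolute error
        "rmse": (sum(e * e for e in errors) / n) ** 0.5,
    }


# Made-up values for two hypothetical counties (not results from the paper).
mrp_estimates = {"County A": 0.60, "County B": 0.50}
reference_rates = {"County A": 0.50, "County B": 0.55}
metrics = validation_metrics(mrp_estimates, reference_rates)
```

A signed bias alongside the absolute errors helps separate systematic over- or under-estimation from county-to-county noise.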
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Multilevel Regression and Poststratification (MRP)
Incorporates BYM2 spatial term for smoothing
Leverages American Community Survey for poststratification
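
The poststratification step listed above can be sketched as follows: cell-level predictions from a fitted model are weighted by population counts and averaged within each county. This is a minimal sketch under stated assumptions. The function name `poststratify`, the two age cells, and all numbers are hypothetical; in practice the counts would come from ACS tables and the predictions from the fitted multilevel model.

```python
from collections import defaultdict


def poststratify(cell_predictions, cell_counts):
    """Aggregate cell-level predicted probabilities to area-level estimates.

    cell_predictions: {(area, cell): predicted probability}
    cell_counts:      {(area, cell): population count}
    Returns {area: population-weighted mean probability}.
    """
    weighted = defaultdict(float)
    totals = defaultdict(float)
    for (area, cell), count in cell_counts.items():
        weighted[area] += count * cell_predictions[(area, cell)]
        totals[area] += count
    # Skip areas with zero population to avoid division by zero.
    return {a: weighted[a] / totals[a] for a in totals if totals[a] > 0}


# Made-up inputs for two hypothetical counties and two age cells.
predictions = {
    ("County A", "18-64"): 0.50,
    ("County A", "65+"): 0.80,
    ("County B", "18-64"): 0.40,
    ("County B", "65+"): 0.70,
}
counts = {
    ("County A", "18-64"): 8000,
    ("County A", "65+"): 2000,
    ("County B", "18-64"): 5000,
    ("County B", "65+"): 5000,
}
county_estimates = poststratify(predictions, counts)
```

Because the weights come from the population table rather than the sample, a cell that is over-sampled in the survey does not automatically dominate the county estimate. It can still distort the fitted predictions themselves.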
Authors
Aja Sutton
Department of Environmental Social Sciences, Stanford University, Yang & Yamazaki (Y2E2) Building, 473 Via Ortega, Room 350, 94305, California, United States of America
Zack W. Almquist
Department of Sociology, University of Washington, Savery Hall, Room 211, 4100 Spokane Ln., Seattle, 98195, Washington, United States of America; Department of Statistics, University of Washington, C-138 Padelford Hall, Box 35435, Seattle, 98195, Washington, United States of America
Jon Wakefield
Professor of Statistics and Biostatistics, University of Washington
statistics, biostatistics, epidemiology