Synthetic Data Generation With Incomplete Survey Data Under Informative Sampling

📅 2026-05-29
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses variance underestimation and inferential bias in synthetic data generated from complex surveys with informative sampling and missing data. The authors propose a Bayesian framework that handles imputation and synthesis together, using an adaptive weighting scheme for parameter estimation. The weighting yields consistent estimators with an asymptotically valid Godambe information matrix, which addresses the variance underestimation of existing Bayesian synthesis approaches. Simulation studies show that the method provides accurate uncertainty quantification for both model parameters and synthetic population inference.
📝 Abstract
We propose a Bayesian framework for data synthesis and imputation in complex survey settings with informative sampling. To address variance underestimation in existing Bayesian approaches and to accommodate the missing data encountered in survey data, we introduce an adaptive weighting scheme for parameter estimation. We show that the proposed weighting yields consistent estimators with an asymptotically valid Godambe information matrix. The framework is flexible, accommodating a broad class of Bayesian models and facilitating practical implementation. Simulation studies demonstrate that the proposed method provides accurate uncertainty quantification for both model parameters and synthetic population inference.
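To make the terms "informative sampling" and "Godambe information" concrete, here is a minimal Python sketch. It is not the paper's adaptive weighting scheme, which the abstract does not specify. It assumes a hypothetical normal population, a Poisson sampling design whose inclusion probability grows with the outcome, and plain inverse-probability weights. The function names `simulate_informative_sample` and `weighted_mean_with_sandwich` are made up for illustration. The sketch estimates a population mean by solving a weighted score equation and computes the sandwich variance, which is the inverse of the Godambe information for that estimating equation.

```python
import numpy as np


def simulate_informative_sample(n_pop=400_000, seed=0):
    """Draw a Poisson sample whose inclusion probability depends on the outcome."""
    rng = np.random.default_rng(seed)
    # Hypothetical finite population (an assumption, not taken from the paper).
    y = rng.normal(loc=0.0, scale=1.0, size=n_pop)
    # Larger outcomes are more likely to be sampled, so the design is informative.
    pi = 0.01 * (0.2 + 0.8 / (1.0 + np.exp(-2.0 * y)))
    sampled = rng.random(n_pop) < pi
    # Return sampled outcomes, inverse-probability weights, and the true mean.
    return y[sampled], 1.0 / pi[sampled], y.mean()


def weighted_mean_with_sandwich(y, w):
    """Solve sum_i w_i * (y_i - mu) = 0 and return (mu, sandwich variance)."""
    mu = np.sum(w * y) / np.sum(w)
    # Sensitivity term H: minus the derivative of the weighted score in mu.
    h = np.sum(w)
    # Variability term J: approximate variance of the weighted score.
    # The (1 - pi_i) factor is dropped because pi_i <= 0.01 here.
    j = np.sum(w**2 * (y - mu) ** 2)
    # Inverse Godambe information: H^-1 * J * H^-1.
    return mu, j / h**2


if __name__ == "__main__":
    y, w, pop_mean = simulate_informative_sample()
    mu, var = weighted_mean_with_sandwich(y, w)
    print("population mean:     ", pop_mean)
    print("unweighted mean:     ", y.mean())
    print("weighted mean:       ", mu)
    print("sandwich std. error: ", np.sqrt(var))
```

In this toy setup the unweighted sample mean is pulled toward large outcomes, while the weighted estimator targets the population mean. The sandwich form matters because the weighted score is not a true likelihood score, so the usual inverse-information variance does not apply directly.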
Problem

Research questions and friction points this paper is trying to address.

Synthetic Data Generation
Informative Sampling
Missing Data
Bayesian Framework
Uncertainty Quantification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Synthesis
Informative Sampling
Adaptive Weighting
Godambe Information
Missing Data Imputation