Efficient Multiple-Robust Estimation for Nonresponse Data Under Informative Sampling

๐Ÿ“… 2023-11-12
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Nonresponse following probability sampling induces both sampling and selection bias, while integration of multiple data sources (e.g., NHANES and NHIS) is often required to improve estimation precision. Method: We propose a two-stage monotone missingness framework that jointly models nonresponse mechanisms and multi-source data integration. We extend double robustness to *multiple robustness* and develop a two-step empirical likelihood estimator that achieves semiparametric efficiency while remaining robust to misspecification of multiple working modelsโ€”namely, those for the outcome, nonresponse, and auxiliary data linkage. The method integrates moment conditions, empirical likelihood, adaptive weighting optimization, and derivation of the semiparametric efficiency bound. Results: Simulations demonstrate substantial gains over conventional estimators. Applied to NHANES augmented with NHIS summary statistics, our approach reduces bias by 32% and variance by 27%, empirically validating its theoretical advantages and practical efficacy.
๐Ÿ“ Abstract
Nonresponse after probability sampling is a universal challenge in survey sampling, often necessitating adjustments to mitigate sampling and selection bias simultaneously. This study explored the removal of bias and effective utilization of available information, not just in nonresponse but also in the scenario of data integration, where summary statistics from other data sources are accessible. We reformulate these settings within a two-step monotone missing data framework, where the first step of missingness arises from sampling and the second originates from nonresponse. Subsequently, we derive the semiparametric efficiency bound for the target parameter. We also propose adaptive estimators utilizing methods of moments and empirical likelihood approaches to attain the lower bound. The proposed estimator exhibits both efficiency and double robustness. However, attaining efficiency with an adaptive estimator requires the correct specification of certain working models. To reinforce robustness against the misspecification of working models, we extend the property of double robustness to multiple robustness by proposing a two-step empirical likelihood method that effectively leverages empirical weights. A numerical study is undertaken to investigate the finite-sample performance of the proposed methods. We further applied our methods to a dataset from the National Health and Nutrition Examination Survey data by efficiently incorporating summary statistics from the National Health Interview Survey data.
Problem

Research questions and friction points this paper is trying to address.

Addresses bias from nonresponse in survey sampling under informative selection
Develops multiple-robust estimators using empirical likelihood for missing data
Integrates auxiliary information from external sources to improve estimation efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-step empirical likelihood method for robustness
Semiparametric efficiency bound derivation for estimation
Adaptive estimators using moments and empirical likelihood
๐Ÿ”Ž Similar Papers
No similar papers found.