Efficient Targeted Maximum Likelihood Estimators for Two-Phase Design Problems

📅 2026-02-27
🤖 AI Summary
This study addresses the coarsened data that arise in two-phase designs, where phase 1 collects only a subset of variables on a random sample from the target population and phase 2 measures additional variables on a subsample of the phase-1 cohort. Under coarsening at random, meaning the phase-2 sampling mechanism depends only on fully observed variables, the authors review existing estimators, including the generalized raking estimator and inverse probability of censoring weighted targeted maximum likelihood estimation (IPCW-TMLE), along with extensions that also target the phase-2 sampling mechanism to improve efficiency. They then introduce a new class of estimators, constructed within the TMLE framework, that are asymptotically equivalent.

📝 Abstract
In a typical two-phase design, a random sample is drawn from the target population in phase 1, during which only a subset of variables is collected. In phase 2, a subsample of the phase-1 cohort is selected, and additional variables are measured. This setting induces a coarsened data structure on the data from the second phase. We assume coarsening at random, that is, the phase-2 sampling mechanism depends only on variables fully observed. We review existing estimators, including the generalized raking estimator and the inverse probability of censoring weighted targeted maximum likelihood estimation (IPCW-TMLE) along with its extensions that also target the phase-2 sampling mechanism to improve efficiency. We further introduce a new class of estimators constructed within the TMLE framework that are asymptotically equivalent.
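The inverse probability weighting idea behind the IPCW estimators mentioned above can be illustrated with a minimal sketch. This is not the authors' TMLE procedure: it is a plain weighted (Hajek-type) mean on simulated data, assuming the phase-2 sampling probabilities are known by design. The data-generating model and the function names `simulate_two_phase`, `ipcw_mean`, and `complete_case_mean` are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_two_phase(n, seed=0):
    """Simulate a toy two-phase sample (all modelling choices are illustrative)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                          # covariate observed in phase 1
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-w)))   # outcome observed in phase 1
    x = 1.0 + 0.5 * w + y + rng.normal(size=n)      # phase-2 variable, E[X] = 1.5
    # Coarsening at random: selection depends only on the fully observed y.
    pi = np.where(y == 1, 0.6, 0.2)
    delta = rng.binomial(1, pi)                     # 1 if selected into phase 2
    x_obs = np.where(delta == 1, x, np.nan)         # x is missing outside phase 2
    return y, x_obs, delta, pi


def ipcw_mean(x_obs, delta, pi):
    """Weighted mean of x over the phase-2 subsample, weights 1 / pi."""
    sel = delta == 1
    weights = 1.0 / pi[sel]
    return np.sum(weights * x_obs[sel]) / np.sum(weights)


def complete_case_mean(x_obs, delta):
    """Unweighted mean over the phase-2 subsample, ignoring the design."""
    return np.mean(x_obs[delta == 1])
```

In this toy setup, units with `y == 1` are oversampled into phase 2 and tend to have larger `x`, so the unweighted complete-case mean is biased upward, while the weighted mean targets the population mean of 1.5. The estimators discussed in the abstract go further by also modelling or targeting the phase-2 sampling mechanism.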
Problem

Research questions and friction points this paper is trying to address.

two-phase design
coarsened data
targeted maximum likelihood estimation
efficient estimation
sampling mechanism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targeted Maximum Likelihood Estimation
Two-Phase Sampling
Efficiency
Coarsening at Random
Asymptotic Equivalence
Authors
Sky Qiu
Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA; Targeted ML Solutions Inc., Cambridge, MA, USA
Susan Gruber
Targeted ML Solutions Inc., Cambridge, MA, USA
Pamela A. Shaw
Kaiser Permanente Washington Health Research Institute
Biostatistics, Measurement Error, Survival Analysis, Clinical Trials, Nutritional and Physical Epidemiology
Brian D. Williamson
Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA; Department of Biostatistics, University of Washington, Seattle, WA, USA
Mark J. van der Laan
Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA