Causal Effect Estimation with TMLE: Handling Missing Data and Near-Violations of Positivity

📅 2025-10-25
🤖 AI Summary
This study evaluates the robustness of Targeted Maximum Likelihood Estimation (TMLE) for estimating the Average Treatment Effect (ATE) under missing data and varying degrees of positivity violation. Using model-based simulations and design-based simulations built on real-world data from the WASH Benefits Bangladesh trial (generated with an undersmoothed Highly Adaptive Lasso [HAL]), we compare non-multiple-imputation (non-MI) strategies, including Complete-Case Analysis (CCA), with Multiple Imputation (MI) strategies across missingness mechanisms (MCAR, MAR, MNAR). The MI strategies use both parametric and machine-learning models, such as Classification and Regression Trees (CART). A "CCA+TMLE" approach, in which TMLE on complete cases incorporates an outcome-missingness model, shows lower bias than all other evaluated methods and greater robustness to positivity violations. MI with CART yields lower root mean squared error (RMSE) and often maintains nominal confidence interval coverage. Together, these results expose a trade-off between bias and coverage.

📝 Abstract
We evaluate the performance of targeted maximum likelihood estimation (TMLE) for estimating the average treatment effect in missing data scenarios under varying levels of positivity violations. We employ model- and design-based simulations, with the latter using undersmoothed highly adaptive lasso on the 'WASH Benefits Bangladesh' dataset to mimic real-world complexities. Five missingness-directed acyclic graphs are considered, capturing common missing data mechanisms in epidemiological research, particularly in one-point exposure studies. These mechanisms also include not-at-random missingness in the exposure, outcome, and confounders. We compare eight missing data methods in conjunction with TMLE as the analysis method, distinguishing between non-multiple imputation (non-MI) and multiple imputation (MI) approaches. The MI approaches use both parametric and machine-learning models. Results show that non-MI methods, particularly complete cases with TMLE incorporating an outcome-missingness model, exhibit lower bias than all other evaluated missing data methods and greater robustness against positivity violations. In comparison, MI with classification and regression trees (CART) achieves lower root mean squared error while often maintaining nominal coverage rates. Our findings highlight the trade-offs between bias and coverage, and we recommend using complete cases with TMLE incorporating an outcome-missingness model for bias reduction and MI CART when accurate confidence intervals are the priority.
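To make the idea of "complete cases with TMLE incorporating an outcome-missingness model" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary outcome, a single confounder, outcome-only missingness, and plain logistic regressions in place of the machine-learning learners a real analysis would typically use. All function names (`fit_logistic`, `tmle_ate_missing_outcome`, `simulate`) and the simulated data-generating process are hypothetical and exist only for illustration.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=50):
    """Logistic regression via Newton-Raphson; X must already hold any intercept column."""
    X = np.asarray(X, dtype=float)
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def tmle_ate_missing_outcome(W, A, Y, D, bound=0.01):
    """TMLE of the ATE when Y is observed only if D == 1 (illustrative sketch)."""
    one = np.ones(len(A))
    obs = D == 1
    X_a = np.column_stack([one, A, W])          # design with observed exposure
    X_1 = np.column_stack([one, one, W])        # design with exposure set to 1
    X_0 = np.column_stack([one, 0.0 * one, W])  # design with exposure set to 0

    # Initial outcome regression, fitted on complete cases only.
    b_q = fit_logistic(X_a[obs], Y[obs])
    Q1 = np.clip(expit(X_1 @ b_q), 1e-6, 1 - 1e-6)
    Q0 = np.clip(expit(X_0 @ b_q), 1e-6, 1 - 1e-6)
    QA = np.where(A == 1, Q1, Q0)

    # Propensity score P(A=1 | W), bounded to limit the impact of near-positivity violations.
    X_g = np.column_stack([one, W])
    g1 = np.clip(expit(X_g @ fit_logistic(X_g, A)), bound, 1 - bound)

    # Outcome-missingness model P(D=1 | A, W), bounded away from zero.
    b_d = fit_logistic(X_a, D)
    pi1 = np.clip(expit(X_1 @ b_d), bound, 1.0)
    pi0 = np.clip(expit(X_0 @ b_d), bound, 1.0)

    # Clever covariate: inverse weights for both treatment and observation.
    H1 = 1.0 / (g1 * pi1)
    H0 = -1.0 / ((1.0 - g1) * pi0)
    HA = np.where(A == 1, H1, H0)

    # Targeting step: one-parameter logistic fluctuation on complete cases.
    eps = fit_logistic(HA[obs][:, None], Y[obs], offset=logit(QA[obs]))[0]
    Q1_star = expit(logit(Q1) + eps * H1)
    Q0_star = expit(logit(Q0) + eps * H0)

    # Plug-in estimate averaged over everyone, including rows with missing Y.
    return float(np.mean(Q1_star - Q0_star))

def simulate(n=5000, seed=0):
    """Hypothetical data: outcome missingness depends on A and W."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=n)
    A = rng.binomial(1, expit(0.5 * W))
    Y = rng.binomial(1, expit(-0.5 + A + 0.8 * W)).astype(float)
    D = rng.binomial(1, expit(1.5 - 0.5 * A + 0.5 * W))
    Y[D == 0] = np.nan
    return W, A, Y, D
```

Calling `tmle_ate_missing_outcome(*simulate())` returns a single ATE estimate on the risk-difference scale. The `bound` argument truncates the propensity and observation probabilities; tightening or loosening it is one simple way to explore how sensitive the estimate is when positivity is nearly violated.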
Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects with TMLE under missing data conditions
Addressing positivity violations in observational study designs
Comparing imputation methods for exposure, outcome, and confounder missingness
Innovation

Methods, ideas, or system contributions that make the work stand out.

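Applies TMLE to complete cases while incorporating an outcome-missingness model
Uses undersmoothed highly adaptive lasso in design-based simulations to mimic real-world complexity
Compares non-MI approaches with multiple imputation based on parametric and machine-learning models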
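When multiple imputation is combined with TMLE, the per-imputation estimates have to be pooled into one estimate and one interval. The source does not state which pooling rule was used; the sketch below assumes the standard Rubin's rules and a normal approximation for the interval (a t reference distribution is the more common choice in practice). The function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import NormalDist, mean

def pool_rubin(estimates, variances, level=0.95):
    """Pool m point estimates and their variances with Rubin's rules (assumed pooling rule)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)                                        # pooled point estimate
    within = mean(variances)                                       # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    total = within + (1 + 1 / m) * between                         # total variance
    se = total ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)                      # normal approximation, an assumption
    return {
        "estimate": q_bar,
        "total_variance": total,
        "ci": (q_bar - z * se, q_bar + z * se),
    }

# Hypothetical ATE estimates and variances from three imputed datasets.
pooled = pool_rubin([0.20, 0.22, 0.24], [0.0004, 0.0004, 0.0004])
```

The between-imputation term is what widens the interval relative to a single completed dataset, which is one reason MI-based intervals can come closer to nominal coverage.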
Christoph Wiederkehr
Department of Statistics, Ludwig-Maximilians University Munich, Munich, Germany
Christian Heumann
Professor of Statistics, Ludwig-Maximilians-Universität München
Statistics
Michael Schomaker
Department of Statistics, Ludwig-Maximilians University Munich, Munich, Germany