Flexible and Efficient Estimation of Causal Effects with Error-Prone Exposures: A Control Variates Approach for Measurement Error

📅 2024-10-16
🤖 AI Summary
Measurement error in exposure variables commonly biases causal effect estimation in observational studies; existing methods either rely on strong parametric assumptions or are tailored to rigidly defined estimands and lack flexibility. To address this, we propose a doubly robust estimation framework based on control variates, adapted from Yang and Ding (2019). Our approach accommodates a variety of two-phase sampling designs, in which gold-standard exposure measurements are available only for a validation subsample of the full cohort, and it augments a consistent validation-data estimator with a variance reduction term computed from the full data. Simulation studies show favorable performance relative to leading methods under several two-phase sampling schemes. We illustrate the method on HIV electronic health record data from the Vanderbilt Comprehensive Care Clinic.

📝 Abstract
Exposure measurement error is a ubiquitous but often overlooked challenge in causal inference with observational data. Existing methods accounting for exposure measurement error largely rely on restrictive parametric assumptions, while emerging data-adaptive estimation approaches allow for less restrictive assumptions but at the cost of flexibility, as they are typically tailored towards rigidly defined statistical quantities. There remains a critical need for assumption-lean estimation methods that are both flexible and possess desirable theoretical properties across a variety of study designs. In this paper, we introduce a general framework for estimation of causal quantities in the presence of exposure measurement error, adapted from the control variates approach of Yang and Ding (2019). Our method can be implemented in various two-phase sampling study designs, where one obtains gold-standard exposure measurements for a small subset of the full study sample, called the validation data. The control variates framework leverages both the error-prone and error-free exposure measurements by augmenting an initial consistent estimator from the validation data with a variance reduction term formed from the full data. We show that our method inherits double-robustness properties under standard causal assumptions. Simulation studies show that our approach performs favorably compared to leading methods under various two-phase sampling schemes. We illustrate our method with observational electronic health record data on HIV outcomes from the Vanderbilt Comprehensive Care Clinic.
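
To make the control variates idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a randomized binary exposure (so a plain difference in means stands in for the doubly robust estimators the abstract refers to), a validation subsample drawn completely at random, and a hypothetical simulation with a 15% misclassification rate. All function names and parameter values are illustrative assumptions. The adjustment subtracts a scaled copy of the gap between the error-prone estimate on the validation data and the error-prone estimate on the full data; that gap has mean close to zero, so it should reduce variance without shifting the target.

```python
import numpy as np

def diff_in_means(y, a):
    """Difference in mean outcome between exposed (a == 1) and unexposed (a == 0)."""
    return y[a == 1].mean() - y[a == 0].mean()

def influence_values(y, a):
    """Estimated influence-function values of the difference-in-means estimator."""
    p = a.mean()
    mu1, mu0 = y[a == 1].mean(), y[a == 0].mean()
    return a * (y - mu1) / p - (1 - a) * (y - mu0) / (1 - p)

def control_variates_estimate(y, a_star, a_true, in_validation):
    """Return (validation-only estimate, control-variates estimate).

    a_true is only read on rows where in_validation is True.
    Assumes the validation rows are a simple random subsample (illustrative).
    """
    v = in_validation
    tau_val = diff_in_means(y[v], a_true[v])      # consistent, but uses few rows
    tau_ep_val = diff_in_means(y[v], a_star[v])   # error-prone, validation rows
    tau_ep_full = diff_in_means(y, a_star)        # error-prone, all rows

    # Coefficient Cov(tau_val, gap) / Var(gap). Under simple random
    # subsampling the (1/n_val - 1/n) factors cancel, leaving a ratio of
    # influence-function moments estimated on the validation rows.
    phi = influence_values(y[v], a_true[v])
    phi_ep = influence_values(y[v], a_star[v])
    cov = np.cov(phi, phi_ep)
    coef = cov[0, 1] / cov[1, 1]

    tau_cv = tau_val - coef * (tau_ep_val - tau_ep_full)
    return tau_val, tau_cv

def simulate_once(rng, n=5000, n_val=500, effect=1.0, flip=0.15):
    """One hypothetical data set: randomized exposure, misclassified measurement."""
    a = rng.binomial(1, 0.5, size=n)
    a_star = np.where(rng.random(n) < flip, 1 - a, a)   # error-prone exposure
    y = effect * a + rng.normal(size=n)
    v = np.zeros(n, dtype=bool)
    v[rng.choice(n, size=n_val, replace=False)] = True
    a_true = np.where(v, a, -1)   # gold standard is unknown outside validation
    return control_variates_estimate(y, a_star, a_true, v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    draws = np.array([simulate_once(rng) for _ in range(300)])
    print("mean (validation only, control variates):", draws.mean(axis=0))
    print("variance (validation only, control variates):", draws.var(axis=0))
```

In this toy setting both estimators should center near the simulated effect, with the control variates version showing the smaller spread across replications. With confounded exposures, the difference in means would need to be replaced by an estimator that adjusts for covariates.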
Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects with error-prone exposure measurements
Addressing restrictive parametric assumptions in existing methods
Providing flexible, assumption-lean estimation for various study designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Control variates approach for measurement error
Two-phase sampling with validation data
Double-robustness under causal assumptions
Authors

Keith Barnatchez
Department of Biostatistics, Harvard T.H. Chan School of Public Health
Rachel Nethery
Department of Biostatistics, Harvard T.H. Chan School of Public Health
Bryan E. Shepherd
Professor of Biostatistics, Vanderbilt University
Giovanni Parmigiani
Professor, Department of Data Science, DFCI
Applied Statistics, Bayesian Statistics, Cancer Prevention, Cancer Genetics/Genomics
K. Josey
Department of Biostatistics and Informatics, Colorado School of Public Health