🤖 AI Summary
Estimating the average treatment effect (ATE) under differential privacy from sensitive observational data (e.g., healthcare or economic records) is hampered by strong modeling assumptions, privacy costs that scale with estimator complexity, and limited methodological flexibility. This paper proposes the first general framework that decouples nuisance-parameter estimation from privacy protection via fold-splitting and ensemble prediction perturbation. Without assuming a specific data-generating mechanism or parametric model form, it provides unified differential-privacy guarantees for diverse ATE estimators, including the G-formula, inverse propensity weighting (IPW), and augmented IPW (AIPW), and it further extends to differentially private meta-analysis of ATEs across multiple private data sources. The paper establishes rigorous differential-privacy and statistical-efficiency guarantees, and empirical evaluation shows that, under realistic privacy budgets, the method achieves ATE estimation accuracy close to non-private baselines while enabling robust integration of results across sources.
📝 Abstract
Estimating causal effects from observational data is essential in fields such as medicine, economics, and social sciences, where privacy concerns are paramount. We propose a general, model-agnostic framework for differentially private estimation of the average treatment effect (ATE) that avoids strong structural assumptions on the data-generating process or on the models used to estimate propensity scores and conditional outcomes. In contrast to prior work, which enforces differential privacy by directly privatizing these nuisance components and thus incurs a privacy cost that scales with model complexity, our approach decouples nuisance estimation from privacy protection. This separation allows the use of flexible, state-of-the-art black-box models, while differential privacy is achieved by perturbing only predictions and aggregation steps within a fold-splitting scheme with ensemble techniques. We instantiate the framework for three classical estimators -- the G-formula, inverse propensity weighting (IPW), and augmented IPW (AIPW) -- and provide formal utility and privacy guarantees. Empirical results show that our methods maintain competitive performance under realistic privacy budgets. We further extend our framework to support meta-analysis of multiple private ATE estimates. Our results bridge a critical gap between causal inference and privacy-preserving data analysis.
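To make the decoupling idea concrete, here is a rough, hypothetical sketch (not the paper's actual algorithm) of a cross-fitted AIPW estimator with output perturbation: black-box nuisance models are fit on held-out folds, per-sample influence values are clipped to bound each record's contribution, and Laplace noise calibrated to that bound is added only to the final aggregate. All function names and parameters (`fit_outcome`, `fit_propensity`, `clip`, etc.) are assumptions for illustration.

```python
import numpy as np

def dp_aipw_ate(X, T, Y, fit_outcome, fit_propensity, epsilon,
                n_folds=5, clip=1.0, rng=None):
    """Hypothetical sketch: cross-fitted AIPW ATE with output perturbation.

    fit_outcome(X, T, Y) returns a callable mu(X, t); fit_propensity(X, T)
    returns a callable e(X). The nuisance models are treated as black boxes;
    privacy comes only from clipping the per-sample influence values and
    adding Laplace noise to their mean.
    """
    rng = np.random.default_rng(rng)
    n = len(Y)
    folds = np.array_split(rng.permutation(n), n_folds)
    psi = np.empty(n)
    for k in range(n_folds):
        test = folds[k]
        train = np.setdiff1d(np.arange(n), test)
        mu = fit_outcome(X[train], T[train], Y[train])   # outcome model
        e = fit_propensity(X[train], T[train])           # propensity model
        ps = np.clip(e(X[test]), 0.05, 0.95)             # overlap trimming
        m1, m0 = mu(X[test], 1), mu(X[test], 0)
        # AIPW influence values on the held-out fold
        psi[test] = (m1 - m0
                     + T[test] * (Y[test] - m1) / ps
                     - (1 - T[test]) * (Y[test] - m0) / (1 - ps))
    psi = np.clip(psi, -clip, clip)        # bound each record's influence
    sensitivity = 2 * clip / n             # sensitivity of the clipped mean
    return psi.mean() + rng.laplace(scale=sensitivity / epsilon)
```

The key design point this sketch illustrates is that the privacy cost is independent of the complexity of `fit_outcome` and `fit_propensity`: only the clipped aggregate is released, so arbitrarily flexible models can be plugged in.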