Model Agnostic Differentially Private Causal Inference

📅 2025-05-26
📈 Citations: 1
Influential: 0
🤖 AI Summary
Estimating the average treatment effect (ATE) under differential privacy from privacy-sensitive observational data—e.g., healthcare or economic records—faces challenges including strong modeling assumptions, privacy cost scaling with estimator complexity, and limited methodological flexibility. This paper proposes the first general framework that decouples nuisance parameter estimation from privacy protection via fold-splitting and ensemble prediction perturbation. Without assuming a specific data-generating mechanism or parametric model form, it delivers unified differential privacy guarantees for diverse ATE estimators—including the G-formula, inverse probability weighting (IPW), and augmented IPW (AIPW). The framework further extends to differentially private meta-analysis of ATEs across multiple private data sources. The authors establish rigorous differential privacy and statistical efficiency guarantees. Empirical evaluation demonstrates that, under realistic privacy budgets, the method achieves ATE estimation accuracy close to non-private baselines while enabling robust cross-source result integration.

📝 Abstract
Estimating causal effects from observational data is essential in fields such as medicine, economics and social sciences, where privacy concerns are paramount. We propose a general, model-agnostic framework for differentially private estimation of average treatment effects (ATE) that avoids strong structural assumptions on the data-generating process or the models used to estimate propensity scores and conditional outcomes. In contrast to prior work, which enforces differential privacy by directly privatizing these nuisance components and results in a privacy cost that scales with model complexity, our approach decouples nuisance estimation from privacy protection. This separation allows the use of flexible, state-of-the-art black-box models, while differential privacy is achieved by perturbing only predictions and aggregation steps within a fold-splitting scheme with ensemble techniques. We instantiate the framework for three classical estimators -- the G-formula, inverse propensity weighting (IPW), and augmented IPW (AIPW) -- and provide formal utility and privacy guarantees. Empirical results show that our methods maintain competitive performance under realistic privacy budgets. We further extend our framework to support meta-analysis of multiple private ATE estimates. Our results bridge a critical gap between causal inference and privacy-preserving data analysis.
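The abstract's core idea—train arbitrary black-box nuisance models on held-out folds, then privatize only the aggregated predictions—can be illustrated with a minimal output-perturbation sketch. The function name, the clipping constant, and the Laplace mechanism here are assumptions for illustration; the paper's exact fold-splitting and ensemble scheme may differ.

```python
import numpy as np

def dp_g_formula_ate(mu1_preds, mu0_preds, epsilon, clip=1.0, rng=None):
    """Hedged sketch of a differentially private G-formula ATE.

    mu1_preds / mu0_preds: cross-fitted outcome-model predictions under
    treatment / control, produced by nuisance models trained on other
    folds so no unit's prediction depends on its own row. Clipping
    bounds each unit's contribution to [-clip, clip]; replacing one
    record then changes the sample mean by at most 2*clip/n, so Laplace
    noise at scale (2*clip/n)/epsilon makes the released estimate
    epsilon-DP. This is a generic output-perturbation instance, not
    the paper's algorithm verbatim.
    """
    rng = np.random.default_rng() if rng is None else rng
    diffs = np.clip(np.asarray(mu1_preds) - np.asarray(mu0_preds), -clip, clip)
    n = len(diffs)
    sensitivity = 2.0 * clip / n          # one clipped contribution swapped
    noise = rng.laplace(scale=sensitivity / epsilon)
    return diffs.mean() + noise
```

Because only the final scalar is perturbed, the privacy cost does not grow with the complexity of the nuisance models—which is the decoupling the abstract emphasizes.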
Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects privately from observational data
Avoiding strong assumptions on data-generating processes
Decoupling nuisance estimation from privacy protection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-agnostic differentially private causal inference
Decouples nuisance estimation from privacy protection
Perturbs predictions and aggregation with fold-splitting
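The meta-analysis extension mentioned in the abstract can be sketched as standard inverse-variance pooling: since each site's ATE estimate is already differentially private, combining them is pure post-processing and spends no extra privacy budget. The interface below is illustrative; the paper's estimator may weight differently, e.g. to account for the variance added by the DP noise.

```python
import numpy as np

def combine_private_ates(estimates, variances):
    """Inverse-variance weighted pooling of per-site DP ATE estimates.

    estimates: already-private ATE estimates from each data source.
    variances: their (estimated) variances, ideally including the
    contribution of the privacy noise. Returns the pooled estimate
    and its variance. Post-processing of DP outputs preserves the
    original privacy guarantees.
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    est = np.asarray(estimates, dtype=float)
    pooled = np.sum(w * est) / np.sum(w)
    pooled_var = 1.0 / np.sum(w)
    return pooled, pooled_var
```

For two sites with equal variance, this reduces to the simple average of the two private estimates.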
Authors
C. Lebeda — Inria, Université de Montpellier, INSERM, France
Mathieu Even — Inria Montpellier (probabilities, statistics, optimization, machine learning)
A. Bellet — Inria, Université de Montpellier, INSERM, France
Julie Josse — Senior Researcher, Inria (missing values, low-rank matrices, causal inference, R)