Distributional Treatment Effect Estimation across Heterogeneous Sites via Optimal Transport

📅 2025-11-12
🤖 AI Summary
This paper addresses causal inference across heterogeneous sites when the target site has control-group data but no treatment-group observations. We propose a distributional causal inference framework grounded in optimal transport, modeling inter-site heterogeneity as a push-forward map between probability distributions. The map is learned by aligning the control-group distributions of the source and target sites and is then applied to the source treatment group, which synthesizes the counterfactual treatment-group distribution at the target site. We evaluate the method in simulation studies across multiple data-generating scenarios and on real-world patient-derived xenograft data, where it recovers the distributional properties of treatment effects at the target site. Under general regularity conditions, the synthetic treatment-group data are consistent and converge asymptotically to the true target distribution. The work extends synthetic-control-style methods to distribution-level causal inference in heterogeneous multi-site settings.

📝 Abstract
We propose a novel framework for synthesizing counterfactual treatment group data in a target site by integrating full treatment and control group data from a source site with control group data from the target. Departing from conventional average treatment effect estimation, our approach adopts a distributional causal inference perspective by modeling treatment and control as distinct probability measures on the source and target sites. We formalize the cross-site heterogeneity (effect modification) as a push-forward transformation that maps the joint feature-outcome distribution from the source to the target site. This transformation is learned by aligning the control group distributions between sites using an Optimal Transport-based procedure, and subsequently applied to the source treatment group to generate the synthetic target treatment distribution. Under general regularity conditions, we establish theoretical guarantees for the consistency and asymptotic convergence of the synthetic treatment group data to the true target distribution. Simulation studies across multiple data-generating scenarios and a real-world application to patient-derived xenograft data demonstrate that our framework robustly recovers the full distributional properties of treatment effects.
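The construction described in the abstract can be written compactly. The notation below is an assumption made here for illustration and is not taken from the paper: $P_S^{0}$ and $P_S^{1}$ denote the source control and treatment distributions over joint feature-outcome pairs, $P_T^{0}$ the target control distribution, $T$ the push-forward map, and $c$ an unspecified transport cost. The Monge-type formulation is only one possible reading of "Optimal Transport-based procedure".

```latex
% Step 1 (alignment): learn T from the two control groups.
T \in \arg\min_{T' \,:\, T'_{\#} P_S^{0} = P_T^{0}}
      \int c\bigl(z, T'(z)\bigr)\, \mathrm{d}P_S^{0}(z)

% Step 2 (synthesis): push the source treatment group through T.
\widehat{P}_T^{1} = T_{\#} P_S^{1}

% The distributional treatment effect at the target site is then
% read off by comparing \widehat{P}_T^{1} with the observed P_T^{0}.
```

Here $T_{\#} P$ is the distribution of $T(Z)$ when $Z \sim P$, so step 1 requires the transported source controls to match the target controls, and step 2 reuses the same map on the source treatment group.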
Problem

Research questions and friction points this paper is trying to address.

Estimating distributional treatment effects across heterogeneous sites
Synthesizing target site treatment data using source site information
Addressing cross-site effect modification via optimal transport alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Transport aligns control group distributions
Push-forward transformation maps source to target distributions
Synthetic treatment data recovers distributional treatment effects
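
The three points above can be illustrated with a minimal sketch. It is not the paper's estimator: it assumes both control groups are roughly Gaussian and uses the closed-form optimal transport map between Gaussians (quadratic cost) as a stand-in for the learned map. The function names `fit_gaussian_ot_map` and `_psd_sqrt`, and the simulated data, are hypothetical and chosen only for this example.

```python
import numpy as np


def _psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negatives
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def fit_gaussian_ot_map(source_control, target_control):
    """Fit an affine map aligning source controls to target controls.

    Assumption: both samples are treated as Gaussian, so the OT map
    under quadratic cost is affine. Rows are units, columns are the
    joint (feature, outcome) coordinates.
    """
    src = np.asarray(source_control, dtype=float)
    tgt = np.asarray(target_control, dtype=float)
    mean_src, mean_tgt = src.mean(axis=0), tgt.mean(axis=0)
    cov_src = np.atleast_2d(np.cov(src, rowvar=False))
    cov_tgt = np.atleast_2d(np.cov(tgt, rowvar=False))

    root_src = _psd_sqrt(cov_src)
    inv_root_src = np.linalg.inv(root_src)  # needs non-singular cov_src
    middle = _psd_sqrt(root_src @ cov_tgt @ root_src)
    linear = inv_root_src @ middle @ inv_root_src

    def transport(points):
        points = np.asarray(points, dtype=float)
        return mean_tgt + (points - mean_src) @ linear.T

    return transport


# Simulated data (hypothetical): column 0 is a feature, column 1 the outcome.
rng = np.random.default_rng(0)
source_control = rng.normal(size=(500, 2))
source_treated = rng.normal(size=(500, 2)) + np.array([0.0, 1.0])
target_control = 2.0 * rng.normal(size=(400, 2)) + np.array([1.0, -1.0])

# Step 1: align the control groups. Step 2: push the source treated group.
transport = fit_gaussian_ot_map(source_control, target_control)
synthetic_target_treated = transport(source_treated)
```

Comparing `synthetic_target_treated[:, 1]` with `target_control[:, 1]`, for example through their empirical quantiles, gives a distribution-level view of the treatment effect at the target site. For non-Gaussian data a general-purpose OT solver would replace the affine map.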
Borna Bateni
Department of Statistics & Data Science, University of California, Los Angeles, Los Angeles, CA, USA
Yubai Yuan
The Penn State University
Network analysis, active learning, crowdsourcing, latent modeling, causal inference
Qi Xu
Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
Annie Qu
University of California Santa Barbara
Data integration, Precision Medicine, LLM, Mobile Health