🤖 AI Summary
Existing causal effect estimation methods typically require interventional data, a ground-truth causal graph, or strong assumptions such as no unmeasured confounding, making them inapplicable to real-world settings that lack prior causal structure or intervention records. This work is the first to bring Prior-data Fitted Networks (PFNs) and in-context learning to causal inference, proposing an end-to-end, assumption-light framework: PFNs are pre-trained on synthetic causal data and then predict interventional outcomes directly from observational data via in-context learning alone. The approach removes the need for a specified causal graph, interventional data, or the no-unmeasured-confounding assumption. Its generalizability is validated through intervention modeling across multiple causal structures and ablation-based robustness analysis. Experiments demonstrate accurate causal effect estimation across diverse synthetic causal tasks, with consistent robustness to varying causal structures, sample sizes, and confounding strengths.
📝 Abstract
Estimation of causal effects is critical to a range of scientific disciplines. Existing methods for this task either require interventional data, knowledge about the ground truth causal graph, or rely on assumptions such as unconfoundedness, restricting their applicability in real-world settings. In the domain of tabular machine learning, Prior-data Fitted Networks (PFNs) have achieved state-of-the-art predictive performance, having been pre-trained on synthetic data to solve tabular prediction problems via in-context learning. To assess whether this can be transferred to the harder problem of causal effect estimation, we pre-train PFNs on synthetic data drawn from a wide variety of causal structures, including interventions, to predict interventional outcomes given observational data; we call the resulting model Do-PFN. Through extensive experiments on synthetic case studies, we show that our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph. We also perform ablation studies that elucidate Do-PFN's scalability and robustness across datasets with a variety of causal characteristics.
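To make the pre-training setup concrete, the sketch below shows one way such synthetic training data could be generated: sample a small structural causal model with a confounder, record observational rows, and separately record outcomes under a do-intervention on the treatment. This is an illustrative toy in the spirit of the described prior, not the paper's actual data-generating code; all function and variable names here are assumptions.

```python
import random

def sample_scm_dataset(n=200, seed=0):
    """Toy synthetic-prior generator (illustrative, not Do-PFN's actual code).

    Samples a linear SCM with graph  C -> T,  C -> Y,  T -> Y,
    where C is a confounder. Returns:
      - obs:   observational rows (C, T, Y)
      - y_do1: outcomes under do(T=1), with the C -> T edge severed
      - y_do0: outcomes under do(T=0)
      - ate:   the ground-truth average treatment effect (here, b_ty)
    A model pre-trained on many such draws sees observational data as
    context and interventional outcomes as prediction targets.
    """
    rng = random.Random(seed)
    # Mechanism parameters drawn at random: one sample from the "prior" over SCMs.
    b_ct = rng.uniform(-1, 1)   # C -> T strength (creates confounding)
    b_cy = rng.uniform(-1, 1)   # C -> Y strength (backdoor path)
    b_ty = rng.uniform(-1, 1)   # T -> Y strength (true causal effect)

    obs, y_do1, y_do0 = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        t = b_ct * c + rng.gauss(0, 0.1)
        y = b_ty * t + b_cy * c + rng.gauss(0, 0.1)
        obs.append((c, t, y))
        # Intervention: set T by fiat, cutting the C -> T mechanism.
        y_do1.append(b_ty * 1.0 + b_cy * c + rng.gauss(0, 0.1))
        y_do0.append(b_ty * 0.0 + b_cy * c + rng.gauss(0, 0.1))
    return obs, y_do1, y_do0, b_ty
```

Because C enters Y symmetrically under both interventions, the empirical mean of `y_do1` minus the mean of `y_do0` converges to `b_ty`, which is what an in-context learner trained on many such draws would be asked to recover from the observational rows alone.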