AI Summary
Bayesian causal inference requires posterior marginalization over entire causal models, i.e., both the causal structure (graph) and the mechanisms, which is computationally very demanding. This work decomposes structure marginalization hierarchically, separating uncertainty over causal orders from uncertainty over graphs: first, a novel autoregressive distribution over causal orders (ARCO) models the uncertainty over topological orderings; second, given a fixed order, the DAG structure is marginalized in closed form by limiting the number of parents per variable and modeling mechanisms with Gaussian processes, which avoids a joint combinatorial search. The result is a two-level Bayesian model averaging scheme: a sampling-based approximation over orders, and exact marginalization over bounded-parent DAGs given an order. On simulated nonlinear additive-noise benchmarks, the method outperforms state-of-the-art causal structure learning methods; on real-world data, it yields competitive results. It also accurately infers interventional distributions and average causal effects. The work targets the scalability bottleneck of posterior marginalization over causal structures.
Abstract
The traditional two-stage approach to causal inference first identifies a single causal model (or equivalence class of models), which is then used to answer causal queries. However, this neglects any epistemic model uncertainty. In contrast, Bayesian causal inference does incorporate epistemic uncertainty into query estimates via Bayesian marginalisation (posterior averaging) over all causal models. While principled, this marginalisation over entire causal models, i.e., both causal structures (graphs) and mechanisms, poses a tremendous computational challenge. In this work, we address this challenge by decomposing structure marginalisation into the marginalisation over (i) causal orders and (ii) directed acyclic graphs (DAGs) given an order. We can marginalise the latter in closed form by limiting the number of parents per variable and utilising Gaussian processes to model mechanisms. To marginalise over orders, we use a sampling-based approximation, for which we devise a novel auto-regressive distribution over causal orders (ARCO). Our method outperforms state-of-the-art in structure learning on simulated non-linear additive noise benchmarks, and yields competitive results on real-world data. Furthermore, we can accurately infer interventional distributions and average causal effects.
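To illustrate why fixing a causal order makes the marginalisation over DAGs tractable, here is a minimal sketch. It assumes the unnormalised posterior weight of a graph factorises into per-node terms, in which case the sum over all order-consistent DAGs with bounded parent sets equals a product of per-node sums. The function names and `toy_local_score` are hypothetical placeholders; in particular, the toy score stands in for a per-node term such as a Gaussian process marginal likelihood times a prior, and is not the paper's actual model.

```python
import itertools
import math


def parent_sets(predecessors, max_parents):
    """Yield every subset of `predecessors` with at most `max_parents` elements."""
    for k in range(min(max_parents, len(predecessors)) + 1):
        yield from itertools.combinations(predecessors, k)


def toy_local_score(node, parents):
    """Hypothetical positive per-node weight (placeholder for likelihood x prior)."""
    return 1.0 / (1.0 + node + sum(parents) + len(parents))


def order_marginal_factorised(order, max_parents, local_score):
    """Sum over order-consistent DAGs, computed as a product of per-node sums."""
    total = 1.0
    for pos, node in enumerate(order):
        preds = order[:pos]  # only earlier variables may be parents
        total *= sum(local_score(node, ps) for ps in parent_sets(preds, max_parents))
    return total


def order_marginal_bruteforce(order, max_parents, local_score):
    """Same quantity by explicitly enumerating every order-consistent DAG."""
    per_node = [list(parent_sets(order[:pos], max_parents)) for pos in range(len(order))]
    total = 0.0
    for choice in itertools.product(*per_node):  # one parent set per node = one DAG
        total += math.prod(local_score(n, ps) for n, ps in zip(order, choice))
    return total


order = (2, 0, 3, 1)  # an example causal order over four variables
fast = order_marginal_factorised(order, max_parents=2, local_score=toy_local_score)
slow = order_marginal_bruteforce(order, max_parents=2, local_score=toy_local_score)
print(math.isclose(fast, slow))
```

The factorised version touches each candidate parent set once per node, so its cost grows polynomially in the number of variables for a fixed parent bound, whereas the brute-force version enumerates every graph. Averaging such order-conditional quantities over orders sampled from a distribution like ARCO would give the outer level of the marginalisation.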