🤖 AI Summary
Existing causal discovery methods for observational data rely on strong functional assumptions, such as linearity or additive noise, to ensure structural identifiability. These assumptions often fail to hold in practice, undermining both theoretical guarantees and empirical performance.
Method: We propose a multivariate causal discovery framework that requires no interventional data and relaxes restrictive functional assumptions. Building on prior work in the bivariate case, it extends Bayesian model selection to multivariate causal graph learning: an adjacency matrix is constructed from the hyperparameters of a Causal Gaussian Process Conditional Density Estimator (CGP-CDE), a Bayesian nonparametric model, and is jointly optimized against the marginal likelihood and a differentiable acyclicity regularizer. A continuous relaxation makes the otherwise large discrete graph-selection problem scalable, and the output is the maximum a posteriori causal graph.
Results: The method is competitive with state-of-the-art baselines on both synthetic and real-world benchmarks, trading restrictive modeling assumptions for flexible ones at the cost of a small probability of error. This removes the traditional dependence of identifiability on either strong functional assumptions or interventional data.
📝 Abstract
Current causal discovery approaches require restrictive model assumptions or assume access to interventional data to ensure structure identifiability. These assumptions often do not hold in real-world applications, leading to a loss of guarantees and poor accuracy in practice. Recent work has shown that, in the bivariate case, Bayesian model selection can greatly improve accuracy by exchanging restrictive modelling for more flexible assumptions, at the cost of a small probability of error. We extend the Bayesian model selection approach to the important multivariate setting by making the large discrete selection problem scalable through a continuous relaxation. We demonstrate how, for our choice of Bayesian non-parametric model, the Causal Gaussian Process Conditional Density Estimator (CGP-CDE), an adjacency matrix can be constructed from the model hyperparameters. This adjacency matrix is then optimised using the marginal likelihood and an acyclicity regulariser, outputting the maximum a posteriori causal graph. We demonstrate the competitiveness of our approach on both synthetic and real-world datasets, showing it is possible to perform multivariate causal discovery without infeasible assumptions using Bayesian model selection.
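The differentiable acyclicity regulariser that makes the continuous relaxation possible can be illustrated with a minimal sketch. The penalty below is the standard trace-of-matrix-exponential form, h(A) = tr(exp(A ∘ A)) − d, which is zero exactly when the weighted adjacency matrix A encodes a directed acyclic graph; the paper's exact regulariser and the CGP-CDE hyperparameter-to-adjacency construction are not reproduced here, so treat this as an illustrative assumption rather than the authors' implementation.

```python
import numpy as np
from scipy.linalg import expm

def acyclicity_penalty(A: np.ndarray) -> float:
    """Differentiable acyclicity penalty h(A) = tr(exp(A * A)) - d.

    A is a weighted adjacency matrix (A[i, j] != 0 means an edge i -> j).
    h(A) = 0 iff the graph has no directed cycles; it grows smoothly with
    cycle weight, so it can be used as a regulariser in gradient-based
    optimisation of a relaxed adjacency matrix.
    """
    d = A.shape[0]
    # A * A is the elementwise square, keeping the penalty non-negative
    # and differentiable in the entries of A.
    return float(np.trace(expm(A * A)) - d)

# A DAG (single edge 0 -> 1): penalty is exactly zero.
dag = np.array([[0.0, 1.0],
                [0.0, 0.0]])

# A 2-cycle (0 -> 1 and 1 -> 0): penalty is strictly positive.
cyclic = np.array([[0.0, 1.0],
                   [1.0, 0.0]])

print(acyclicity_penalty(dag))     # ~0.0
print(acyclicity_penalty(cyclic))  # > 0
```

In a full pipeline, this penalty would be added (with a weight or augmented-Lagrangian schedule) to the negative marginal likelihood so that gradient descent over the relaxed adjacency matrix converges toward an acyclic graph.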