🤖 AI Summary
Causal discovery infers causal structures among variables from observational data but faces challenges in scalability and modeling global dependencies. This paper proposes the first probabilistic causal graph learning framework based on graph neural networks (GNNs), formulating causal discovery as an end-to-end graph-structure prediction task. It jointly encodes node and edge attributes via GNNs and employs information-theoretic supervision—including mutual information and conditional entropy—to directly learn the probability distribution over the space of causal graphs. Crucially, our method is the first to explicitly model the global causal graph distribution using GNNs, enabling zero-shot cross-dataset generalization. Evaluated on multiple synthetic and real-world benchmarks, it significantly outperforms both classical and state-of-the-art non-GNN methods in accuracy and computational efficiency. This advances the practical deployment of large-scale causal structure learning.
📝 Abstract
Causal discovery from observational data is challenging, especially with large datasets and complex relationships. Traditional methods often struggle with scalability and capturing global structural information. To overcome these limitations, we introduce a novel graph neural network (GNN)-based probabilistic framework that learns a probability distribution over the entire space of causal graphs, unlike methods that output a single deterministic graph. Our framework leverages a GNN that encodes both node and edge attributes into a unified graph representation, enabling the model to learn complex causal structures directly from data. The GNN model is trained on a diverse set of synthetic datasets augmented with statistical and information-theoretic measures, such as mutual information and conditional entropy, capturing both local and global data properties. We frame causal discovery as a supervised learning problem, directly predicting the entire graph structure. Our approach demonstrates superior performance, outperforming both traditional and recent non-GNN-based methods, as well as a GNN-based approach, in terms of accuracy and scalability on synthetic and real-world datasets without further training. This probabilistic framework significantly improves causal structure learning, with broad implications for decision-making and scientific discovery across various fields.