🤖 AI Summary
The PC algorithm suffers from high computational complexity, which limits its scalability to large-scale causal discovery tasks. To address this, the authors propose Tsetlin-Machine-Accelerated PC (TM-PC), a framework that uses the Tsetlin Machine (TM) to identify a compact subset of candidate variables and thereby shrink the search space for conditional independence tests. TM-PC relies on a binary, gradient-free feature importance mechanism, so it avoids parametric statistical assumptions and computationally expensive model fitting. Experiments on the bnlearn benchmark suite show average runtime reductions of 42%–68% relative to standard PC while preserving structure learning accuracy, with F1-score differences below 0.03. The work is presented as the first integration of the Tsetlin Machine into causal structure learning, aimed at scalable, interpretable Bayesian network construction.
📝 Abstract
The PC algorithm is a widely used method in causal inference for learning the structure of Bayesian networks. Despite its popularity, it suffers from significant time complexity, particularly as the size of the dataset increases, which limits its applicability to large-scale real-world problems. In this study, we propose a novel approach that utilises the Tsetlin Machine (TM) to construct Bayesian structures more efficiently. Our method extracts the most significant literals from the TM and performs conditional independence (CI) tests on these selected literals instead of the full set of variables, resulting in a considerable reduction in computational time. We implemented our approach and compared it with various state-of-the-art methods on categorical datasets from the bnlearn repository, such as Munin1 and Hepar2. The findings indicate that the proposed TM-based method reduces computational cost while maintaining competitive accuracy in causal discovery, making it a viable alternative to traditional PC algorithm implementations.
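To give a rough sense of why restricting CI tests to a selected subset helps, the sketch below counts a worst-case upper bound on the number of CI tests a PC-style skeleton search would run, first with every variable as a candidate neighbour and then with a small candidate set per variable. This is not the authors' implementation: the function name `worst_case_ci_tests`, the six-variable toy problem, and the hand-written candidate sets (standing in for whatever the TM literal selection would return) are all assumptions made for illustration.

```python
from math import comb


def worst_case_ci_tests(candidates, max_order):
    """Upper bound on CI tests for a PC-style search (hypothetical helper).

    candidates maps each variable to its set of candidate neighbours.
    For every ordered pair (x, y) with y in candidates[x], the search may
    condition on any subset of candidates[x] minus {y} with size 0..max_order.
    The bound assumes no edge is ever removed early.
    """
    total = 0
    for x, neighbours in candidates.items():
        if not neighbours:
            continue
        pool = len(neighbours) - 1  # conditioning pool once y is excluded
        per_pair = sum(comb(pool, k) for k in range(max_order + 1))
        total += len(neighbours) * per_pair
    return total


variables = list(range(6))

# Standard PC starting point: every other variable is a candidate neighbour.
full = {x: {y for y in variables if y != x} for x in variables}

# Assumed output of a selection step: two candidates per variable.
# These sets are made up for the example, not produced by a Tsetlin Machine.
restricted = {x: {(x - 1) % 6, (x + 1) % 6} for x in variables}

full_tests = worst_case_ci_tests(full, max_order=2)
restricted_tests = worst_case_ci_tests(restricted, max_order=2)
print(full_tests, restricted_tests)  # 330 24
```

The count grows combinatorially with the size of each candidate set, so even a modest reduction in candidates per variable removes most of the conditioning sets. How much is saved in practice depends on how many edges the CI tests prune early and on the cost of the selection step itself, neither of which this sketch models.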