Scalable Bayesian Network Structure Learning Using Tsetlin Machine to Constrain the Search Space

📅 2025-11-24

🤖 AI Summary

The PC algorithm has high computational complexity, which limits its scalability to large causal discovery tasks. To address this, we propose Tsetlin-Machine-Accelerated PC (TM-PC), a framework that uses the Tsetlin Machine (TM) to identify a compact subset of candidate variables, shrinking the search space for conditional independence tests. TM-PC relies on a binary, gradient-free feature importance mechanism, which avoids parametric statistical assumptions and computationally expensive model fitting. Experiments on the bnlearn benchmark suite show average runtime reductions of 42%–68% relative to standard PC while preserving structure learning accuracy, with F1-score differences below 0.03. This work constitutes the first integration of the Tsetlin Machine into causal structure learning and offers a scalable, interpretable route to Bayesian network construction.

📝 Abstract

The PC algorithm is a widely used method in causal inference for learning the structure of Bayesian networks. Despite its popularity, the PC algorithm has high time complexity, particularly as the size of the dataset increases, which limits its applicability to large-scale real-world problems. In this study, we propose a novel approach that utilises the Tsetlin Machine (TM) to construct Bayesian structures more efficiently. Our method takes the most significant literals extracted from the TM and performs conditional independence (CI) tests on these selected literals instead of the full set of variables, which considerably reduces computation time. We implemented our approach and compared it with various state-of-the-art methods. Our evaluation includes categorical datasets from the bnlearn repository, such as Munin1 and Hepar2. The findings indicate that the proposed TM-based method reduces computational complexity while maintaining competitive accuracy in causal discovery, making it a viable alternative to traditional PC algorithm implementations.
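To give a rough sense of why testing on a selected subset helps, the sketch below counts a loose worst-case number of CI tests when conditioning sets are drawn from a pool of a given size. It is an illustration only, not the paper's complexity analysis: the function names, the 100-variable network, the pool size of 10, and the maximum conditioning order of 2 are all assumptions, and a real PC run prunes adjacency sets as it goes, so it usually performs far fewer tests than this bound.

```python
from math import comb


def conditioning_sets(pool_size: int, max_order: int) -> int:
    """Number of subsets of size 0..max_order drawn from a pool of variables."""
    return sum(comb(pool_size, k) for k in range(max_order + 1))


def worst_case_tests(n_vars: int, pool_size: int, max_order: int) -> int:
    """Loose upper bound: every variable pair is tested against every conditioning set."""
    return comb(n_vars, 2) * conditioning_sets(pool_size, max_order)


# Hypothetical network with 100 variables and conditioning sets of size <= 2.
full = worst_case_tests(n_vars=100, pool_size=98, max_order=2)        # all other variables
restricted = worst_case_tests(n_vars=100, pool_size=10, max_order=2)  # 10 selected candidates

print(full, restricted, full / restricted)
```

Under these assumed numbers the bound shrinks by roughly two orders of magnitude, and the gap widens as the maximum conditioning order grows, because the subset count is combinatorial in the pool size.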

Problem

Research questions and friction points this paper is trying to address.

Reducing high time complexity in Bayesian network structure learning

Improving scalability of PC algorithm for large datasets

Maintaining causal discovery accuracy while enhancing computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Tsetlin Machine to constrain Bayesian network search

Performs conditional independence tests on selected literals

Reduces computational complexity while maintaining competitive accuracy
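The idea of constraining the search and then testing only the selected candidates can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the Tsetlin Machine literal ranking is replaced here by a plain marginal mutual information ranking, the statistical CI test is replaced by a threshold on empirical conditional mutual information, and every function name, the toy chain A → B → C with an independent variable D, and the threshold value are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def cond_mutual_info(data, x, y, z=()):
    """Empirical conditional mutual information I(X; Y | Z) in nats (rows are dicts)."""
    n = len(data)
    c_xyz = Counter((r[x], r[y], tuple(r[v] for v in z)) for r in data)
    c_xz = Counter((r[x], tuple(r[v] for v in z)) for r in data)
    c_yz = Counter((r[y], tuple(r[v] for v in z)) for r in data)
    c_z = Counter(tuple(r[v] for v in z) for r in data)
    return sum(
        (n_xyz / n) * log(n_xyz * c_z[zv] / (c_xz[(xv, zv)] * c_yz[(yv, zv)]))
        for (xv, yv, zv), n_xyz in c_xyz.items()
    )


def select_candidates(data, target, variables, k):
    """Stand-in for TM literal ranking: keep the k variables most informative about target."""
    others = [v for v in variables if v != target]
    others.sort(key=lambda v: cond_mutual_info(data, target, v), reverse=True)
    return others[:k]


def is_separable(data, x, y, pool, max_order, eps):
    """True if some conditioning set from pool (size <= max_order) drives the CMI below eps."""
    pool = [v for v in pool if v not in (x, y)]
    for order in range(max_order + 1):
        for z in combinations(pool, order):
            if cond_mutual_info(data, x, y, z) < eps:
                return True
    return False


# Toy data for the chain A -> B -> C, with D independent of everything (assumed example).
data = []
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            weight = (3 if b == a else 1) * (3 if c == b else 1)
            for d in (0, 1):
                data.extend({"A": a, "B": b, "C": c, "D": d} for _ in range(weight))

candidates = select_candidates(data, "A", ["A", "B", "C", "D"], k=2)
a_c_removed = is_separable(data, "A", "C", candidates, max_order=1, eps=1e-6)
a_b_removed = is_separable(data, "A", "B", candidates, max_order=1, eps=1e-6)
print(candidates, a_c_removed, a_b_removed)
```

In this toy setup the irrelevant variable D never enters a conditioning set for A, which is the effect the constrained search is meant to have. A real pipeline would swap in the trained TM's literal weights for the ranking step and a proper test, such as a chi-square or G-test, for the threshold.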
