GS-BART: Bayesian Additive Regression Trees with Graph-split Decision Rules

📅 2025-09-08
🤖 AI Summary
Conventional BART relies on axis-parallel splits, which ignore the topology of graph-structured data. This paper proposes GS-BART, which encodes input feature information into candidate graph sets and uses a split rule that respects the graph topology when constructing decision trees. The model is a generalized nonparametric regression, sampled with a scalable informed MCMC algorithm that relies on a gradient-based recursive computation over root-directed spanning trees or chains. In the reported experiments, GS-BART outperforms conventional ensemble tree models and Gaussian process regression on regression and classification tasks for spatial and network data.

📝 Abstract
Ensemble decision tree methods such as XGBoost, Random Forest, and Bayesian Additive Regression Trees (BART) have gained enormous popularity in data science for their superior performance in machine learning regression and classification tasks. In this paper, we introduce a new Bayesian graph-split additive decision tree method, GS-BART, designed to enhance the performance of axis-parallel split-based BART for dependent data with graph structures. The proposed approach encodes input feature information into candidate graph sets and employs a flexible split rule that respects the graph topology when constructing decision trees. We consider a generalized nonparametric regression model using GS-BART and design a scalable informed MCMC algorithm to sample the decision trees of GS-BART. The algorithm leverages a gradient-based recursive algorithm on root directed spanning trees or chains. The superior performance of the method over conventional ensemble tree models and Gaussian process regression models is illustrated in various regression and classification tasks for spatial and network data analysis.
Problem

Research questions and friction points this paper is trying to address.

Enhancing BART for dependent data with graph structures
Developing flexible split rules respecting graph topology
Improving performance on spatial and network data analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-split decision rules for tree construction
Scalable informed MCMC sampling algorithm
Gradient-based recursive algorithm on spanning trees
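
As a rough illustration of what a split that respects graph topology can look like, the sketch below cuts one edge of a rooted spanning tree so that each side of the split stays connected, and uses a leaves-to-root recursion to score every candidate cut in a single pass. This is not the GS-BART algorithm from the paper: the function names, the BFS spanning tree, and the sum-of-squares score are all assumptions made for illustration, and the sketch assumes a connected graph with at least two nodes.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return (parent, order) for a BFS spanning tree of a connected graph.

    `order` lists nodes so that every parent appears before its children.
    """
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)
                queue.append(v)
    return parent, order


def subtree_stats(parent, order, y):
    """Accumulate the count and sum of y over each node's subtree.

    Visiting nodes in reverse BFS order handles children before parents,
    so each subtree total is pushed up to the parent exactly once.
    """
    count = {u: 1 for u in order}
    total = {u: y[u] for u in order}
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            count[p] += count[u]
            total[p] += total[u]
    return count, total


def best_edge_cut(adj, root, y):
    """Pick the spanning-tree edge whose removal best separates y.

    Removing the edge (parent[u], u) splits the nodes into the subtree of u
    and the rest. Maximising s1^2/n1 + s0^2/n0 is equivalent to minimising
    the within-group sum of squared errors (hypothetical scoring choice).
    """
    parent, order = bfs_spanning_tree(adj, root)
    count, total = subtree_stats(parent, order, y)
    n, s = count[root], total[root]
    best_score, cut = None, None
    for u in order[1:]:  # every non-root node defines one candidate edge
        n1, s1 = count[u], total[u]
        n0, s0 = n - n1, s - s1
        score = s1 * s1 / n1 + s0 * s0 / n0
        if best_score is None or score > best_score:
            best_score, cut = score, u
    # Collect the subtree below the cut edge (parents precede children).
    side = set()
    for u in order:
        if u == cut or parent[u] in side:
            side.add(u)
    return cut, side
```

On a six-node path graph whose responses jump from 0 to 5 halfway along, calling `best_edge_cut` with the first node as root selects the edge at the jump and returns the last three nodes as one side of the split. Because the subtree totals are accumulated once from the leaves upward, all candidate cuts are scored in time linear in the number of nodes.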
Shuren He
Department of Statistics, Texas A&M University, College Station
Huiyan Sang
Texas A&M University
Quan Zhou
Department of Statistics, Texas A&M University, College Station