SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules

📅 2025-01-17
🤖 AI Summary
To address BART's limitations in capturing spatial dependence, varying degrees of smoothness, and complex decision boundaries, this paper proposes SBAMDT, a Bayesian additive decision tree model with soft splits. Methodologically, it introduces adaptive soft semi-multivariate split rules: instead of hard threshold-based partitions, splits are probabilistic, can use either a single feature or several features jointly, and adapt their softness from node to node, so the fit can be sharp in some regions and smooth in others while respecting the geometry of the feature domain. Full Bayesian posterior inference is performed via MCMC, enabling principled uncertainty quantification. Experiments on synthetic benchmarks and a New York City education dataset compare SBAMDT with BART and other tree-based models and demonstrate its utility, while retaining the additive structure of Bayesian tree ensembles.

📝 Abstract
Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive data splits at each decision node, using deterministic decision rules based on a single univariate feature. This approach limits their ability to effectively capture complex decision boundaries, particularly in scenarios involving multiple features, such as spatial domains, or when transitions are either sharp or smoothly varying. In this paper, we introduce a novel probabilistic additive decision tree model that employs a soft split rule. This method enables highly flexible splits that leverage both univariate and multivariate features, while also respecting the geometric properties of the feature domain. Notably, the probabilistic split rule adapts dynamically across decision nodes, allowing the model to account for varying levels of smoothness in the regression function. We demonstrate the utility of the proposed model through comparisons with existing tree-based models on synthetic datasets and a New York City education dataset.
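
The soft split idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the logistic gating function, the `bandwidth` parameter, the reference points `left_ref` and `right_ref`, and all function names are assumptions chosen only to show how a probabilistic rule differs from a hard threshold, and how a rule over several features (for example spatial coordinates) might be built from distances.

```python
import numpy as np


def _logistic(z):
    # Numerically stable logistic function: 1 / (1 + exp(-z)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def soft_univariate_split(x, threshold, bandwidth):
    """Probability of sending each observation to the left child.

    Assumed form: logistic in (threshold - x) / bandwidth. A small
    bandwidth approaches a hard split; a large one gives a smooth transition.
    """
    return _logistic((threshold - x) / bandwidth)


def soft_multivariate_split(s, left_ref, right_ref, bandwidth):
    """Soft split on several features at once (hypothetical rule).

    Observations closer to left_ref than to right_ref are more likely
    to go left; the boundary is the set of equidistant points.
    """
    d_left = np.linalg.norm(s - left_ref, axis=1)
    d_right = np.linalg.norm(s - right_ref, axis=1)
    return _logistic((d_right - d_left) / bandwidth)


def stump_prediction(p_left, mu_left, mu_right):
    # Prediction of a depth-one soft tree: a probability-weighted
    # average of the two leaf values.
    return p_left * mu_left + (1.0 - p_left) * mu_right


x = np.array([0.0, 1.0, 2.0])
p_smooth = soft_univariate_split(x, threshold=1.0, bandwidth=0.5)
p_sharp = soft_univariate_split(x, threshold=1.0, bandwidth=1e-6)
print(p_smooth, p_sharp)
```

Letting each node carry its own `bandwidth` is one way to read the abstract's statement that the split rule adapts across decision nodes: some nodes behave almost like hard splits while others blend their children smoothly.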
Problem

Research questions and friction points this paper is trying to address.

BART
complex data patterns
prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

SBAMDT
Adaptive Soft Splitting Rules
Feature Variability Handling
Stamatina Lamprinakou
Department of Statistics, Texas A&M University, College Station, USA
Huiyan Sang
Texas A&M University
B. Konomi
Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH, USA
Ligang Lu
Shell International Exploration and Production Inc, Houston, TX, USA