Statistical Advantages of Oblique Randomized Decision Trees and Forests

📅 2024-07-02
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses regression for ridge functions (multi-index models), a setting in which conventional axis-aligned random trees are statistically suboptimal despite the low-dimensional structure. The authors propose oblique Mondrian trees and forests, which split along linear combinations of features. Theoretically, they establish the first generalization error bound for oblique Mondrian forests, achieving minimax-optimal convergence rates with respect to the intrinsic dimension of the relevant subspace; they also prove the inherent suboptimality of axis-aligned Mondrian estimators on this model class via a corresponding risk lower bound, and quantify the robustness of oblique methods to estimation error in the underlying feature directions. Both theory and experiments show that oblique splitting substantially reduces estimation risk, particularly when covariates are high-dimensional but the response depends only on a low-dimensional linear projection, yielding fundamental advantages in such structured settings.
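To make the model class concrete, the following is a minimal sketch of a ridge function (multi-index) regression setup of the kind the summary describes: the covariates are high-dimensional, but the response depends only on their projection onto a low-dimensional subspace. The dimensions, the link function `g`, and the noise level are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Multi-index (ridge function) model: the response depends on the
# ambient d-dimensional covariates only through an r-dimensional
# linear projection. Here d = 10, r = 2 (illustrative values).
d, r, n = 10, 2, 500
A = np.linalg.qr(rng.standard_normal((d, r)))[0]  # relevant directions (d x r, orthonormal)
X = rng.standard_normal((n, d))                   # covariates
g = lambda z: np.sin(z[:, 0]) + z[:, 1] ** 2      # link applied to the projection
y = g(X @ A) + 0.1 * rng.standard_normal(n)       # y = g(A^T x) + noise
```

An estimator that splits along the columns of `A` (or good estimates of them) faces an effectively 2-dimensional problem, which is the source of the improved rates discussed above.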

📝 Abstract
This work studies the statistical advantages of using features composed of general linear combinations of covariates to partition the data in randomized decision tree and forest regression algorithms. Using random tessellation theory in stochastic geometry, we provide a theoretical analysis of a class of efficiently generated random tree and forest estimators that allow for oblique splits along such features. We call these estimators oblique Mondrian trees and forests, as the trees are generated by first selecting a set of features from linear combinations of the covariates and then running a Mondrian process that hierarchically partitions the data along these features. Generalization error bounds and convergence rates are obtained for the flexible dimension reduction model class of ridge functions (also known as multi-index models), where the output is assumed to depend on a low-dimensional relevant feature subspace of the input domain. The results highlight how the risk of these estimators depends on the choice of features and quantify how robust the risk is with respect to error in the estimation of relevant features. The asymptotic analysis also provides conditions on the selected features along which the data is split for these estimators to obtain minimax-optimal rates of convergence with respect to the dimension of the relevant feature subspace. Additionally, a lower bound on the risk of axis-aligned Mondrian trees (where features are restricted to the set of covariates) is obtained, proving that these estimators are suboptimal for these linear dimension reduction models in general, no matter how the distribution over the covariates used to divide the data at each tree node is weighted.
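The construction described in the abstract can be sketched in two steps: sample a Mondrian partition (recursive axis-aligned cuts with exponential cut times), and run it in the space of selected features rather than the raw covariates, so each axis-aligned cut in feature space is an oblique cut in the original space. The code below is a simplified illustration of that idea, not the paper's implementation; the feature matrix `U`, lifetime, and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def mondrian_partition(lower, upper, lifetime, t=0.0):
    """Sample a Mondrian partition of the box [lower, upper]; return leaf cells.

    Minimal sketch of the Mondrian process: the next cut time is exponential
    with rate equal to the box's linear dimension (sum of side lengths), the
    cut axis is chosen proportionally to side length, and the cut location is
    uniform along that side. Cutting stops once the lifetime budget is spent.
    """
    sides = upper - lower
    rate = sides.sum()
    t_cut = t + rng.exponential(1.0 / rate) if rate > 0 else np.inf
    if t_cut > lifetime:                      # no further cut within the budget
        return [(lower.copy(), upper.copy())]
    axis = rng.choice(len(sides), p=sides / rate)
    loc = rng.uniform(lower[axis], upper[axis])
    left_up, right_lo = upper.copy(), lower.copy()
    left_up[axis], right_lo[axis] = loc, loc
    return (mondrian_partition(lower, left_up, lifetime, t_cut)
            + mondrian_partition(right_lo, upper, lifetime, t_cut))

# Oblique variant: first map the covariates through selected features U
# (linear combinations of the coordinates), then partition in feature space.
d, k = 5, 2
U = np.linalg.qr(rng.standard_normal((d, k)))[0]  # illustrative feature directions
X = rng.standard_normal((200, d))
Z = X @ U                                         # data in feature space
lo, hi = Z.min(axis=0), Z.max(axis=0)
cells = mondrian_partition(lo, hi, lifetime=1.0)  # leaves of one oblique Mondrian tree
```

A tree estimator would then predict on each leaf cell by averaging the responses of the training points it contains; a forest averages several independent draws of the partition.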
Problem

Research questions and friction points this paper is trying to address.

Analyzing statistical performance of oblique decision trees using linear covariate combinations
Establishing generalization bounds for multi-index models with dimension reduction
Proving axis-aligned trees are suboptimal for general ridge functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Oblique Mondrian trees split along linear combinations of covariates
Random tessellation theory from stochastic geometry yields generalization error bounds
Consistent feature estimation enables minimax-optimal convergence rates
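The advantage of splitting along the relevant direction can be seen even with a single split. The toy below (our own illustration, not the paper's construction) uses a ridge target that depends on the diagonal direction u = (1, 1)/√2: one oblique split along u fits it exactly, while the best single axis-aligned split leaves substantial error.

```python
import numpy as np

rng = np.random.default_rng(2)

# Ridge target depending only on the direction u: y = 1{u^T x > 0}.
n = 4000
X = rng.uniform(-1, 1, size=(n, 2))
u = np.array([1.0, 1.0]) / np.sqrt(2)
y = (X @ u > 0).astype(float)

def best_stump_mse(feature, y):
    """Mean squared error of the best single threshold split on `feature`.

    Uses the identity sum((y - mean)^2) = sum(y^2) - (sum(y))^2 / n
    on each side of the candidate threshold.
    """
    ys = y[np.argsort(feature)]
    m = len(ys)
    csum = np.cumsum(ys)
    tot, sq = ys.sum(), float((ys ** 2).sum())
    best = sq - tot ** 2 / m            # no split: fit the global mean
    for i in range(1, m):
        sse = sq - csum[i - 1] ** 2 / i - (tot - csum[i - 1]) ** 2 / (m - i)
        best = min(best, sse)
    return best / m

axis_err = min(best_stump_mse(X[:, 0], y), best_stump_mse(X[:, 1], y))
oblique_err = best_stump_mse(X @ u, y)  # split along the relevant direction
```

Here `oblique_err` is essentially zero (the threshold at 0 separates the two response levels exactly), while `axis_err` stays bounded away from zero no matter which coordinate or threshold is used, mirroring the paper's lower bound for axis-aligned splits on ridge functions.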