AI Summary
This work addresses the problem of distribution regression, where the goal is to predict a scalar response from distribution-valued covariates, with the response depending on population-level rather than individual-level features. The authors propose DistBART, a method that models the regression functional as a linear functional whose Riesz representer is assigned a Bayesian Additive Regression Trees (BART) prior, marking the first integration of BART into the distribution regression framework. By establishing a theoretical connection to kernel methods and introducing a variant capable of learning nonlinear functionals, DistBART adaptively captures low-dimensional marginal structure in the input distributions. Leveraging random feature approximations, inference is recast as sparse Bayesian linear regression, balancing computational efficiency, uncertainty quantification, and posterior convergence, as demonstrated on both synthetic and real-world datasets.
Abstract
Distribution regression, where the goal is to predict a scalar response from a distribution-valued predictor, arises naturally in settings where observations are grouped and outcomes depend on group-level characteristics rather than on individual measurements. We introduce DistBART, a Bayesian nonparametric approach to distribution regression that models the regression function as a linear functional with the Riesz representer assigned a Bayesian additive regression trees (BART) prior. We argue that shallow decision tree ensembles encode reasonable inductive biases for tabular data, making them appropriate in settings where the functional depends primarily on low-dimensional marginals of the distributions. We show this both empirically on synthetic and real data and theoretically through an adaptive posterior concentration result. We also establish connections to kernel methods, and use this connection to motivate variants of DistBART that can learn nonlinear functionals. To enable scalability to large datasets, we develop a random-feature approximation that samples trees from the BART prior and reduces inference to sparse Bayesian linear regression, achieving computational efficiency while retaining uncertainty quantification.
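The random-feature idea described above can be sketched in a few lines: draw random shallow trees, featurize each distribution by the expected leaf occupancy of its samples (a linear functional of the distribution), and fit a Bayesian linear regression on those features. This is a minimal illustration, not the paper's implementation: the tree-sampling scheme below is a hypothetical stand-in for draws from the BART prior, and a dense conjugate Gaussian prior replaces the sparse prior used by DistBART.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_tree_features(dist_samples, n_trees=50, depth=2, d=1):
    """Featurize a list of distributions (each an (m, d) sample array)
    with random shallow trees -- a simplified stand-in for sampling trees
    from the BART prior. Each feature is a leaf-occupancy frequency,
    i.e. the expectation of a leaf indicator under the distribution."""
    # Random axis-aligned splits: one (axis, threshold) pair per level.
    axes = rng.integers(0, d, size=(n_trees, depth))
    thresh = rng.normal(size=(n_trees, depth))
    feats = []
    for X in dist_samples:
        f = []
        for t in range(n_trees):
            # Leaf index = binary code of the `depth` split outcomes.
            code = np.zeros(len(X), dtype=int)
            for lvl in range(depth):
                code = 2 * code + (X[:, axes[t, lvl]] > thresh[t, lvl])
            # Empirical leaf frequencies under this distribution.
            f.append(np.bincount(code, minlength=2 ** depth) / len(X))
        feats.append(np.concatenate(f))
    return np.asarray(feats)

def bayes_linreg(Phi, y, alpha=1.0, beta=10.0):
    """Conjugate Bayesian linear regression: N(0, alpha^-1 I) prior on
    weights, noise precision beta; returns posterior mean and covariance.
    (DistBART uses a sparse prior instead; this is a simplification.)"""
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    Sigma = np.linalg.inv(A)
    mu = beta * Sigma @ Phi.T @ y
    return mu, Sigma

# Toy problem: each observation is a group of draws from N(theta_i, 1);
# the response depends on the group-level parameter theta_i.
theta = rng.uniform(-2, 2, size=200)
dists = [rng.normal(th, 1.0, size=(100, 1)) for th in theta]
y = np.sin(theta) + 0.05 * rng.normal(size=len(theta))

Phi = random_tree_features(dists)
mu, Sigma = bayes_linreg(Phi[:150], y[:150])
pred = Phi[150:] @ mu  # posterior-mean prediction on held-out groups
```

Because each feature is an expectation of a fixed indicator, the resulting regression is indeed a linear functional of the input distribution, and inference reduces to linear regression in the random features, which is the source of the method's scalability.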