Very fast Bayesian Additive Regression Trees on GPU

Petrillo, Giacomo

arXiv.org Machine Learning 

BART Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression method, introduced by Chipman, George, and McCulloch (2006, 2010). It defines a prior distribution over the space of functions by representing them as a sum of binary decision trees, and then specifying a stochastic tree generation process. The posterior is then obtained with Metropolis-Gibbs sampling over the trees. See Hill, Linero, and Murray (2020) for a review, and Daniels, Linero, and Roy (2023, ch. 5) for a textbook treatment. BART's success BART has proven empirically effective, and is gaining popularity (consider, e.g., Tan and Roy 2019). The Atlantic Causal Inference Conference (ACIC) Data Challenge has confirmed BART as one of the best regression methods for causal inference (Dorie et al. 2019; Gruber et al. 2019; Hahn, Dorie, and Murray 2019; Thal and Finucane 2023). Many BART variants have been developed throughout the years, adding features such as variable selection (Linero 2018).