Forest Guided Smoothing

Verdinelli, Isabella, Wasserman, Larry

arXiv.org Machine Learning 

Random forests are often an accurate method for nonparametric regression but they are notoriously difficult to interpret. Also, it is difficult to construct standard errors, confidence intervals and meaningful measures of variable importance. In this paper, we construct a spatially adaptive local linear smoother that approximates the forest. Our approach builds on the ideas in Bloniarz et al. (2016) and Friedberg et al. (2020). The main difference is that we define a one parameter family of bandwidth matrices which help with the construction of confidence intervals, and measures of variable importance. Our starting point is the well-known fact that a random forest can be regarded as a type of kernel smoother (Breiman (2000); Scornet (2016); Lin and Jeon (2006); Geurts et al. (2006); Hothorn et al. (2004); Meinshausen (2006)). We take it as a given that the forest is an accurate predictor and we do not make any attempt to improve the method. Instead, we want to find a family of linear smoothers that approximate the forest. Then we show how to use this family for interpretation, bias correction, confidence intervals, variable importance and for exploring the structure of the forest.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found