Randomization as Regularization: A Degrees of Freedom Explanation for Random Forest Success

The sustained success of random forests has naturally led to a desire to better understand the statistical and mathematical properties of the procedure. Lin and Jeon (2006) introduced the potential nearest neighbor framework, and Biau and Devroye (2010) later established related consistency properties. In the last several years, a number of important statistical properties of random forests have also been established when base learners are constructed with subsamples rather than bootstrap samples. Scornet et al. (2015) provided the first consistency result for Breiman's original random forest algorithm, under the assumption that the true underlying regression function is additive. Despite the impressive volume of research over the past two decades and the exciting recent progress in establishing these statistical properties, a satisfying explanation for the sustained empirical success of random forests has yet to be provided.
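To make the subsample-versus-bootstrap distinction concrete, the sketch below contrasts the two resampling regimes for a tree ensemble using scikit-learn's `BaggingRegressor` (bootstrap draws n observations with replacement; the subsampling regime in which many recent consistency results are proved draws a fraction without replacement). The synthetic data, the 0.5 subsampling fraction, and the ensemble size are illustrative choices, not taken from any of the cited papers.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Illustrative synthetic data with an additive signal (the setting of
# Scornet et al.'s consistency result); all parameters here are assumptions.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 5))
y = X[:, 0] + np.sin(3 * X[:, 1]) + 0.1 * rng.standard_normal(500)

# Bootstrap regime: each tree is fit on n draws WITH replacement,
# as in Breiman's original bagging/random forest construction.
boot = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=100,
    bootstrap=True, random_state=0,
).fit(X, y)

# Subsampling regime: each tree is fit on a fraction of the data drawn
# WITHOUT replacement, the setting of many recent theoretical analyses.
subs = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=100,
    bootstrap=False, max_samples=0.5, random_state=0,
).fit(X, y)

print(boot.score(X, y), subs.score(X, y))
```

In practice the two regimes often give similar predictive accuracy; the appeal of subsampling is analytical, since draws without replacement are easier to handle in proofs.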