A note on the consistency of the random forest algorithm
Nowadays, the random forest algorithm is acknowledged to be easy to use and to perform very well in general, even in problems involving many predictor variables (see for instance Biau and Scornet (2016) or the introduction to Scornet, Biau and Vert (2015)); so well, indeed, that several authors have posed and studied the question of its consistency (see Scornet, Biau and Vert (2015) and the earlier references provided there). Consistent nonparametric statistical predictors have been known for a long time (e.g. Nadaraya (1964), Watson (1964), Stone (1977), Devroye and Wagner (1980)), but they converge very slowly and their computer implementations tend to be slow, especially when they involve many variables. In view of their comparative accuracy and high speed of implementation, random forests would become even more attractive if they were shown to be consistent under general data-generating mechanisms. Besides, consistency is almost indispensable in applications of statistical prediction to the estimation of 'causal effects' based on observational data (e.g.
Oct-14-2019