To tune or not to tune the number of trees in random forest?

May-16-2017 – arXiv.org Machine Learning

The number of trees T in the random forest (RF) algorithm for supervised learning has to be set by the user. It is controversial whether T should simply be set to the largest computationally manageable value or whether a smaller T may in some cases be better. While the principle underlying bagging is that "more trees are better", in practice the classification error rate sometimes reaches a minimum and then increases again as the number of trees grows. The goal of this paper is four-fold: (i) providing theoretical results showing that the expected error rate may be a non-monotonic function of the number of trees and explaining under which circumstances this happens; (ii) providing theoretical results showing that such non-monotonic patterns cannot be observed for other performance measures such as the Brier score and the logarithmic loss (for classification) and the mean squared error (for regression); (iii) illustrating the extent of the problem through an application to a large number (n = 306) of datasets from the public database OpenML; (iv) finally arguing in favor of setting T to a computationally feasible large number, depending on convergence properties of the desired performance measure.
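One way to see how such a pattern could arise is a toy calculation, sketched here under simplifying assumptions that are not taken from the paper: trees vote independently given the training data, the task is binary, and each observation has a fixed probability that a single tree votes for its true class. The values in `P_CORRECT` are hypothetical. With a majority vote over an odd number of trees, the error for an "easy" observation falls quickly while the error for a "hard" one (probability below 0.5) keeps rising, so their average can dip and then climb. The expected squared error of the vote fraction, by contrast, equals (1 - p)^2 + p(1 - p)/T and can only decrease in T.

```python
from math import comb

# Hypothetical per-observation probabilities that a single tree votes for the
# true class (illustrative values, not taken from the paper).
P_CORRECT = [0.9, 0.45]


def majority_vote_error(p_correct, n_trees):
    """Probability that fewer than half of n_trees (odd) vote correctly."""
    max_correct = (n_trees - 1) // 2
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(max_correct + 1)
    )


def expected_brier(p_correct, n_trees):
    """Expected squared error of the vote fraction against the true label.

    Squared bias (1 - p)^2 plus the variance p(1 - p)/T of a binomial proportion.
    """
    return (1 - p_correct) ** 2 + p_correct * (1 - p_correct) / n_trees


def mean_over_observations(metric, n_trees):
    """Average a per-observation metric over the toy observations."""
    return sum(metric(p, n_trees) for p in P_CORRECT) / len(P_CORRECT)


# Odd tree counts only, so the majority vote never ties.
tree_counts = list(range(1, 102, 2))
error_curve = {t: mean_over_observations(majority_vote_error, t) for t in tree_counts}
brier_curve = {t: mean_over_observations(expected_brier, t) for t in tree_counts}

for t in (1, 3, 5, 7, 21, 101):
    print(f"T={t:3d}  error={error_curve[t]:.4f}  brier={brier_curve[t]:.4f}")
```

In this toy setting the mean error rate is lowest at a small T and higher both before and after it, while the Brier-type curve decreases at every step. Whether real datasets behave this way depends on how the per-observation probabilities are distributed.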

artificial intelligence, decision tree learning, machine learning

Country:
- Asia > India (0.04)
- North America > United States
  - California > Monterey County > Pacific Grove (0.04)
- Europe
  - Germany > North Rhine-Westphalia
    - Upper Bavaria > Munich (0.04)
  - Albania > Durrës County
    - Durrës (0.04)

Genre:
- Research Report > New Finding (0.86)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Decision Tree Learning (1.00)
  - Ensemble Learning (0.76)
  - Performance Analysis > Accuracy (0.61)
