Randomer Forests
Tomita, Tyler M., Browne, James, Shen, Cencheng, Priebe, Carey E., Burns, Randal, Maggioni, Mauro, Vogelstein, Joshua T.
Ensemble methods, particularly those based on decision trees, have recently demonstrated superior performance in a variety of machine learning settings. Specifically, several manuscripts found that Random Forest (RF) outperforms more than 100 other methods, and gradient boosting trees have been a crucial component of several recent Kaggle competition victories. Building on these successes and on recent advances in sparse learning and random matrix theory, we propose a novel ensemble tree method called "Randomer Forest" (RerF). The key intuition behind RerF is that each decision node can split on a sparse linear combination of features rather than on just one feature (as in RF) or on all of them (as in Rotation Forests). RerF significantly outperforms other methods on a standard benchmark suite containing 105 problems with varying dimension, sample size, and number of classes. Moreover, we provide an implementation that scales as efficiently as, or more efficiently than, other available packages. Via a combination of basic principles, theory, and extensive numerical experiments, we demonstrate why, when, and how RerF achieves its performance properties.
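The idea of splitting on a sparse linear combination of features can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of weights in {-1, 0, +1}, the `density` parameter, and the Gini impurity criterion are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def sparse_projection(num_features, density, rng):
    """Return a sparse weight vector with entries in {-1, 0, +1} (assumed scheme).

    Each entry is nonzero with probability `density`; one nonzero entry is
    forced if the draw comes out all zeros, so the projection is never empty.
    """
    weights = [0] * num_features
    for j in range(num_features):
        if rng.random() < density:
            weights[j] = rng.choice((-1, 1))
    if not any(weights):
        weights[rng.randrange(num_features)] = rng.choice((-1, 1))
    return weights


def project(sample, weights):
    """Project one sample onto the sparse weight vector."""
    return sum(w * x for w, x in zip(weights, sample) if w)


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(samples, labels, weights):
    """Return (impurity, threshold) of the best cut along one projection."""
    projected = [project(s, weights) for s in samples]
    values = sorted(set(projected))
    best_score, best_threshold = float("inf"), None
    # Candidate thresholds are midpoints between consecutive projected values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(projected, labels) if v <= threshold]
        right = [y for v, y in zip(projected, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_score, best_threshold = score, threshold
    return best_score, best_threshold


def choose_node_split(samples, labels, num_candidates, density, rng):
    """Try several random sparse projections and keep the purest split."""
    num_features = len(samples[0])
    best = (float("inf"), None, None)
    for _ in range(num_candidates):
        weights = sparse_projection(num_features, density, rng)
        score, threshold = best_split(samples, labels, weights)
        if threshold is not None and score < best[0]:
            best = (score, threshold, weights)
    return best
```

On a toy set such as `samples = [(-2, 1), (1, -2), (2, -1), (-1, 2)]` with `labels = [0, 0, 1, 1]`, no threshold on a single feature separates the two classes, whereas the combination with weights `[1, 1]` separates them at threshold 0. An axis-aligned split corresponds to a weight vector with exactly one nonzero entry, so this sketch contains the single-feature case as a special case.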
Mar-19-2018
- Country:
- North America > United States
  - Maryland > Baltimore (0.14)
  - Missouri > St. Louis County
    - St. Louis (0.04)
  - California
    - Yolo County > Davis (0.04)
    - Santa Cruz County > Santa Cruz (0.04)
    - San Diego County > San Diego (0.04)
- Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (0.92)
- Industry:
- Health & Medicine (0.68)
- Education (0.67)
- Government
- Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Ensemble Learning (0.87)
  - Decision Tree Learning (0.69)
  - Performance Analysis > Accuracy (0.30)