Lassoed Forests: Random Forests with Adaptive Lasso Post-selection

Shang, Jing, Bannon, James, Haibe-Kains, Benjamin, Tibshirani, Robert

arXiv.org Machine Learning 

Tree-based methods are a family of non-parametric approaches to supervised learning. Random forests use a form of bootstrap aggregation, or bagging, to combine a large collection of trees into a final prediction. In regression problems, the forest gives each tree equal weight and averages their predictions; in classification problems, it assigns class labels by majority vote. However, since a single-tree model is known to have high variance, a large number of trees must be trained and aggregated to reduce the variance (Hastie et al. 2009). This can lead to redundant trees, since the bootstrap procedure may select similar sets of samples to train different trees. Moreover, increasing the number of trees does not reduce the bias. Post-selection boosting random forests, proposed by Wang & Wang (2021), attempts to reduce bias by applying Lasso regression (Tibshirani 1996) to the predictions of the individual trees. The method returns a sparser forest with fewer trees, with a different weight assigned to each retained tree.
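The post-selection idea described above can be sketched as follows: train a forest, collect each tree's predictions as a feature column, and fit a Lasso over those columns so that trees with zero coefficients are pruned and the rest are reweighted. This is a minimal illustration using plain (not adaptive) Lasso and scikit-learn; the dataset, hyperparameters, and variable names are illustrative assumptions, not the authors' implementation.

```python
# Sketch: Lasso post-selection on a random forest's per-tree predictions.
# Plain Lasso stands in for the adaptive Lasso used in the paper.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Each column is one tree's prediction; the forest's usual output is the row mean.
P_train = np.column_stack([t.predict(X_train) for t in forest.estimators_])
P_test = np.column_stack([t.predict(X_test) for t in forest.estimators_])

# Lasso assigns a (possibly zero) weight to each tree; positive=True keeps
# the reweighting interpretable as a sparse, non-negative ensemble.
lasso = Lasso(alpha=0.1, positive=True)
lasso.fit(P_train, y_train)

n_kept = int(np.sum(lasso.coef_ != 0))
print(f"trees retained: {n_kept} of {forest.n_estimators}")
```

At prediction time, only the trees with nonzero coefficients need to be evaluated (`lasso.predict(P_test)`), which is where the sparser, reweighted forest pays off.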
