Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees
Luna, José Marcio; Eaton, Eric; Ungar, Lyle H.; Diffenderfer, Eric; Jensen, Shane T.; Gennatas, Efstathios D.; Wirth, Mateo; Simone, Charles B. II; Solberg, Timothy D.; Valdes, Gilmer
Classification And Regression Tree (CART) analysis (Breiman et al. [1984]) is a well-established statistical learning technique that has been adopted by numerous other fields for its model interpretability, scalability to large data sets, and connection to rule-based decision making (Loh [2014]). CART builds a model by recursively partitioning the instance space, labeling each partition with either a predicted category (in the case of classification) or a real value (in the case of regression). Despite their widespread use, CART models often have lower predictive performance than other statistical learning models, such as kernel methods and ensemble techniques (Caruana and Niculescu-Mizil [2006]). Among the latter, boosting methods were developed as a means to iteratively train an ensemble of weak learners (often CART models) into a high-performance predictive model, albeit with a loss of model interpretability. In particular, gradient boosting methods (Friedman [2001]) iteratively optimize an ensemble's prediction so that it increasingly matches the labeled training data.

Historically, these two categories of approaches, CART and gradient boosting, have been studied separately, connected primarily through the use of CART models as the weak learners in boosting. This paper investigates a deeper and surprising connection between full interaction models like CART and additive models like gradient boosting, showing that the resulting models lie on a spectrum. In particular, this paper includes the following contributions:

- We introduce tree-structured boosting (TSB) as a new mechanism for creating a hierarchical ensemble model that recursively partitions the instance space, forming a perfect binary tree of weak learners. Each path from the root node to a leaf represents the outcome of a gradient boosted stumps (GBS) ensemble for a particular partition of the instance space.
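
To make the additive end of this spectrum concrete, the following is a minimal sketch of a gradient boosted stumps (GBS) ensemble for regression. It assumes squared-error loss, so the negative gradient at each round is simply the current residual. The function names (`fit_stump`, `fit_gbs`, `predict_gbs`), the learning rate, and the toy data are hypothetical choices made for illustration and are not taken from the paper. The sketch does not implement TSB itself.

```python
from typing import List, Tuple

# A stump is (feature index, threshold, value if x <= threshold, value otherwise).
Stump = Tuple[int, float, float, float]
# A model is (initial constant prediction, learning rate, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(X: List[List[float]], r: List[float]) -> Stump:
    """Return the single-split stump with the lowest squared error on targets r."""
    mean_r = sum(r) / len(r)
    # Fallback when no split helps: every row goes left and gets the mean.
    best: Stump = (0, float("inf"), mean_r, mean_r)
    best_err = sum((ri - mean_r) ** 2 for ri in r)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((ri - lm) ** 2 for ri in left)
            err += sum((ri - rm) ** 2 for ri in right)
            if err < best_err:
                best, best_err = (j, t, lm, rm), err
    return best


def stump_predict(stump: Stump, row: List[float]) -> float:
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_gbs(X: List[List[float]], y: List[float],
            n_rounds: int = 20, learning_rate: float = 0.5) -> Model:
    """Fit an additive ensemble of stumps by repeatedly fitting the residuals."""
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual y - f(x).
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_predict(stump, row)
                for pi, row in zip(pred, X)]
    return f0, learning_rate, stumps


def predict_gbs(model: Model, row: List[float]) -> float:
    f0, learning_rate, stumps = model
    return f0 + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Toy data (an assumption for this sketch): a step function of one feature.
X_demo = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_demo = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
demo_model = fit_gbs(X_demo, y_demo)
print([round(predict_gbs(demo_model, row), 3) for row in X_demo])
```

Because every stump splits on a single feature and the stump outputs are summed, the fitted function is additive across features. A full CART model, by contrast, conditions each split on all the splits above it, which is what lets it capture interactions.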
Nov-17-2017