betaboosting
BetaBoosting
At this point, we all know of XGBoost due to the massive success it has had in numerous Data Science competitions held on platforms like Kaggle. Along with its success, we have seen several variations such as CatBoost and LightGBM. All of these implementations are based on the Gradient Boosting algorithm developed by Friedman¹, which involves iteratively building an ensemble of weak learners (usually decision trees) where each subsequent learner is trained on the previous learner's errors. Let's take a look at some general pseudo-code for the algorithm from Elements of Statistical Learning²: However, this is not complete! A core mechanism which allows boosting to work is a shrinkage parameter that penalizes each learner at each boosting round that is commonly called the'learning rate'.