How to use XGBoost algorithm in R in easy steps

#artificialintelligence

Did you know that XGBoost is one of the most popular winning recipes in data science competitions? So, what makes it more powerful than a traditional Random Forest or Neural Network? In the last few years, predictive modeling has become much faster and more accurate. I remember spending long hours on feature engineering to improve a model by a few decimal points. A lot of that difficult work can now be done by using better algorithms.


dmlc/xgboost

#artificialintelligence

This page contains a curated list of examples, tutorials, and blog posts about XGBoost use cases. It is inspired by awesome-MXNet, awesome-php and awesome-machine-learning. Please send a pull request if you find things that belong here. This is a list of short code examples introducing different functionalities of the xgboost packages. Most of the examples in this section are based on the CLI or Python version.
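
As an illustration of the kind of short example such a list collects, here is a minimal sketch (not taken from the list itself) of training a binary classifier with the xgboost Python package's native API; the synthetic dataset and parameter values are assumptions for demonstration only.

```python
# Minimal sketch: train a binary classifier with xgboost's native API.
# The synthetic dataset and parameter values are illustrative assumptions.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Wrap the numpy arrays in DMatrix, xgboost's internal data structure.
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1, "eval_metric": "logloss"}
booster = xgb.train(params, dtrain, num_boost_round=100,
                    evals=[(dtest, "test")], early_stopping_rounds=10)

preds = booster.predict(dtest)  # predicted probabilities for the positive class
print("first five predicted probabilities:", preds[:5])
```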


Fine-tuning XGBoost in Python like a boss – Towards Data Science

#artificialintelligence

XGBoost (or eXtreme Gradient Boosting) needs no introduction; it has proved its relevance in all too many data science competitions, yet it remains a model that is tricky to fine-tune if you have only just started playing with it. If you have a big dataset and run a naive grid search over 5 different parameters with 5 possible values each, you end up with 5⁵ = 3,125 iterations to go through. If one iteration takes 10 minutes to run, you will wait more than 21 days before getting your parameters (not to mention Python crashing without warning, and you waiting too long before realizing it). I assume here that you have already done your feature engineering properly, in particular for categorical features, since XGBoost does not accept categorical features as input.
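
To make the combinatorics concrete, the sketch below counts the 5⁵ candidate models of such a naive grid and then runs a much smaller, tractable grid search with scikit-learn's GridSearchCV around XGBClassifier; the parameter names, value ranges, and synthetic data are illustrative assumptions, not the article's actual setup.

```python
# Sketch of the combinatorial cost of a naive grid, plus a narrower search.
# Parameter names, ranges, and data are illustrative, not the article's grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, ParameterGrid
from xgboost import XGBClassifier

# The naive grid: 5 parameters x 5 values each = 5**5 = 3,125 candidate models.
naive_grid = {
    "max_depth": [3, 4, 5, 6, 7],
    "learning_rate": [0.01, 0.03, 0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 300, 400, 500],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "colsample_bytree": [0.6, 0.7, 0.8, 0.9, 1.0],
}
print(len(ParameterGrid(naive_grid)))  # 3125

# A smaller grid over two parameters at a time keeps the search tractable.
# Categorical features would need numeric encoding first (e.g. pandas.get_dummies),
# since XGBoost expects numeric input, as noted above.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
search = GridSearchCV(
    XGBClassifier(n_estimators=200),
    param_grid={"max_depth": [3, 5, 7], "learning_rate": [0.03, 0.1, 0.3]},
    cv=3,
    scoring="roc_auc",
)
search.fit(X, y)
print(search.best_params_)
```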


Why isn't XGBoost a more popular research topic? • /r/MachineLearning

@machinelearnbot

These are well-understood models. They are gradient boosted trees with some optimizations. The theory has been around for decades, so there is not much left to uncover; the library is really an application of established research. There is also a lull in new research ideas related to trees. Perhaps you can come up with some new techniques to improve trees further and usher in a new era of tree-based model research?
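
As a rough illustration of how old the core idea is, here is a minimal from-scratch sketch of gradient boosting with shallow regression trees fit to residuals; it is a toy version of the technique, not XGBoost's actual implementation, and the dataset and hyperparameters are arbitrary assumptions.

```python
# Toy sketch of gradient boosting: each new shallow tree is fit to the residuals
# (the negative gradient of squared error) of the current ensemble. Library-level
# optimizations (regularization, sparsity-aware splits, parallelism) are what
# XGBoost adds on top of this decades-old idea.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

learning_rate = 0.1
prediction = np.full(y.shape, y.mean())  # start from the mean prediction
trees = []

for _ in range(100):
    residuals = y - prediction            # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("training MSE:", np.mean((y - prediction) ** 2))
```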