Lightning Fast XGBoost on Multiple GPUs

#artificialintelligence

XGBoost is one of the most used libraries for data science. When XGBoost first came into existence, it was lightning fast compared to its nearest rival, Python's scikit-learn GBM. But as time has passed, it has been rivaled by some excellent libraries like LightGBM and CatBoost, on both speed and accuracy. I, for one, use LightGBM for most use cases where I only have a CPU for training. But when I have a GPU or multiple GPUs at my disposal, I still love to train with XGBoost.
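
Multi-GPU training in XGBoost is usually driven through its Dask interface. The snippet below is a minimal sketch of that pattern, assuming a machine with NVIDIA GPUs, the dask_cuda package, and an XGBoost build with GPU support; the random dataset is purely illustrative, and tree_method="gpu_hist" is the classic parameter name (newer releases express the same thing via device="cuda").

```python
import dask.array as da
import xgboost as xgb
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

if __name__ == "__main__":
    # LocalCUDACluster starts one Dask worker per visible GPU on this machine.
    with LocalCUDACluster() as cluster, Client(cluster) as client:
        # Illustrative random data, chunked so it can be partitioned across workers.
        X = da.random.random((100_000, 50), chunks=(10_000, 50))
        y = da.random.randint(0, 2, size=(100_000,), chunks=(10_000,))

        # DaskDMatrix keeps the data distributed instead of pulling it to the client.
        dtrain = xgb.dask.DaskDMatrix(client, X, y)

        # gpu_hist selects the GPU implementation of the histogram tree method.
        output = xgb.dask.train(
            client,
            {"tree_method": "gpu_hist", "objective": "binary:logistic"},
            dtrain,
            num_boost_round=100,
        )
        booster = output["booster"]  # trained model, usable like any Booster
```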


How to use XGBoost algorithm in R in easy steps

#artificialintelligence

Did you know that the XGBoost algorithm is one of the popular winning recipes of data science competitions? So, what makes it more powerful than a traditional Random Forest or Neural Network? In the last few years, predictive modeling has become much faster and more accurate. I remember spending long hours on feature engineering to improve a model by a few decimal points. A lot of that difficult work can now be done by using better algorithms.


Fine-tuning XGBoost in Python like a boss – Towards Data Science

#artificialintelligence

XGBoost (or eXtreme Gradient Boosting) needs no introduction: it has proved its worth in many data science competitions, yet it remains a tricky model to fine-tune when you have only just started playing with it. If you have a big dataset and you run a naive grid search over 5 different parameters with 5 possible values each, you will have 5⁵ = 3,125 iterations to go. If one iteration takes 10 minutes to run, you will wait more than 21 days before getting your parameters (and I'm not even talking about Python crashing silently while you wait too long before realizing it). I assume here that you have already done your feature engineering correctly, especially for categorical features, since XGBoost does not accept categorical features as input.
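
To make the combinatorics concrete: a randomized search samples a fixed budget of parameter combinations instead of enumerating all 3,125. The sketch below shows one hedged way to do this with scikit-learn's RandomizedSearchCV on top of XGBoost's sklearn wrapper; the parameter grid, the toy dataset, and the one-hot encoding step are illustrative assumptions, not the article's exact recipe.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# Illustrative data: two numeric columns and one categorical column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "f1": rng.normal(size=500),
    "f2": rng.normal(size=500),
    "color": rng.choice(["red", "green", "blue"], size=500),
})
y = rng.integers(0, 2, size=500)

# XGBoost classically expects numeric input, so one-hot encode categoricals first.
X = pd.get_dummies(df, columns=["color"], dtype=float)

# 5 parameters x 5 values = 3,125 exhaustive combinations;
# n_iter=50 samples only 50 of them.
param_dist = {
    "max_depth": [3, 4, 5, 6, 8],
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
    "n_estimators": [100, 200, 300, 500, 800],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "colsample_bytree": [0.6, 0.7, 0.8, 0.9, 1.0],
}
search = RandomizedSearchCV(
    XGBClassifier(objective="binary:logistic"),
    param_distributions=param_dist,
    n_iter=50,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```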


Forecasting with time series imaging

arXiv.org Machine Learning

Feature-based time series representation has attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model selection and model averaging has been an emerging research focus in the forecasting community. Nonetheless, most existing approaches depend on a manual choice of an appropriate set of features. Exploiting machine learning methods to automatically extract features from time series has become crucially important in state-of-the-art time series analysis. In this paper, we introduce an automated approach to extract time series features based on images. Time series are first transformed into recurrence plots, from which local features can be extracted using computer vision algorithms. The extracted features are used for forecast model selection and model averaging. Our experiments show that forecasting based on automatically extracted features, with less human intervention and a more comprehensive view of the raw time series data, yields performance comparable to the best methods in the M4 competition, the largest forecasting competition to date.
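
As a concrete illustration of the transformation the abstract describes, a recurrence plot marks which pairs of time points have similar values, turning a 1-D series into a 2-D image. The NumPy sketch below is a minimal, hedged version of that idea; the threshold and the toy sine-wave series are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def recurrence_plot(x: np.ndarray, threshold: float) -> np.ndarray:
    """Binary recurrence matrix: R[i, j] = 1 when |x[i] - x[j]| < threshold."""
    # Pairwise absolute distances between all time points.
    dist = np.abs(x[:, None] - x[None, :])
    return (dist < threshold).astype(np.uint8)

# Toy series: a noisy sine wave.
t = np.linspace(0, 4 * np.pi, 200)
x = np.sin(t) + 0.1 * np.random.default_rng(0).normal(size=t.size)

# The resulting 200x200 image can then be fed to computer vision
# feature extractors, as the abstract describes.
R = recurrence_plot(x, threshold=0.2)
print(R.shape, R.mean())
```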