Ensemble Learning
Accelerated proximal boosting
Fouillen, Erwan, Boyer, Claire, Sangnier, Maxime
Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper proposes to build upon the proximal point algorithm when the empirical risk to minimize is not differentiable. In addition, the novel boosting approach, called accelerated proximal boosting, benefits from Nesterov's acceleration in the same way as gradient boosting [Biau et al., 2018]. Advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy.
Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female
We describe our first-place solution to the Animal Behavior Challenge (ABC 2018) on predicting gender of bird from its GPS trajectory. The task consisted in predicting the gender of shearwater based on how they navigate themselves across a big ocean. The trajectories are collected from GPS loggers attached on shearwaters' body, and represented as a variable-length sequence of GPS points (latitude and longitude), and associated meta-information, such as the sun azimuth, the sun elevation, the daytime, the elapsed time on each GPS location after starting the trip, the local time (date is trimmed), and the indicator of the day starting the from the trip. We used ensemble of several variants of Gradient Boosting Classifier along with Gaussian Process Classifier and Support Vector Classifier after extensive feature engineering and we ranked first out of 74 registered teams. The variants of Gradient Boosting Classifier we tried are CatBoost (Developed by Yandex), LightGBM (Developed by Microsoft), XGBoost (Developed by Distributed Machine Learning Community). Our approach could easily be adapted to other applications in which the goal is to predict a classification output from a variable-length sequence.
Random Forests ยท UC Business Analytics R Programming Guide
Bagging (bootstrap aggregating) regression trees is a technique that can turn a single tree model with high variance and poor predictive power into a fairly accurate prediction function. Unfortunately, bagging regression trees typically suffers from tree correlation, which reduces the overall performance of the model. Random forests are a modification of bagging that builds a large collection of de-correlated trees and have become a very popular "out-of-the-box" learning algorithm that enjoys good predictive performance. This tutorial will cover the fundamentals of random forests. This tutorial serves as an introduction to the random forests.
Shedding Light on Black Box Machine Learning Algorithms: Development of an Axiomatic Framework to Assess the Quality of Methods that Explain Individual Predictions
From self-driving vehicles and back-flipping robots to virtual assistants who book our next appointment at the hair salon or at that restaurant for dinner - machine learning systems are becoming increasingly ubiquitous. The main reason for this is that these methods boast remarkable predictive capabilities. However, most of these models remain black boxes, meaning that it is very challenging for humans to follow and understand their intricate inner workings. Consequently, interpretability has suffered under this ever-increasing complexity of machine learning models. Especially with regards to new regulations, such as the General Data Protection Regulation (GDPR), the necessity for plausibility and verifiability of predictions made by these black boxes is indispensable. Driven by the needs of industry and practice, the research community has recognised this interpretability problem and focussed on developing a growing number of so-called explanation methods over the past few years. These methods explain individual predictions made by black box machine learning models and help to recover some of the lost interpretability. With the proliferation of these explanation methods, it is, however, often unclear, which explanation method offers a higher explanation quality, or is generally better-suited for the situation at hand. In this thesis, we thus propose an axiomatic framework, which allows comparing the quality of different explanation methods amongst each other. Through experimental validation, we find that the developed framework is useful to assess the explanation quality of different explanation methods and reach conclusions that are consistent with independent research.
Ensemble Machine Learning in Python: Random Forest, AdaBoost
In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on-par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game go using deep reinforcement learning. Machine learning is even being used to program self driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.
Fine-tuning XGBoost in Python like a boss โ Towards Data Science
XGBoost (or eXteme Gradient Boosting) is not to be introduced anymore, proved relevant in only too many data science competitions, is still one model that is tricky to fine-tune if you have only been starting playing with it. Because if you have big datasets, and you run a naive grid search on 5 different parameters and having for each of them 5 possible values, then you'll have 5โต 3,125 iterations to go. If one iteration takes 10 minutes to run, you'll have more than 21 days to wait before getting your parameters (I don't talk about Python crashing, without letting you know, and you waiting too long before realizing it). I suppose here that you made correctly your job of feature engineering first. Specifically with categorical features, since XGBoost does not take categorical features in input.
Gradient and Newton Boosting for Classification and Regression
Boosting refers to a type of classification and regression algorithms that enjoy large popularity due to their excellent predictive accuracy on a wide range of datasets. The first boosting algorithms for classification, including the well known AdaBoost algorithm, were introduced by Schapire [1990], Freund and Schapire [1995], and Freund et al. [1996]. Later, several authors [Breiman, 1998, 1999, Friedman et al., 2000, Mason et al., 2000, Friedman, 2001] introduced the statistical view of boosting as a stagewise optimization approach. In particular, Friedman et al. [2000] first introduced boosting algorithms which iteratively optimize Bernoulli and multinomial likelihoods for binary and multiclass classification using Newton updates. Further, Friedman [2001] presented gradient descent based boosting algorithms for both regression and classification tasks with general loss functions.
Machine Learning Kaggle Competition Part Two: Improving
I recommend against the "lone genius" path, not only because it's exceedingly lonely, but also because you will miss out on the most important part of a Kaggle competition: learning from other data scientists. If you work by yourself, you end up relying on the same old methods while the rest of the world adopts more efficient and accurate techniques. As a concrete example, I recently have been dependent on the random forest model, automatically applying it to any supervised machine learning task. This competition finally made me realize that although the random forest is a decent starting model, everyone else has moved on to the superior gradient boosting machine. I also don't recommend the "copy and paste" approach, not because I'm against using other's code (with proper attribution), but because you are still limiting your chances to learn.
Kalman Filter-based Heuristic Ensemble: A New Perspective on Ensemble Classification Using Kalman Filters
Pakrashi, Arjun, Mac Namee, Brian
A classifier ensemble is a combination of multiple diverse classifier models whose outputs are aggregated into a single prediction. Ensembles have been repeatedly shown to perform better than single classifier models, therefore ensembles has been always a subject of research. The objective of this paper is to introduce a new perspective on ensemble classification by considering the training of the ensemble as a state estimation problem. The state is estimated using noisy measurements, and these measurements are then combined using a Kalman filter, within which heuristics are used. An implementation of this perspective, Kalman Filter based Heuristic Ensemble (KFHE), is also presented in this paper. Experiments performed on several datasets, indicate the effectiveness and the potential of KFHE when compared with boosting and bagging. Moreover, KFHE was found to perform comparatively better than bagging and boosting in the case of datasets with noisy class label assignments.
Tuning xgboost in R: Part I
Tuning a Boosting algorithm for the first time may be a very confusing task. There are so many parameters to choose and they all have different behaviour on the results. Also, the best choice may depends on the data. Every time I get a new dataset I learn something new. A good understanding of classification and regression trees (CART) is also helpful because we will be boosting trees, you can start here if you have no idea of what a CART is.