Collaborating Authors

Ensemble Machine Learning Algorithms in Python with scikit-learn - Machine Learning Mastery


Ensembles can give you a boost in accuracy on your dataset. In this post you will discover how you can create some of the most powerful types of ensembles in Python using scikit-learn. This case study will step you through Boosting, Bagging and Majority Voting and show you how you can continue to ratchet up the accuracy of the models on your own datasets. Ensemble Machine Learning Algorithms in Python with scikit-learn Photo by The United States Army Band, some rights reserved. It assumes you are generally familiar with machine learning algorithms and ensemble methods and that you are looking for information on how to create ensembles in Python.

A Comprehensive Guide to Ensemble Learning - What Exactly Do You Need to Know -


Ensemble learning techniques have been proven to yield better performance on machine learning problems. We can use these techniques for regression as well as classification problems. The final prediction from these ensembling techniques is obtained by combining results from several base models. Averaging, voting and stacking are some of the ways the results are combined to obtain a final prediction. In this article, we will explore how ensemble learning can be used to come up with optimal machine learning models. Ensemble learning is a combination of several machine learning models in one problem.

Stacking Ensemble Machine Learning With Python


Stacking or Stacked Generalization is an ensemble machine learning algorithm. It uses a meta-learning algorithm to learn how to best combine the predictions from two or more base machine learning algorithms. The benefit of stacking is that it can harness the capabilities of a range of well-performing models on a classification or regression task and make predictions that have better performance than any single model in the ensemble. In this tutorial, you will discover the stacked generalization ensemble or stacking in Python. Stacking Ensemble Machine Learning With Python Photo by lamoix, some rights reserved.

Bagging and Random Forest for Imbalanced Classification


Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random forest is an extension of bagging that also randomly selects subsets of features used in each data sample. Both bagging and random forests have proven effective on a wide range of different predictive modeling problems. Although effective, they are not suited to classification problems with a skewed class distribution. Nevertheless, many modifications to the algorithms have been proposed that adapt their behavior and make them better suited to a severe class imbalance. In this tutorial, you will discover how to use bagging and random forest for imbalanced classification.

How to Develop a Gradient Boosting Machine Ensemble in Python


The Gradient Boosting Machine is a powerful ensemble machine learning algorithm that uses decision trees. Boosting is a general ensemble technique that involves sequentially adding models to the ensemble where subsequent models correct the performance of prior models. AdaBoost was the first algorithm to deliver on the promise of boosting. Gradient boosting is a generalization of AdaBoosting, improving the performance of the approach and introducing ideas from bootstrap aggregation to further improve the models, such as randomly sampling the samples and features when fitting ensemble members. Gradient boosting performs well, if not the best, on a wide range of tabular datasets, and versions of the algorithm like XGBoost and LightBoost often play an important role in winning machine learning competitions.