AITopics | Ensemble Learning

Collaborating Authors

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

LinXGBoost: Extension of XGBoost to Generalized Local Linear Models

de Vito, Laurent

arXiv.org Machine LearningOct-10-2017

XGBoost is often presented as the algorithm that wins every ML competition. Surprisingly, this is true even though predictions are piecewise constant. This might be justified in high dimensional input spaces, but when the number of features is low, a piecewise linear model is likely to perform better. XGBoost was extended into LinXGBoost that stores at each leaf a linear model. This extension, equivalent to piecewise regularized least-squares, is particularly attractive for regression of functions that exhibits jumps or discontinuities. Those functions are notoriously hard to regress. Our extension is compared to the vanilla XGBoost and Random Forest in experiments on both synthetic and real-world data sets.

artificial intelligence, machine learning, xgboost, (18 more...)

arXiv.org Machine Learning

1710.03634

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

XGBoost, a Top Machine Learning Method on Kaggle, Explained

#artificialintelligenceOct-6-2017, 04:15:27 GMT

XGBoost has become a widely used and really popular tool among Kaggle competitors and Data Scientists in industry, as it has been battle tested for production on large-scale problems. It is a highly flexible and versatile tool that can work through most regression, classification and ranking problems as well as user-built objective functions. As an open-source software, it is easily accessible and it may be used through different platforms and interfaces. The amazing portability and compatibility of the system permits its usage on all three Windows, Linux and OS X. It also supports training on distributed cloud platforms like AWS, Azure, GCE among others and it is easily connected to large-scale cloud dataflow systems such as Flink and Spark.

artificial intelligence, machine learning, xgboost, (17 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

KNIME Analytics: a Review

#artificialintelligenceSep-26-2017, 10:10:23 GMT

This video shows a general review of the analytics capabilities of the KNIME Analytics Platform. Here we only mention: Random Forest, Deep Learning, Gradient Boosted Trees, Bagging and Boosting for ensemble methods, Decision Trees, Neural Networks, Logistic Regression, how to build your own ensemble model, and external integrations as Weka, H2O, R, and Python. This is what we show here, which for time reasons, is of course incomplete. Download and install KNIME Analytics Platform (https://www.knime.com/downloads) to explore the constantly growing set of machine learning and statistics algorithms available to analyze your data.

analytic, artificial intelligence, machine learning, (4 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

eXtremely Boost your machine learning Exercises (Part-1)

#artificialintelligenceSep-25-2017, 06:30:32 GMT

It is very powerful algorithm that use an ensemble of weak learners to obtain a strong learner. Its R implementation is available in xgboost package and it is really worth including into anyone's machine learning portfolio. This is the first part of eXtremely Boost your machine learning series. For other parts follow the tag xgboost. Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

artificial intelligence, machine learning, xgboost, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)

Add feedback

Ensemble Learning to Improve Machine Learning Results

#artificialintelligenceSep-22-2017, 15:40:44 GMT

Ensemble learning helps improve machine learning results by combining several models. This approach allows the production of better predictive performance compared to a single model. That is why ensemble methods placed first in many prestigious machine learning competitions, such as the Netflix Competition, KDD 2009, and Kaggle. The Statsbot team wanted to give you the advantage of this approach and asked a data scientist, Vadim Smolyakov, to dive into three basic ensemble learning techniques. Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), bias (boosting), or improve predictions (stacking).

artificial intelligence, learner, machine learning, (15 more...)

#artificialintelligence

Industry: Leisure & Entertainment (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.76)

Add feedback

Gradient Boosting, Decision Trees and XGBoost with CUDA Parallel Forall

@machinelearnbotSep-20-2017, 19:35:10 GMT

Gradient boosting is a powerful machine learning algorithm used to achieve state-of-the-art accuracy on a variety of tasks such as regression, classification and ranking. It has achieved notice in machine learning competitions in recent years by "winning practically every competition in the structured data category". If you don't use deep neural networks for your problem, there is a good chance you use gradient boosting. In this post I look at the popular gradient boosting algorithm XGBoost and show how to apply CUDA and parallel algorithms to greatly decrease training times in decision tree algorithms. I originally described this approach in my MSc thesis and it has since evolved to become a core part of the open source XGBoost library as well as a part of the H2O GPU Edition by H2O.ai.

algorithm, artificial intelligence, machine learning, (14 more...)

@machinelearnbot

Industry: Information Technology > Hardware (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

How to use XGBoost algorithm in R in easy steps

#artificialintelligenceSep-17-2017, 22:05:23 GMT

Did you know using XGBoost algorithm is one of the popular winning recipe of data science competitions? So, what makes it more powerful than a traditional Random Forest or Neural Network? In the last few years, predictive modeling has become much faster and accurate. I remember spending long hours on feature engineering for improving model by few decimals. A lot of that difficult work, can now be done by using better algorithms.

algorithm, artificial intelligence, machine learning, (13 more...)

#artificialintelligence

Genre: Contests & Prizes (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.76)

Add feedback

Understanding Boosted Trees Models

#artificialintelligenceSep-13-2017, 08:55:21 GMT

In the previous post, we learned about tree based learning methods - basics of tree based models and the use of bagging to reduce variance. We also looked at one of the most famous learning algorithms based on the idea of bagging- random forests. In this post, we will look into the details of yet another type of tree-based learning algorithms: boosted trees. Boosting, similar to Bagging, is a general class of learning algorithm where a set of weak learners are combined to get strong learners. For classification problems, a weak learner is defined to be a classifier which is only slightly correlated with the true classification (it can label examples better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well-correlated with the true classification. Recall that bagging involves creating multiple copies of the original training data set via bootstrapping, fitting a separate decision tree to each copy, and then combining all of the trees in order to create a single predictive model.

algorithm, artificial intelligence, machine learning, (19 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Add feedback

How to Set Up Distributed XGBoost on MapR-FS

#artificialintelligenceSep-3-2017, 03:25:09 GMT

XGBoost is a library that is designed for boosted (tree) algorithms. It has become a popular machine learning framework among data science practitioners, especially on Kaggle, which is a platform for data prediction competitions where researchers post their data and statisticians and data miners compete to produce the best models. For structured learning problems on Kaggle, it can be difficult to get into the top 10 without including XGBoost. Typically, data scientists use multi-thread single machines to train XGBoost models. Very few people have deployed XGBoost on a distributed environment and achieved good performance.

artificial intelligence, machine learning, xgboost, (17 more...)

#artificialintelligence

Industry: Education (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Add feedback

Understanding random forests with randomForestExplainer

#artificialintelligenceSep-1-2017, 12:15:33 GMT

Next, we pass it to the function plot_min_depth_distribution and under default settings obtain obtain a plot of the distribution of minimal depth for top ten variables according to mean minimal depth calculated using top trees (mean_sample "top_trees"). We could also pass our forest directly to the plotting function but if we want to make more than one plot of the minimal depth distribution is more efficient to pass the min_depth_frame to the plotting function so that it will not be calculated again for each plot (this works similarly for other plotting functions of randomForestExplainer). The function plot_min_depth_distribution offers three possibilities when it comes to calculating the mean minimal depth, which differ in he way they treat missing values that appear when a variable is not used for splitting in a tree. Note that the depth of a tree is equal to the length of the longest path from root to leave in this tree. This equals the maximum depth of a variable in this tree plus one, as leaves are by definition not split by any variable.

artificial intelligence, machine learning, minimal depth, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)

Add feedback