AITopics | Ensemble Learning


Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)


Classification in the presence of missing data

#artificialintelligence Aug-20-2016, 02:36:05 GMT

Missing data is quite common when dealing with real-world datasets. There are several ways to improve prediction accuracy when some predictors have missing values, without discarding the entire observation. This example shows how decision trees with surrogate splits can be used to improve prediction accuracy in the presence of missing data. Bagging (bootstrap aggregating) is an ensemble approach that involves training several weak learners to create a strong classifier. A value that decreases with the number of trees indicates good performance.
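The bagging idea described above can be sketched in a few lines. This is a minimal illustration, not the method from the linked example: it uses one-split decision stumps as the weak learners, assumes complete numeric data, and does not implement surrogate splits or any other missing-value handling. The function names (`fit_stump`, `bagging_fit`, `bagging_predict`) are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split classifier; returns (feature, threshold, left_label, right_label)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            # Each side predicts its majority label; an empty right side reuses the left label.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(v != left_label for v in left) + sum(v != right_label for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


def bagging_fit(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def bagging_predict(stumps, row):
    """Aggregate the weak learners by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    y = [0, 0, 0, 1, 1, 1]
    model = bagging_fit(X, y)
    print(bagging_predict(model, [0.0]), bagging_predict(model, [5.0]))
```

Each stump sees a slightly different resample of the data, so their individual mistakes differ, and the vote averages them out.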

classification, data quality, machine learning, (4 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.31)


XGBoost With Python - Machine Learning Mastery

#artificialintelligence Aug-18-2016, 19:06:11 GMT

XGBoost is the dominant technique for predictive modeling on regular data. The gradient boosting algorithm has proven to be one of the top techniques on a wide range of predictive modeling problems, and the XGBoost implementation has proven to be the fastest available for use in applied machine learning. When asked, the best machine learning competitors in the world recommend using XGBoost. In this new Ebook written in the friendly Machine Learning Mastery style that you're used to, learn exactly how to get started and bring XGBoost to your own machine learning projects. The Gradient Boosting algorithm has been around since 1999. So why is it so popular right now?

artificial intelligence, machine learning, xgboost, (17 more...)

#artificialintelligence

Genre: Instructional Material (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)


How to Develop Your First XGBoost Model in Python with scikit-learn - Machine Learning Mastery

#artificialintelligence Aug-18-2016, 19:06:06 GMT

XGBoost is an implementation of gradient boosted decision trees designed for speed and performance that is dominating competitive machine learning. In this post you will discover how you can install and create your first XGBoost model in Python. XGBoost is a high-performance implementation of gradient boosting that you can now access directly in Python. Assuming you have a working SciPy environment, XGBoost can be installed easily using pip.

artificial intelligence, machine learning, xgboost model, (15 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)


Improving Predictions with Ensemble Model

#artificialintelligence Aug-16-2016, 01:15:31 GMT

"Alone we can do so little and together we can do much," a phrase from Helen Keller in the 1950s, reflects decades of real-life achievements and success stories. The same applies to most high-impact innovations and advanced technologies. Machine learning is in the same race to make predictions and classifications more accurate using so-called ensemble methods, and ensemble modeling has proved to be one of the most convincing ways to build highly accurate predictive models. Ensemble methods are learning models that achieve performance by combining the opinions of multiple learners. Typically, an ensemble model is a supervised learning technique that combines multiple weak learners or models to produce a strong learner, using bagging and boosting for data sampling.
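A small calculation shows why combining weak learners can help. The sketch below assumes, purely for illustration, that each of n binary classifiers is correct independently with the same probability p, and computes the chance that a strict majority is correct using the binomial distribution. Real ensemble members are correlated, so this is an optimistic bound rather than a prediction; the function name is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_learners, p_correct):
    """Probability that a strict majority of n independent learners is correct."""
    if n_learners % 2 == 0:
        raise ValueError("use an odd number of learners to avoid ties")
    k_min = n_learners // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_learners, k) * p_correct**k * (1 - p_correct) ** (n_learners - k)
        for k in range(k_min, n_learners + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 11, 51):
        print(n, round(majority_vote_accuracy(n, 0.6), 4))
```

Under the independence assumption, accuracy grows with n whenever p is above 0.5 and shrinks when p is below 0.5, which is why the weak learners need to be at least slightly better than chance.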

artificial intelligence, learner, machine learning, (12 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.55)


Gradient Boosting Interactive Playground

#artificialintelligence Jul-22-2016, 01:41:27 GMT

This is an interactive demonstration and explanation of the gradient boosting algorithm applied to a classification problem. Boosting makes a decision ('blue' or 'orange') by iteratively building many simpler classification algorithms (decision trees in our case). There are many other things about GB that you can find out from this demo.

artificial intelligence, interactive playground, machine learning, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.80)


WhizzML: Level Up

#artificialintelligence Jul-5-2016, 16:15:27 GMT

Sure, you can use WhizzML to fill in missing values or to do some basic data cleaning, but what if you want to go crazy? WhizzML is a fully-fledged programming language, after all. We can go as far down the rabbit hole as we want. As we've mentioned before, one of the great things about writing programs in WhizzML is access to highly-scalable, library-free machine learning. To put it another way, cloud-based machine learning operations (learn an ensemble, create a dataset, etc.) are primitives built into the language.

artificial intelligence, machine learning, whizzml, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.33)


Forest Floor Visualizations of Random Forests

Welling, Soeren H., Refsgaard, Hanne H. F., Brockhoff, Per B., Clemmensen, Line H.

arXiv.org Machine Learning Jul-4-2016

We propose a novel methodology, forest floor, to visualize and interpret random forest (RF) models. RF is a popular and useful tool for non-linear multi-variate classification and regression, which yields a good trade-off between robustness (low variance) and adaptiveness (low bias). Direct interpretation of an RF model is difficult, as the explicit ensemble model of hundreds of deep trees is complex. Nonetheless, it is possible to visualize an RF model fit by its mapping from feature space to prediction space. Hereby the user is first presented with the overall geometrical shape of the model structure, and when needed one can zoom in on local details. Dimensional reduction by projection is used to visualize high dimensional shapes. The traditional method to visualize RF model structure, partial dependence plots, achieves this by averaging multiple parallel projections. We suggest first using feature contributions, a method to decompose trees by splitting features, and then subsequently performing projections. The advantage of forest floor over partial dependence plots is that interactions are not masked by averaging. As a consequence, it is possible to locate interactions, which are not visualized in a given projection. Furthermore, we introduce: a goodness-of-visualization measure, use of colour gradients to identify interactions and an out-of-bag cross validated variant of feature contributions.
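To make the phrase "decompose trees by splitting features" concrete, here is a minimal sketch of one common way feature contributions are computed for a single regression tree: walk the decision path and credit each change in node mean to the feature that was split on. The nested-dict tree layout and the toy values are assumptions for illustration, and the out-of-bag cross-validated variant mentioned in the abstract is not reproduced here.

```python
def feature_contributions(tree, row):
    """Return (bias, contributions, prediction) for one row and one tree.

    Each internal node is a dict with "value" (mean target at the node),
    "feature", "threshold", "left" and "right"; a leaf has only "value".
    """
    contributions = {}
    node = tree
    bias = node["value"]  # root mean: the prediction before any split
    while "feature" in node:
        feature = node["feature"]
        child = node["left"] if row[feature] <= node["threshold"] else node["right"]
        # Credit the change in node mean to the feature used for this split.
        contributions[feature] = contributions.get(feature, 0.0) + child["value"] - node["value"]
        node = child
    return bias, contributions, node["value"]


# Hand-built toy tree (hypothetical values).
toy_tree = {
    "value": 10.0, "feature": "x1", "threshold": 0.5,
    "left": {"value": 6.0},
    "right": {
        "value": 14.0, "feature": "x2", "threshold": 2.0,
        "left": {"value": 12.0},
        "right": {"value": 16.0},
    },
}

if __name__ == "__main__":
    bias, contrib, pred = feature_contributions(toy_tree, {"x1": 1.0, "x2": 3.0})
    print(bias, contrib, pred)  # 10.0 {'x1': 4.0, 'x2': 2.0} 16.0
```

By construction the bias plus the summed contributions equals the tree's prediction. For a forest, the same quantities would be averaged over trees, giving per-feature values that can then be plotted against the feature itself.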

artificial intelligence, feature contribution, machine learning, (17 more...)

arXiv.org Machine Learning

1605.09196

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)


Great machine learning starts with resourceful feature engineering

#artificialintelligence Jul-2-2016, 19:51:35 GMT

I recently read an article in which the winner of a Kaggle Competition was not shy about sharing his technique for winning not one, but several of the analytical competitions. "I always use Gradient Boosting," he said. And then added, "but the key is Feature Engineering." A couple days later, a friend who read the same article called and asked, "What is this Feature Engineering that he's talking about?" It was a timely question, as I was in the process of developing a risk model for a client, and specifically, I was working through the stage of Feature Engineering.

artificial intelligence, feature engineering, machine learning, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.36)


ledell/useR-machine-learning-tutorial

#artificialintelligence Jul-1-2016, 11:11:03 GMT

Instructions for how to install the necessary software for this tutorial are available here. Data for the tutorial can be downloaded by running ./data/get-data.sh (requires wget). Certain algorithms don't scale well when there are millions of features. For example, decision trees require computing some sort of metric (to determine the splits) on all the feature values (or some fraction of the values as in Random Forest and Stochastic GBM). Therefore, computation time is linear in the number of features. Algorithms can deal with data sparsity (where many of the feature values are zero) in different ways.

algorithm, artificial intelligence, machine learning, (9 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)


Combining Gradient Boosting Machines with Collective Inference to Predict Continuous Values

Alodah, Iman, Neville, Jennifer

arXiv.org Machine Learning Jul-1-2016

Gradient boosting of regression trees is a competitive procedure for learning predictive models of continuous data that fits the data with an additive non-parametric model. The classic version of gradient boosting assumes that the data is independent and identically distributed. However, relational data with interdependent, linked instances is now common and the dependencies in such data can be exploited to improve predictive performance. Collective inference is one approach to exploit relational correlation patterns and significantly reduce classification error. However, much of the work on collective learning and inference has focused on discrete prediction tasks rather than continuous. In this work, we investigate how to combine these two paradigms together to improve regression in relational domains. Specifically, we propose a boosting algorithm for learning a collective inference model that predicts a continuous target variable. In the algorithm, we learn a basic relational model, collectively infer the target values, and then iteratively learn relational models to predict the residuals. We evaluate our proposed algorithm on a real network dataset and show that it outperforms alternative boosting methods. However, our investigation also revealed that the relational features interact together to produce better predictions.
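The "iteratively learn models to predict the residuals" step can be sketched for the plain, non-relational case. The code below is classic gradient boosting with squared loss on a single numeric feature, using regression stumps as base learners. It does not include relational features or collective inference, so it is not the algorithm proposed in the paper; all function names and the default parameters are assumptions for illustration.

```python
def fit_regression_stump(x, residuals):
    """Best single-threshold split under squared error; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(x))[:-1]:  # skip the maximum so the right side is never empty
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    if best is None:  # only one distinct x value: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (x[0], mean, mean)
    return best[1:]


def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost_fit(x, y, n_rounds=50, learning_rate=0.1):
    """Additive model: start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_regression_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * stump_predict(stump, xi)
                       for xi, pi in zip(x, predictions)]
    return base, learning_rate, stumps


def boost_predict(model, xi):
    base, learning_rate, stumps = model
    return base + sum(learning_rate * stump_predict(s, xi) for s in stumps)


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4, 5]
    y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
    model = boost_fit(x, y)
    print(boost_predict(model, 0), boost_predict(model, 5))
```

The learning rate shrinks each stump's correction, so the fit approaches the targets gradually over many rounds instead of in one step.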

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1607.0011

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.96)
