AITopics | Ensemble Learning


Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)


Predicting animal adoption with Random Forest, SVM

@machinelearnbot, May-4-2018, 05:35:16 GMT

Joanne Lin, a student at Thinkful's data science bootcamp, decided to jump in and find insights that can help shelters get more pets rescued.

animal adoption, artificial intelligence, decision tree learning, (3 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)


An Evaluation of Classification and Outlier Detection Algorithms

Hodge, Victoria J., Austin, Jim

arXiv.org Machine Learning, May-2-2018

This paper evaluates algorithms for classification and outlier detection accuracies in temporal data. We focus on algorithms that train and classify rapidly and can be used for systems that need to incorporate new data regularly. Hence, we compare the accuracy of six fast algorithms using a range of well-known time-series datasets. The analyses demonstrate that the choice of algorithm is task and data specific but that we can derive heuristics for choosing. Gradient Boosting Machines are generally best for classification but there is no single winner for outlier detection though Gradient Boosting Machines (again) and Random Forest are better. Hence, we recommend running evaluations of a number of algorithms using our heuristics.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

1805.00811

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.76)


RFCDE: Random Forests for Conditional Density Estimation

Pospisil, Taylor, Lee, Ann B.

arXiv.org Machine Learning, May-2-2018

Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or classification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Python versions, which call a common C++ library, are available.

artificial intelligence, density estimation, machine learning, (15 more...)

arXiv.org Machine Learning

1804.05753

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.96)


Gradient Boosting vs Random Forest – Abolfazl Ravanshad – Medium

#artificialintelligence, Apr-28-2018, 09:30:57 GMT

In this post, I am going to compare two popular ensemble methods, Random Forests (RF) and Gradient Boosting Machine (GBM). Both GBM and RF are ensemble learning methods that predict (regression or classification) by combining the outputs from individual trees (we assume tree-based GBM or GBT). They have all the strengths and weaknesses of the ensemble methods mentioned in my previous post. So, here we compare them only with respect to each other. GBM and RF differ in the order in which the trees are built and in the way the results are combined.
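The contrast between the two methods can be illustrated with a minimal sketch. It is not taken from the post: it assumes one-split regression trees (stumps) on a single feature and squared-error loss, and the names `fit_stump`, `fit_bagged`, and `fit_boosted` are hypothetical. The bagged model builds its trees independently on bootstrap samples and averages them, which is the Random Forest style of combination (a real Random Forest also samples features at each split, omitted here because there is only one feature). The boosted model builds its trees in sequence, each one fitted to the residuals left by the trees before it, and sums their shrunken outputs.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by exhaustive search over thresholds."""
    best = None
    # Exclude the largest x so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        # All x values are identical: fall back to predicting the mean.
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """RF-style: independent trees on bootstrap samples, combined by averaging."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


def fit_boosted(xs, ys, n_trees=25, lr=0.5):
    """GBM-style: trees built one after another on the current residuals."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    trees = []
    for _ in range(n_trees):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Shrink each tree's contribution by the learning rate.
        residuals = [r - lr * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: base + lr * sum(tree(x) for tree in trees)


# Toy data: a step function with three levels.
xs = list(range(10))
ys = [0, 0, 0, 1, 1, 1, 1, 3, 3, 3]
bagged, boosted = fit_bagged(xs, ys), fit_boosted(xs, ys)
print([round(bagged(x), 2) for x in xs])
print([round(boosted(x), 2) for x in xs])
```

In this sketch the bagged trees could be trained in parallel because none depends on another, whereas each boosted tree needs the residuals produced by its predecessors.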

application, artificial intelligence, machine learning, (7 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)


Boosting and Bagging: How To Develop A Robust Machine Learning Algorithm

#artificialintelligence, Apr-27-2018, 23:11:48 GMT

Machine learning and data science require more than just throwing data into a Python library and using whatever comes out. Data scientists need to actually understand the data and the processes behind the data to be able to implement a successful system. One key to implementation is knowing when a model might benefit from bootstrapping methods. These are what are called ensemble models. Some examples of ensemble models are AdaBoost and Stochastic Gradient Boosting. They can help improve algorithm accuracy or improve the robustness of a model.
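As a rough illustration of the bootstrapping step such models rely on, the sketch below draws one bootstrap sample (n draws with replacement from n examples) and measures how many distinct examples end up in it. The function names are hypothetical and nothing here comes from the article itself. For large n, about 63.2% of the examples are expected to appear in a given sample, since 1 - (1 - 1/n)^n approaches 1 - 1/e; the rest are left "out of bag".

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices uniformly with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]


def in_bag_fraction(n, seed=0):
    """Fraction of distinct examples that appear in one bootstrap sample."""
    rng = random.Random(seed)
    return len(set(bootstrap_indices(n, rng))) / n


print(in_bag_fraction(1000))
```

Because every sample leaves out a different subset of examples, models trained on different bootstrap samples see slightly different data, which is what gives a bagged ensemble its diversity.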

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)


Interpretable Machine Learning with XGBoost – Towards Data Science

#artificialintelligence, Apr-24-2018, 02:26:57 GMT

This is a story about the danger of interpreting your machine learning model incorrectly, and the value of interpreting it correctly. If you have found the robust accuracy of ensemble tree models such as gradient boosting machines or random forests attractive, but also need to interpret them, then I hope you find this informative and helpful. Imagine we are tasked with predicting a person's financial status for a bank. The more accurate our model, the more money the bank makes, but since this prediction is used for loan applications we are also legally required to provide an explanation for why a prediction was made. After experimenting with several model types, we find that gradient boosted trees as implemented in XGBoost give the best accuracy.

artificial intelligence, feature attribution method, machine learning, (14 more...)

#artificialintelligence

Industry: Banking & Finance (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)


An Ensemble Generation Method Based on Instance Hardness

Walmsley, Felipe N., Cavalcanti, George D. C., Oliveira, Dayvid V. R., Cruz, Rafael M. O., Sabourin, Robert

arXiv.org Artificial Intelligence, Apr-19-2018

In Machine Learning, ensemble methods have been receiving a great deal of attention. Techniques such as Bagging and Boosting have been successfully applied to a variety of problems. Nevertheless, such techniques are still susceptible to the effects of noise and outliers in the training data. We propose a new method for the generation of pools of classifiers based on Bagging, in which the probability of an instance being selected during the resampling process is inversely proportional to its instance hardness, which can be understood as the likelihood of an instance being misclassified, regardless of the choice of classifier. The goal of the proposed method is to remove noisy data without sacrificing the hard instances which are likely to be found on class boundaries. We evaluate the performance of the method on nineteen public data sets, and compare it to the performance of the Bagging and Random Subspace algorithms. Our experiments show that in high noise scenarios the accuracy of our method is significantly better than that of Bagging. Ensemble methods [1] [2] [3] are techniques that combine multiple predictors trained independently, using a combination of the outputs of each predictor as the final output. This is in contrast to traditional Machine Learning methods, which train a single classifier on the whole of the training set.
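The resampling idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the hardness scores are taken as given (the abstract does not say how they are estimated), the selection weight is assumed to be 1 / (hardness + eps) with a small `eps` to avoid division by zero, and the function name `hardness_weighted_bootstrap` is hypothetical.

```python
import random
from collections import Counter


def hardness_weighted_bootstrap(hardness, sample_size, seed=0, eps=1e-6):
    """Sample indices with replacement, favouring low-hardness instances.

    Assumption: selection weight = 1 / (hardness + eps), so the selection
    probability is (approximately) inversely proportional to hardness.
    """
    rng = random.Random(seed)
    weights = [1.0 / (h + eps) for h in hardness]
    return rng.choices(range(len(hardness)), weights=weights, k=sample_size)


# Hypothetical scores: instances 0-2 look easy, instance 3 looks hard or noisy.
hardness = [0.1, 0.1, 0.2, 0.9]
sample = hardness_weighted_bootstrap(hardness, sample_size=1000)
print(Counter(sample))
```

Each classifier in the pool would then be trained on one such sample, so instances that are likely to be misclassified appear less often but are not excluded outright.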

algorithm, bagging algorithm, classifier, (14 more...)

arXiv.org Artificial Intelligence

1804.07419

Country:

South America > Brazil > Pernambuco (0.04)
North America > Canada > Quebec (0.04)

Genre: Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)


Can Machine Learning Improve Recession Prediction?

#artificialintelligence, Apr-18-2018, 07:22:05 GMT

Big data utilization in economics and the financial world has increased with every passing day. In previous reports, we have discussed issues and opportunities related to big data applications in economics/finance. This report outlines a framework to utilize machine learning and statistical data mining tools in the economics/financial world with the goal of more accurately predicting recessions. Decision makers have a vital interest in predicting future recessions in order to enact appropriate policy. Therefore, to help decision makers, we raise the question: Do machine learning and statistical data mining improve recession prediction accuracy?

machine learning, random forest approach, recession prediction, (9 more...)

#artificialintelligence

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.30)


Exact Distributed Training: Random Forest with Billions of Examples

Guillame-Bert, Mathieu, Teytaud, Olivier

arXiv.org Machine Learning, Apr-18-2018

We introduce an exact distributed algorithm to train Random Forest models as well as other decision forest models without relying on approximating best split search. We explain the proposed algorithm and compare it to related approaches for various complexity measures (time, RAM, disk, and network complexity analysis). We report its running performance on artificial and real-world datasets of up to 18 billion examples. This figure is several orders of magnitude larger than datasets tackled in the existing literature. Finally, we empirically show that Random Forest benefits from being trained on more data, even in the case of already gigantic datasets. Given a dataset with 17.3B examples with 82 features (3 numerical, other categorical with high arity), our implementation trains a tree in 22h.

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Machine Learning

1804.06755

Country:

Europe (0.68)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)


LEARNING PATH: R: Machine Learning Algorithms with R

@machinelearnbot, Apr-16-2018, 14:05:17 GMT

Are you interested in exploring advanced algorithm concepts such as random forest, vector machine, K-nearest, and more through real-world examples? Packt's Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it. Machine learning and data science are some of the top buzzwords in the technical world today. Machine learning, the application and science of algorithms that make sense of data, is the most exciting field of all the computer sciences! It explores the study and construction of algorithms that can learn from and make predictions on data.

algorithm, learning path, machine learning algorithm, (3 more...)

@machinelearnbot

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.42)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
