AITopics | Ensemble Learning

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)

Verifying Robustness of Gradient Boosted Models

Einziger, Gil, Goldstein, Maayan, Sa'ar, Yaniv, Segall, Itai

arXiv.org Artificial Intelligence, Jun-26-2019

Gradient boosted models are a fundamental machine learning technique. Robustness to small perturbations of the input is an important quality measure for machine learning models, but the literature lacks a method to prove the robustness of gradient boosted models. This work introduces VERIGB, a tool for quantifying the robustness of gradient boosted models. VERIGB encodes the model and the robustness property as an SMT formula, which enables state-of-the-art verification tools to prove the model's robustness. We extensively evaluate VERIGB on publicly available datasets and demonstrate a capability for verifying large models. Finally, we show that some model configurations tend to be inherently more robust than others.

[Figure 1: Example of the lack of robustness in a gradient boosted model trained over a traffic signs dataset. In the first row, an "80 km/h speed limit" sign is misclassified as ...]

artificial intelligence, machine learning, robustness, (17 more...)

arXiv.org Artificial Intelligence

1906.10991

Country: North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Transportation (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Application of Gradient Boosting in Order Book Modeling

#artificialintelligence, Jun-20-2019, 13:59:55 GMT

The basic measure of success is an error lower than the baseline, which indicates that the final model has good quality. The first question is how to measure quality; squared error is one option. After that, we can estimate an interval for the error with the bootstrap method.
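As a rough illustration of the idea above, the sketch below computes per-sample squared errors and a percentile bootstrap interval for their mean. The function names, the 95% level, and the number of resamples are assumptions made for this example, not details taken from the article.

```python
import random
import statistics


def squared_errors(y_true, y_pred):
    """Per-sample squared error between targets and predictions."""
    return [(t - p) ** 2 for t, p in zip(y_true, y_pred)]


def bootstrap_mse_interval(errors, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `errors`.

    Resamples the errors with replacement, records the mean of each
    resample, and returns the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)
    n = len(errors)
    means = sorted(
        statistics.fmean(rng.choices(errors, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_pred = [1.1, 1.8, 3.4, 3.9, 5.5, 5.7]
errors = squared_errors(y_true, y_pred)
low, high = bootstrap_mse_interval(errors)
```

If the whole interval for the model's error lies below the baseline's error, that is reasonable evidence that the improvement is not an accident of the particular test sample.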

artificial intelligence, baseline, machine learning, (11 more...)

#artificialintelligence

Industry: Information Technology (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)

Early Detection of Depression: Social Network Analysis and Random Forest Techniques

#artificialintelligence, Jun-15-2019, 05:46:53 GMT

Background: Major depressive disorder (MDD) or depression is among the most prevalent psychiatric disorders, affecting more than 300 million people globally. Early detection is critical for rapid intervention, which can potentially reduce the escalation of the disorder. Objective: This study used data from social media networks to explore various methods of early detection of MDD based on machine learning. We performed a thorough analysis of the dataset to characterize the subjects' behavior based on different aspects of their writings: textual spreading, time gap, and time span. Methods: We proposed 2 different approaches based on machine learning: singleton and dual.

analysis and random forest technique, artificial intelligence, machine learning, (8 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.63)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.45)

Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting

Gautheron, Léo, Germain, Pascal, Habrard, Amaury, Morvant, Emilie, Sebban, Marc, Zantedeschi, Valentina

arXiv.org Machine Learning, Jun-14-2019

We propose a Gradient Boosting algorithm for learning an ensemble of kernel functions adapted to the task at hand. Unlike state-of-the-art Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it as a weighted sum of Random Fourier Features (RFF) and by optimizing their barycenter. This allows us to obtain a more versatile method that is easier to set up and likely to perform better. Our study builds on a recent result showing one can learn a kernel from RFF by computing the minimum of a PAC-Bayesian bound on the kernel alignment generalization loss, which is obtained efficiently from a closed-form solution. We conduct an experimental analysis to highlight the advantages of our method w.r.t. both Boosting-based and kernel-learning state-of-the-art methods.
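For readers unfamiliar with Random Fourier Features, the sketch below shows the standard RFF approximation of a Gaussian kernel only. It is not the authors' boosting or kernel-learning procedure, and the names `make_rff_map` and `gaussian_kernel`, the bandwidth `sigma`, and the feature count are assumptions for illustration.

```python
import math
import random


def make_rff_map(dim, n_features, sigma=1.0, seed=0):
    """Build a random feature map z with z(x) . z(y) ~= Gaussian kernel(x, y).

    Frequencies are drawn from N(0, 1/sigma^2) per coordinate, which is the
    spectral density of exp(-||x - y||^2 / (2 sigma^2)); phases are uniform.
    """
    rng = random.Random(seed)
    omegas = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        for _ in range(n_features)
    ]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(omegas, phases)
        ]

    return feature_map


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian (RBF) kernel, used here only as a reference value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


z = make_rff_map(dim=2, n_features=5000)
x, y = [0.3, -0.2], [0.5, 0.1]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = gaussian_kernel(x, y)
```

The inner product of the two feature vectors is a Monte Carlo estimate of the kernel value, so its accuracy improves as the number of random features grows.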

artificial intelligence, landmark, machine learning, (16 more...)

arXiv.org Machine Learning

1906.06203

Country:

Europe > France (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

On the Insufficiency of the Large Margins Theory in Explaining the Performance of Ensemble Methods

Martinez, Waldyn, Gray, J. Brian

arXiv.org Machine Learning, Jun-10-2019

Boosting and other ensemble methods combine a large number of weak classifiers through weighted voting to produce stronger predictive models. To explain the successful performance of boosting algorithms, Schapire et al. (1998) showed that AdaBoost is especially effective at increasing the margins of the training data. Schapire et al. (1998) also developed an upper bound on the generalization error of any ensemble based on the margins of the training data, from which it was concluded that larger margins should lead to lower generalization error, everything else being equal (sometimes referred to as the "large margins theory"). Tighter bounds have been derived and have reinforced the large margins theory hypothesis. For instance, Wang et al. (2011) suggest that specific margin instances, such as the equilibrium margin, can better summarize the margins distribution. These results have led many researchers to consider direct optimization of the margins to improve ensemble generalization error with mixed results. We show that the large margins theory is not sufficient for explaining the performance of voting classifiers. We do this by illustrating how it is possible to improve upon the margin distribution of an ensemble solution, while keeping the complexity fixed, yet not improve the test set performance.
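To make the notion of a margin concrete, the sketch below computes the normalized voting margin of one example under the usual binary convention, with labels and base-classifier votes in {-1, +1} and positive voting weights. The function name and the sample numbers are assumptions for illustration and do not come from the paper.

```python
def voting_margin(votes, weights, label):
    """Normalized margin of a weighted voting classifier on one example.

    votes:   base-classifier predictions, each -1 or +1
    weights: positive voting weights, one per base classifier
    label:   true label, -1 or +1

    The result lies in [-1, 1]; it is positive exactly when the weighted
    vote classifies the example correctly.
    """
    total = sum(weights)
    return label * sum(w * v for w, v in zip(weights, votes)) / total


# Three hypothetical weak classifiers voting on an example with label +1.
margin = voting_margin(votes=[1, 1, -1], weights=[0.5, 0.3, 0.2], label=1)
```

A margin near 1 means almost all of the voting weight agrees with the true label, while a margin near 0 means the ensemble is barely decided.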

artificial intelligence, machine learning, mmi 0, (20 more...)

arXiv.org Machine Learning

1906.04063

Country: North America > United States > Alabama (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)

Random Forest vs Neural Network: Which is Better, and When?

#artificialintelligence, Jun-8-2019, 05:48:13 GMT

Which is better: Random Forest or Neural Network? This is a common question with a very easy answer: it depends :). I will try to show you when it is good to use Random Forest and when to use Neural Network. First of all, Random Forest (RF) and Neural Network (NN) are different types of algorithms. An RF is an ensemble of decision trees.

artificial intelligence, machine learning, neural network, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Diabetes Prediction with Ensemble Techniques - DataScienceCentral.com

#artificialintelligence, Jun-8-2019, 00:30:00 GMT

Till your good is better and better is best! ENSEMBLE Yes, the above quote is so true. We humans have the ability to rate things and find some metric or other to measure them and evaluate their performance. Similarly, in Data Science you can measure your model’s accuracy and performance! The very first…

accuracy, listing, matrix, (10 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.31)

Beware Default Random Forest Importances

#artificialintelligence, Jun-7-2019, 15:28:26 GMT

Dependence numbers close to one indicate that the feature is completely predictable using the other features, which means it could be dropped without affecting accuracy. For example, the mean radius is extremely important in predicting mean perimeter and mean area, so we can probably drop those two. It also looks like radius error is important in predicting perimeter error and area error, so we can drop those last two. Mean and worst texture also appear to be dependent, so we can drop one of those too. Similarly, let's drop concavity error and fractal dimension error because compactness error seems to predict them well. Worst radius also predicts worst perimeter and worst area well.
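The idea of a dependence number can be sketched with a much simpler stand-in than the article's measure: the R² of a one-variable least-squares fit that predicts one feature from another. The article's dependence scores come from a model fit on all the other features, so the function below, its name, and the toy data are only assumptions for illustration.

```python
import math


def linear_r2(predictor, target):
    """R^2 of a simple least-squares line predicting `target` from `predictor`.

    For a one-variable linear fit this equals the squared Pearson
    correlation, so it is computed directly from the centered sums.
    """
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(target) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, target))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in target)
    return (sxy * sxy) / (sxx * syy)


# Toy data: perimeter is an exact function of radius, texture is unrelated.
radius = [1.0, 2.0, 3.0, 4.0, 5.0]
perimeter = [2.0 * math.pi * r for r in radius]
texture = [3.0, 1.0, 4.0, 1.0, 5.0]

r2_perimeter = linear_r2(radius, perimeter)
r2_texture = linear_r2(radius, texture)
```

Here the score for perimeter is essentially one, which is the situation in which a feature is redundant and a candidate for dropping, while the score for texture is low.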

artificial intelligence, beware default random forest importance, decision tree learning, (2 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.45)

The Random Forest Algorithm

#artificialintelligence, Jun-7-2019, 15:28:09 GMT

Random Forest is a flexible, easy-to-use machine learning algorithm that produces, even without hyper-parameter tuning, a great result most of the time. It is also one of the most used algorithms, because of its simplicity and the fact that it can be used for both classification and regression tasks. In this post, you are going to learn how the random forest algorithm works and several other important things about it. Random Forest is a supervised learning algorithm. As you can already see from its name, it creates a forest and makes it somehow random.
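The two sources of randomness can be sketched in a few lines. The toy below is a deliberately simplified illustration and not a full random forest: each "tree" is a depth-1 stump trained on a bootstrap sample and restricted to one randomly chosen feature, whereas real implementations grow deeper trees and sample features at every split. All function names and the data are assumptions for this example.

```python
import random
from collections import Counter


def fit_stump(X, y, feature):
    """Pick the threshold on one feature that maximizes training accuracy."""
    best = None
    for threshold in sorted({row[feature] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        correct = sum(lab == left_label for lab in left)
        correct += sum(lab == right_label for lab in right)
        if best is None or correct > best[0]:
            best = (correct, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)         # random feature for this stump
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data with two well-separated classes.
X = [[0, 0], [1, 1], [10, 10], [11, 11]]
y = [0, 0, 1, 1]
forest = fit_forest(X, y)
```

Individual stumps can be poor, for instance when a bootstrap sample happens to contain a single class, but the majority vote tends to average such mistakes away.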

artificial intelligence, machine learning, random forest, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Ensemble Pruning via Margin Maximization

Martinez, Waldyn

arXiv.org Machine Learning, Jun-7-2019

Ensemble models refer to methods that combine a typically large number of classifiers into a compound prediction. The output of an ensemble method is the result of fitting a base-learning algorithm to a given data set, and obtaining diverse answers by reweighting the observations or by resampling them using a given probabilistic selection. A key challenge of using ensembles in large-scale multidimensional data lies in the complexity and the computational burden associated with them. The models created by ensembles are often difficult, if not impossible, to interpret and their implementation requires more computational power than single classifiers. Recent research effort in the field has concentrated on reducing ensemble size while maintaining predictive accuracy. We propose a method to prune an ensemble solution by optimizing its margin distribution, while increasing its diversity. The proposed algorithm results in an ensemble that uses only a fraction of the original classifiers, with improved or similar generalization performance. We analyze and test our method on both synthetic and real data sets. The simulations show that the proposed method compares favorably to the original ensemble solutions and to other existing ensemble pruning methodologies.
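As a loose illustration of margin-based pruning, and explicitly not the algorithm proposed in the paper, the sketch below greedily removes one classifier at a time so that the remaining unweighted ensemble has the highest mean margin. It ignores diversity entirely, and the function names and toy votes are assumptions for this example.

```python
def mean_margin(votes, labels, members):
    """Average unweighted margin of the sub-ensemble `members`.

    votes[t][i] is classifier t's prediction (-1 or +1) on example i,
    labels[i] is the true label (-1 or +1).
    """
    total = 0.0
    for i, label in enumerate(labels):
        total += label * sum(votes[t][i] for t in members) / len(members)
    return total / len(labels)


def prune_by_margin(votes, labels, keep):
    """Greedy backward elimination down to `keep` classifiers."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    members = list(range(len(votes)))
    while len(members) > keep:
        # Remove the member whose absence leaves the highest mean margin.
        to_drop = max(
            members,
            key=lambda t: mean_margin(
                votes, labels, [m for m in members if m != t]
            ),
        )
        members.remove(to_drop)
    return members


# Toy ensemble: classifier 0 is always right, 1 is mostly right, 2 is always wrong.
labels = [1, -1, 1, -1]
votes = [
    [1, -1, 1, -1],
    [1, -1, 1, 1],
    [-1, 1, -1, 1],
]
kept_two = prune_by_margin(votes, labels, keep=2)
kept_one = prune_by_margin(votes, labels, keep=1)
```

On this toy data the always-wrong classifier is dropped first, leaving a smaller ensemble with a better margin distribution than the original.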

algorithm, classifier, ensemble, (16 more...)

arXiv.org Machine Learning

1906.03247

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Ohio > Butler County > Oxford (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
(2 more...)
