AITopics | Ensemble Learning

Ensemble Learning

Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. (Wikipedia)
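As a rough illustration of the definition above, the sketch below combines three base classifiers by majority vote. The three models and the inputs are hypothetical and chosen so that each model errs on a different input. A vote only helps when the base models' errors do not coincide, which real ensembles cannot guarantee.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical base classifiers for "is x positive?".
# Each one is deliberately wrong on exactly one input.
def model_a(x):
    return x > 0 if x != 1 else False   # wrong at x == 1

def model_b(x):
    return x > 0 if x != 2 else False   # wrong at x == 2

def model_c(x):
    return x > 0 if x != -1 else True   # wrong at x == -1

def ensemble(x):
    return majority_vote([m(x) for m in (model_a, model_b, model_c)])

inputs = [-2, -1, 1, 2, 3]
truth = [x > 0 for x in inputs]
ensemble_preds = [ensemble(x) for x in inputs]
```

On these inputs every base model makes one mistake, while the vote matches `truth` everywhere because no two models are wrong on the same input.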

Boosting and Bagging: How To Develop A Robust Machine Learning Algorithm

#artificialintelligence Jul-13-2020, 09:56:34 GMT

Machine learning and data science require more than just throwing data into a Python library and using whatever comes out. Data scientists need to understand the data and the processes behind it to implement a successful system. One key part of implementation is knowing when a model might benefit from bootstrapping methods. Models built this way are called ensemble models. Some examples of ensemble models are AdaBoost and Stochastic Gradient Boosting.
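The bagging (bootstrap aggregating) idea named in the article title can be sketched in a few lines. This is a minimal illustration, not the article's code. The weak learner `fit_mean_threshold`, the toy data, and the defaults (`n_models=25`, `seed=0`) are all assumptions made for the example.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]

def fit_mean_threshold(sample):
    """Toy weak learner: threshold halfway between the two class means.

    Assumes class 1 tends to have larger x than class 0. If the bootstrap
    sample happens to contain only one class, always predict that class.
    """
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    if not xs0 or not xs1:
        only = 1 if xs1 else 0
        return lambda x: only
    threshold = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x > threshold else 0

def fit_bagged(data, n_models=25, seed=0):
    """Train one weak learner per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_mean_threshold(bootstrap_sample(data, rng)) for _ in range(n_models)]

def predict_bagged(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy one-dimensional data: class 0 at x = 0..4, class 1 at x = 6..10.
data = [(x, 0) for x in range(5)] + [(x, 1) for x in range(6, 11)]
models = fit_bagged(data)
```

Calling `predict_bagged(models, x)` returns the label most of the resampled learners agree on. Boosting differs in that later learners are trained to correct earlier ones rather than being trained independently.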

algorithm, artificial intelligence, machine learning, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Machine Learning Cheat Sheet (for scikit-learn)

#artificialintelligence Jul-13-2020, 07:48:10 GMT

As you hopefully have heard, we at scikit-learn are doing a user survey (which is still open, by the way). One of the requests there was to provide some sort of flow chart on how to do machine learning. As this is clearly impossible, I went to work straight away. This is the result: [edit2] clarification: By ensemble classifiers and ensemble regressors I mean random forests, extremely randomized trees, gradient boosted trees, and the soon-to-come weight boosted trees (AdaBoost). More seriously: this is actually my workflow / train of thought whenever I try to solve a new problem.

artificial intelligence, machine learning cheat sheet

Country: North America > United States (0.07)

Genre: Workflow (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.60)

Feature Interactions in XGBoost

Goyal, Kshitij, Dumancic, Sebastijan, Blockeel, Hendrik

arXiv.org Machine Learning Jul-11-2020

In this paper, we investigate how feature interactions can be identified and used as constraints in gradient boosted tree models, using XGBoost's implementation. Our results show that accurate identification of these constraints can significantly improve the performance of the baseline XGBoost model. Further, the improvement in the model structure can also lead to better interpretability.
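To make the notion of an interaction constraint concrete, the sketch below checks whether the features used along one root-to-leaf path of a tree respect a set of constraint groups. It is an illustration only, not the paper's identification method. The helper `path_allowed` is hypothetical, and it assumes the groups are disjoint and that every feature belongs to one of them. XGBoost itself accepts groups in this nested-list form through its `interaction_constraints` training parameter.

```python
def path_allowed(path_features, groups):
    """True if every feature split on along one path lies in a single group."""
    used = set(path_features)
    return any(used <= set(group) for group in groups)

# Hypothetical constraint groups over feature indices:
# features 0 and 1 may interact, and features 2, 3 and 4 may interact.
groups = [[0, 1], [2, 3, 4]]

same_group_path = path_allowed([0, 1, 0], groups)   # splits on 0, then 1, then 0
cross_group_path = path_allowed([1, 2], groups)     # mixes the two groups
```

Under these assumptions a path that splits on features 0 and 1 is permitted, while a path that splits on features 1 and 2 is not, because the two features never appear in the same group.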

artificial intelligence, interaction, machine learning, (17 more...)

arXiv:2007.05758

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)

Genre: Research Report > New Finding (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

Ensemble Machine Learning in Python: Random Forest, AdaBoost (Udemy)

#artificialintelligence Jul-9-2020, 12:07:23 GMT

Ensemble Machine Learning in Python: Random Forest, AdaBoost. Rated 4.6 (1,193 ratings). In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever.

artificial intelligence, machine learning, reinforcement learning, (5 more...)

Industry:

Information Technology (1.00)
Leisure & Entertainment > Games (0.61)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.64)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.61)
(2 more...)

StructureBoost: Efficient Gradient Boosting for Structured Categorical Variables

Lucena, Brian

arXiv.org Machine Learning Jul-8-2020

Gradient boosting methods based on Structured Categorical Decision Trees (SCDT) have been demonstrated to outperform numerical and one-hot-encodings on problems where the categorical variable has a known underlying structure. However, the enumeration procedure in the SCDT is infeasible except for categorical variables with low or moderate cardinality. We propose and implement two methods to overcome the computational obstacles and efficiently perform Gradient Boosting on complex structured categorical variables. The resulting package, called StructureBoost, is shown to outperform established packages such as CatBoost and LightGBM on problems with categorical predictors that contain sophisticated structure. Moreover, we demonstrate that StructureBoost can make accurate predictions on unseen categorical values due to its knowledge of the underlying structure.

allowable split, artificial intelligence, machine learning, (18 more...)

arXiv:2007.04446

Country:

North America > United States > California > Riverside County (0.14)
North America > United States > Oregon (0.04)
North America > United States > District of Columbia (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance > Insurance (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)

A Novel Random Forest Dissimilarity Measure for Multi-View Learning

Cao, Hongliu, Bernard, Simon, Sabourin, Robert, Heutte, Laurent

arXiv.org Machine Learning Jul-6-2020

Multi-view learning is a learning task in which data is described by several concurrent representations. Its main challenge is most often to exploit the complementarities between these representations to help solve a classification/regression task. This is a challenge that can be met nowadays if there is a large amount of data available for learning. However, this is not necessarily true for all real-world problems, where data are sometimes scarce (e.g. problems related to the medical environment). In these situations, an effective strategy is to use intermediate representations based on the dissimilarities between instances. This work presents new ways of constructing these dissimilarity representations, learning them from data with Random Forest classifiers. More precisely, two methods are proposed, which modify the Random Forest proximity measure, to adapt it to the context of High Dimension Low Sample Size (HDLSS) multi-view classification problems. The second method, based on an Instance Hardness measurement, is significantly more accurate than other state-of-the-art measurements including the original RF Proximity measurement and the Large Margin Nearest Neighbor (LMNN) metric learning measurement.
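For context, the classic Random Forest proximity that the abstract takes as its starting point is the fraction of trees in which two instances land in the same leaf. The sketch below computes that baseline measure and the corresponding dissimilarity from a table of leaf assignments. It does not implement the paper's two modified measures, and the `leaf_ids` values are hypothetical. With scikit-learn, leaf indices of this kind can be obtained from a fitted forest's `apply` method, which returns them with instances as rows, so they would need transposing.

```python
def rf_proximity(leaf_ids):
    """leaf_ids[t][i] is the leaf that tree t assigns to instance i.

    Returns an n x n matrix whose (i, j) entry is the fraction of trees
    in which instances i and j fall into the same leaf.
    """
    n_trees = len(leaf_ids)
    n = len(leaf_ids[0])
    return [
        [sum(tree[i] == tree[j] for tree in leaf_ids) / n_trees for j in range(n)]
        for i in range(n)
    ]

def rf_dissimilarity(leaf_ids):
    """Dissimilarity taken as one minus the proximity."""
    return [[1.0 - p for p in row] for row in rf_proximity(leaf_ids)]

# Hypothetical leaf assignments: 4 trees (rows), 3 instances (columns).
leaf_ids = [
    [0, 0, 1],
    [2, 2, 2],
    [1, 0, 0],
    [3, 3, 4],
]
prox = rf_proximity(leaf_ids)
dissim = rf_dissimilarity(leaf_ids)
```

Here instances 0 and 1 share a leaf in three of the four trees, so their proximity is 0.75 and their dissimilarity is 0.25.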

artificial intelligence, machine learning, representation, (18 more...)

arXiv:2007.02572

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.83)

Gradient Boosting Hyperparameter Tuning: Classifier Example

#artificialintelligence Jul-5-2020, 23:50:49 GMT

Many machine learning algorithms end up producing a weak model. You try other algorithms and still get a weak model. If I told you there is a method that turns weak models into a strong one, would you believe it? At first you might not, but after reading the entire post you will learn how to convert weak models into a strong model using boosting, and how to tune the Gradient Boosting hyperparameters. Boosting is an ensemble method that aggregates weak models into a stronger one.
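The sketch below shows, in plain Python, how weak models can be added one at a time and how two common hyperparameters can be tuned with a small grid search. It is an assumption-laden illustration rather than the post's code: it uses regression with squared loss instead of a classifier, one-feature regression stumps as the weak models, toy data, and a hand-picked grid. The names `n_estimators` and `learning_rate` mirror the parameters of scikit-learn's gradient boosting estimators.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def fit_gbm(xs, ys, n_estimators, learning_rate):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)

def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data: y = x^2 on a grid; even-indexed points train, odd-indexed points validate.
points = [(x / 10, (x / 10) ** 2) for x in range(21)]
train, valid = points[0::2], points[1::2]
train_x, train_y = zip(*train)
valid_x, valid_y = zip(*valid)

# Hypothetical grid over the number of boosting rounds and the learning rate.
grid = [(n, lr) for n in (5, 20, 80) for lr in (0.05, 0.3, 1.0)]
scores = {
    (n, lr): mse(fit_gbm(train_x, train_y, n, lr), valid_x, valid_y)
    for n, lr in grid
}
best_params = min(scores, key=scores.get)
```

`best_params` holds the grid point with the lowest validation error. Scoring on held-out points rather than the training points is what keeps the search from simply rewarding the largest number of rounds.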

artificial intelligence, machine learning, weak model, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)

Battle of the Boosters

#artificialintelligence Jul-5-2020, 08:00:56 GMT

We have come a long way in the world of Gradient Boosting. If you have followed the whole series, you should have a much better understanding of the theory and practical aspects of the major algorithms in this space. After a grim walk through the math and theory behind these algorithms, I thought it would be a fun change to see all of them in action in a highly practical blog post. I have chosen a few datasets for regression from Kaggle Datasets, mainly because they are easy to set up and run in Google Colab. Another reason is that I do not need to spend a lot of time on data preprocessing; instead, I can pick one of the public kernels and get cracking.

algorithm, artificial intelligence, machine learning, (14 more...)

Technology:

Information Technology > Data Science (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.43)

A Novel Multi-Step Finite-State Automaton for Arbitrarily Deterministic Tsetlin Machine Learning

Abeyrathna, K. Darshana, Granmo, Ole-Christoffer, Shafik, Rishad, Yakovlev, Alex, Wheeldon, Adrian, Lei, Jie, Goodwin, Morten

arXiv.org Artificial Intelligence Jul-4-2020

Due to the high energy consumption and scalability challenges of deep learning, there is a critical need to shift research focus towards dealing with energy consumption constraints. Tsetlin Machines (TMs) are a recent approach to machine learning that has demonstrated significantly reduced energy usage compared to neural networks, while performing competitively accuracy-wise on several benchmarks. However, TMs rely heavily on energy-costly random number generation to stochastically guide a team of Tsetlin Automata to a Nash Equilibrium of the TM game. In this paper, we propose a novel finite-state learning automaton that can replace the Tsetlin Automata in TM learning, for increased determinism. The new automaton uses multi-step deterministic state jumps to reinforce sub-patterns. Simultaneously, flipping a coin to skip every $d$'th state update ensures diversification by randomization. The $d$-parameter thus allows the degree of randomization to be finely controlled. E.g., $d=1$ makes every update random and $d=\infty$ makes the automaton completely deterministic. Our empirical results show that, overall, only substantial degrees of determinism reduce accuracy. Energy-wise, random number generation constitutes switching energy consumption of the TM, saving up to 11 mW power for larger datasets with high $d$ values. We can thus use the new $d$-parameter to trade off accuracy against energy consumption, to facilitate low-energy machine learning.
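The role of the $d$-parameter can be sketched as a simple schedule: every $d$'th update is decided by a coin flip and all other updates are applied deterministically. The code below illustrates only that one sentence of the abstract. It does not model the multi-step state jumps or the Tsetlin Machine itself, and the function name and its arguments are hypothetical.

```python
import math
import random

def count_applied_updates(n_updates, d, rng):
    """Count state updates that go through when every d'th one is a coin flip.

    d is a positive integer, or math.inf for a fully deterministic schedule.
    """
    applied = 0
    for k in range(1, n_updates + 1):
        is_dth = d != math.inf and k % d == 0
        if is_dth and rng.random() < 0.5:
            continue  # the coin flip says to skip this update
        applied += 1
    return applied

rng = random.Random(0)
deterministic = count_applied_updates(1000, math.inf, rng)  # no coin flips at all
partly_random = count_applied_updates(1000, 4, rng)         # every 4th update is random
fully_random = count_applied_updates(1000, 1, rng)          # every update is random
```

With `d = math.inf` no random numbers are drawn, which is the energy-saving end of the trade-off the abstract describes. With `d = 1` a random number is drawn for every update.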

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv:2007.02114

Country: Europe > Norway (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area (0.52)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)

Secure Collaborative XGBoost on Encrypted Data

#artificialintelligence Jul-3-2020, 15:41:36 GMT

Training a machine learning model requires a large quantity of high-quality data. One way to achieve this is to combine data from many different data organizations or data owners. But data owners are often unwilling to share their data with each other due to privacy concerns, which can stem from business competition, or be a matter of regulatory compliance. The question is: how can we mitigate such privacy concerns? Secure collaborative learning enables many data owners to build robust models on their collective data, but without revealing their data to each other.

artificial intelligence, machine learning, xgboost, (16 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.62)
