AITopics

2007.15326

Country:

Europe > United Kingdom > England (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Mexico (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
(2 more...)

arXiv.org Machine Learning Jul-28-2020

Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms

Hu, Linwei, Chen, Jie, Nair, Vijayan N., Sudjianto, Agus

Supervised Machine Learning (SML) algorithms, such as Gradient Boosting, Random Forest, and Neural Networks, have become popular in recent years due to their superior predictive performance over traditional statistical methods. However, their complexity makes the results hard to interpret without additional tools. There has been a lot of recent work in developing global and local diagnostics for interpreting SML models. In this paper, we propose a locally-interpretable model that takes the fitted ML response surface, partitions the predictor space using model-based regression trees, and fits interpretable main-effects models at each of the nodes. We adapt the algorithm to be efficient in dealing with high-dimensional predictors. While the main focus is on interpretability, the resulting surrogate model also has reasonably good predictive performance.

interaction, node, predictor, (17 more...)

2007.14528

Country:

North America > United States > California (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.72)

arXiv.org Machine Learning Jul-28-2020

Supervised Machine Learning Techniques: An Overview with Applications to Banking

Hu, Linwei, Chen, Jie, Vaughan, Joel, Yang, Hanyu, Wang, Kelly, Sudjianto, Agus, Nair, Vijayan N.

This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural Networks (NNs). We begin with an introduction to ML tasks and techniques. This is followed by a description of: i) tree-based ensemble algorithms including Bagging with RF and Boosting with GBMs, ii) Feedforward NNs, iii) a discussion of hyper-parameter optimization techniques, and iv) machine learning interpretability. The paper concludes with a comparison of the features of different ML algorithms. Examples taken from credit risk modeling in banking are used throughout the paper to illustrate the techniques and interpret the results of the algorithms.

algorithm, artificial intelligence, machine learning, (20 more...)

2008.04059

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.47)

Industry: Banking & Finance > Credit (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

#artificialintelligence Jul-25-2020, 10:36:13 GMT

[D] What's the best deep-dive explanation of XGBoost hyperparameters out there?

I'm not a total newbie, so I can do without the "how to get started with xgboost" articles, of which there are plenty. I remember bumping into a site or blog with a great, comprehensive summary of each hyperparameter, but I lost the link and can't find it now through search. As far as I remember, it had a hyperparameter menu on the left, probably covered all boosted-tree methods and their hyperparameters, and was created by a woman. Can anybody recall that source?

artificial intelligence, hyperparameter, machine learning, (2 more...)

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)

Delcaillau, Dimitri, Ly, Antoine, Vermet, Franck, Papp, Alizé

Interprétabilité des modèles : état des lieux des méthodes et application à l'assurance (Model interpretability: a survey of methods and application to insurance)

arXiv.org Machine Learning Jul-25-2020

Since May 2018, the General Data Protection Regulation (GDPR) has introduced new obligations for industry. By setting a legal framework, it notably imposes strong transparency on the use of personal data: people must be informed of how their data is used and must consent to that use. Data is the raw material of many models that today make it possible to increase the quality and performance of digital services. Transparency on the use of data also requires a good understanding of how it is used by different models. The use of models, even efficient ones, must be accompanied by an understanding at all levels of the process that transforms data (upstream and downstream of a model), thus making it possible to define the relationships between an individual's data and the choice that an algorithm could make based on the analysis of that data (for example, the recommendation of a product or promotional offer, or an insurance rate representative of the risk). Model users must ensure that models do not discriminate and that their results can be explained. The widening range of predictive algorithms, made possible by the growth of computing capacity, leads scientists to be vigilant about the use of models and to consider new tools to better understand the decisions derived from them. Recently, the community has been particularly active on model transparency, with a marked intensification of publications over the past three years. The increasingly frequent use of more complex algorithms (deep learning, XGBoost, etc.) with attractive performance is undoubtedly one of the causes of this interest. This article thus presents an inventory of model interpretation methods and their uses in an insurance context.

artificial intelligence, machine learning, valeur, (18 more...)

2007.12919

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York (0.04)
Europe > France > Brittany > Finistère > Brest (0.04)

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)

#artificialintelligence Jul-24-2020, 01:45:45 GMT

What is Bootstrap Sampling in Machine Learning and Why is it Important?

The Bootstrap Sampling Method is a very simple concept and is a building block for some of the more advanced machine learning algorithms like AdaBoost and XGBoost. However, when I started my data science journey, I couldn't quite understand the point of it. So my goals are to explain what the bootstrap method is and why it's important to know! Technically speaking, the bootstrap sampling method is a resampling method that uses random sampling with replacement. Don't worry if that sounded confusing, let me explain it with a diagram: Suppose you have an initial sample with 3 observations.
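The resampling step described above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch: the function name `bootstrap_sample`, the three observation values, the seed, and the choice of 1000 resamples are assumptions made for the example, not taken from the article.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) observations from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in data]


rng = random.Random(0)  # fixed seed, only so that reruns give the same draws
initial_sample = [2.0, 4.0, 9.0]  # hypothetical initial sample with 3 observations

# Each resample has the same size as the original sample; because sampling is
# done with replacement, a value may appear several times or not at all.
resamples = [bootstrap_sample(initial_sample, rng) for _ in range(1000)]

# A statistic (here the mean) is recomputed on every resample; the spread of
# these values gives an estimate of how variable the statistic is.
bootstrap_means = [sum(s) / len(s) for s in resamples]
print(min(bootstrap_means), max(bootstrap_means))
```

Every resample is built only from the original observations, so the resampled means always lie between the smallest and largest original value.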

artificial intelligence, bootstrap, machine learning, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.57)

#artificialintelligence Jul-23-2020, 10:11:15 GMT

Classification with Random Forests in Python

The random forests algorithm is a machine learning method that can be used for supervised learning tasks such as classification and regression. The algorithm works by constructing a set of decision trees, each trained on a random subset of the data and features. In the case of classification, the output of a random forest model is the mode of the predicted classes across the decision trees. In this post, we will discuss how to build random forest models for classification tasks in Python.
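The aggregation rule mentioned above, taking the mode of the predicted classes across the trees, can be illustrated without any library. The sketch below assumes the per-tree predictions are already available; the helper name `majority_vote` and the toy labels are hypothetical, and the code covers only the voting step, not tree training.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return, for each sample, the most common class among the trees' predictions.

    tree_predictions[t][i] is the class predicted by tree t for sample i.
    """
    n_samples = len(tree_predictions[0])
    votes = []
    for i in range(n_samples):
        counts = Counter(preds[i] for preds in tree_predictions)
        votes.append(counts.most_common(1)[0][0])
    return votes


# Hypothetical predictions from three trees for three samples.
tree_predictions = [
    ["spam", "ham", "ham"],
    ["spam", "spam", "ham"],
    ["ham", "spam", "ham"],
]
forest_prediction = majority_vote(tree_predictions)
print(forest_prediction)  # ['spam', 'spam', 'ham']
```

Using an odd number of trees for a two-class problem avoids tied votes; with ties, this sketch simply returns whichever tied class was counted first.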

artificial intelligence, machine learning, random forest model, (11 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence Jul-21-2020, 11:30:28 GMT

A complete explanation of Random Forest Algorithm.

Ensemble learning is a technique that combines several algorithms, of the same or of different types, to form a more powerful regression or classification model. The random forest algorithm combines multiple decision trees into a single model. Because of its diversity and simplicity, it is one of the most widely used algorithms. It is used for both classification and regression problems.

decision tree learning, machine learning, random forest algorithm, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.83)

#artificialintelligence Jul-21-2020, 06:20:47 GMT

How to Develop a Bagging Ensemble with Python

Bagging is an ensemble machine learning algorithm that combines the predictions from many decision trees. It is also easy to implement, given that it has few key hyperparameters and sensible heuristics for configuring them. Bagging performs well in general and provides the basis for a whole family of decision tree ensemble algorithms, such as the popular random forest and extra trees algorithms, as well as the lesser-known Pasting, Random Subspaces, and Random Patches algorithms. In this tutorial, you will discover how to develop Bagging ensembles for classification and regression. Bootstrap Aggregation, or Bagging for short, is specifically an ensemble of decision tree models, although the technique can also be used to combine the predictions of other types of models.
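As a rough illustration of the idea, the sketch below bags one-split decision trees (stumps) on a toy one-dimensional dataset using only the standard library. It is not the tutorial's code: the helper names `fit_stump`, `fit_bagging` and `predict_bagging`, the data, the seed, and the choice of 25 estimators are all assumptions made for the example.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split tree on 1-D inputs by minimising training errors."""
    best = None
    for threshold in sorted(set(X)):
        # Try both label orientations for the two sides of the split.
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if x <= threshold else right) != label
                for x, label in zip(X, y)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def fit_bagging(X, y, n_estimators, rng):
    """Train each stump on its own bootstrap sample of the training data."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Aggregate the stumps' predictions by majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed for repeatability
X = [0.1, 0.4, 0.5, 0.9, 1.6, 1.8, 2.2, 2.5]  # hypothetical feature values
y = [0, 0, 0, 0, 1, 1, 1, 1]                  # hypothetical class labels
models = fit_bagging(X, y, n_estimators=25, rng=rng)
predictions = [predict_bagging(models, x) for x in (0.0, 3.0)]
print(predictions)
```

For regression, the same structure applies with the vote replaced by an average of the individual models' predictions.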

artificial intelligence, ensemble, machine learning, (16 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Bernard, Simon, Cao, Hongliu, Sabourin, Robert, Heutte, Laurent

Random Forest for Dissimilarity-based Multi-view Learning

arXiv.org Machine Learning Jul-16-2020

Many classification problems are naturally multi-view in the sense that their data are described through multiple heterogeneous descriptions. For such tasks, dissimilarity strategies are effective ways to make the different descriptions comparable and to easily merge them, by (i) building intermediate dissimilarity representations for each view and (ii) fusing these representations by averaging the dissimilarities over the views. In this work, we show that the Random Forest proximity measure can be used to build the dissimilarity representations, since this measure reflects similarities between features but also class membership. We then propose a Dynamic View Selection method to better combine the view-specific dissimilarity representations. This allows a decision to be made, for each instance to predict, using only the most relevant views for that instance. Experiments are conducted on several real-world multi-view datasets, and show that the Dynamic View Selection offers a significant improvement in performance compared to the simple average combination and two state-of-the-art static view combinations.
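Steps (i) and (ii) above can be sketched as follows. The sketch assumes the usual definition of Random Forest proximity (the fraction of trees in which two instances reach the same leaf) and takes the dissimilarity as one minus that proximity, which is a common choice but not necessarily the exact form used in the paper. The leaf indices are made-up inputs, all function names are hypothetical, and the Dynamic View Selection step is not covered.

```python
def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two instances fall in the same leaf."""
    same = sum(a == b for a, b in zip(leaves_a, leaves_b))
    return same / len(leaves_a)


def dissimilarity_matrix(leaf_ids):
    """Build a per-view dissimilarity matrix, with d(i, j) = 1 - proximity(i, j).

    leaf_ids[i][t] is the leaf reached by instance i in tree t of that view's forest.
    """
    n = len(leaf_ids)
    return [
        [1.0 - proximity(leaf_ids[i], leaf_ids[j]) for j in range(n)]
        for i in range(n)
    ]


def average_views(matrices):
    """Fuse view-specific matrices by averaging them entry by entry."""
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical leaf indices for 3 instances in two views, 4 trees per forest.
view_1 = [[3, 1, 4, 2], [3, 1, 5, 2], [7, 6, 5, 8]]
view_2 = [[1, 1, 2, 2], [1, 3, 2, 2], [4, 3, 5, 2]]

joint = average_views([dissimilarity_matrix(view_1), dissimilarity_matrix(view_2)])
print(joint[0][1], joint[0][2], joint[1][2])  # 0.25 0.875 0.625
```

The fused matrix can then be handed to any classifier that accepts precomputed dissimilarities.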

classifier, dissimilarity, matrix, (14 more...)

doi: 10.1142/9789811211072_0007

2007.08377

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.05)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.62)