AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Singh, Sameer, Ribeiro, Marco Tulio, Guestrin, Carlos

Programs as Black-Box Explanations

arXiv.org Machine Learning, Nov-22-2016

Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility. However, it is not clear which kind of explanation, such as linear models, decision trees, or rule lists, is the appropriate family to consider, and different tasks and models may benefit from different kinds of explanations. Instead of picking a single family of representations, in this work we propose to use "programs" as model-agnostic explanations. We show that small programs can be expressive yet intuitive as explanations, and that they generalize over a number of existing interpretable families. We propose a prototype program induction method based on simulated annealing that approximates the local behavior of black-box classifiers around a specific prediction using random perturbations. Finally, we present a preliminary application on small datasets and show that the generated explanations are intuitive and accurate for a number of classifiers.
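The idea of fitting a small program to a black box around one prediction can be illustrated with a toy sketch. This is not the authors' induction method: the classifier `black_box`, the one-rule program space `(feature, threshold)`, the Gaussian perturbation scale, and the linear cooling schedule are all assumptions chosen to keep the example short.

```python
import math
import random

def black_box(x):
    """Hypothetical stand-in for an opaque classifier."""
    return 1 if x[0] + 0.1 * x[1] > 5.0 else 0

def run_program(program, x):
    """A 'program' here is one rule: if x[feature] > threshold then 1 else 0."""
    feature, threshold = program
    return 1 if x[feature] > threshold else 0

def fidelity(program, samples, labels):
    """Fraction of perturbed samples on which the program agrees with the black box."""
    hits = sum(run_program(program, x) == y for x, y in zip(samples, labels))
    return hits / len(samples)

def mutate(program, n_features, rng):
    """Propose a neighbor: occasionally switch feature, otherwise nudge the threshold."""
    feature, threshold = program
    if rng.random() < 0.2:
        feature = rng.randrange(n_features)
    else:
        threshold += rng.gauss(0.0, 0.5)
    return (feature, threshold)

def explain(instance, n_samples=200, steps=500, seed=0):
    rng = random.Random(seed)
    # Random perturbations around the instance being explained (assumed Gaussian).
    samples = [[v + rng.gauss(0.0, 1.0) for v in instance] for _ in range(n_samples)]
    labels = [black_box(x) for x in samples]

    current = (0, instance[0])
    current_score = fidelity(current, samples, labels)
    initial_score = current_score
    best, best_score = current, current_score

    for step in range(steps):
        temperature = max(1e-3, 1.0 - step / steps)  # assumed linear cooling
        candidate = mutate(current, len(instance), rng)
        score = fidelity(candidate, samples, labels)
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature drops.
        accept = score >= current_score or \
            rng.random() < math.exp((score - current_score) / temperature)
        if accept:
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = current, current_score
    return best, best_score, initial_score

program, best_score, initial_score = explain([5.0, 0.0])
print("rule: x[%d] > %.2f, local fidelity %.2f" % (program[0], program[1], best_score))
```

The returned rule is only a local approximation: its fidelity is measured on perturbations near the chosen instance, not on the whole input space.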

artificial intelligence, explanation, machine learning, (17 more...)

1611.07579

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.50)

Industry: Transportation > Air (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Goix, Nicolas, Drougard, Nicolas, Brault, Romain, Chiapino, Maël

One Class Splitting Criteria for Random Forests

arXiv.org Machine Learning, Nov-21-2016

Random Forests (RFs) are strong machine learning tools for classification and regression. However, they remain supervised algorithms, and no extension of RFs to the one-class setting has been proposed, except for techniques based on second-class sampling. This work fills this gap by proposing a natural methodology to extend standard splitting criteria to the one-class setting, structurally generalizing RFs to one-class classification. An extensive benchmark of seven state-of-the-art anomaly detection algorithms is also presented. This empirically demonstrates the relevance of our approach.

artificial intelligence, data mining, machine learning, (16 more...)

1611.01971

Country: Europe > France (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.62)

Tan, Hui Fen, Hooker, Giles, Wells, Martin T.

Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable

arXiv.org Machine Learning, Nov-21-2016

Ensembles of decision trees have good prediction accuracy but suffer from a lack of interpretability. We propose a new approach for interpreting tree ensembles by finding prototypes in tree space. Demonstrating the method on random forests, we show that it benefits from two unique aspects of tree ensembles: it leverages tree structure to find prototypes sequentially, and it uses the similarity measure that the ensemble learns naturally. The method provides good prediction accuracy when the prototypes it finds are used in nearest-prototype classifiers, while using fewer prototypes than competitor methods. We are investigating the sensitivity of the method to different prototype-finding procedures and demonstrating it on higher-dimensional data.
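One common way to read a similarity measure off a tree ensemble is the random forest proximity: the fraction of trees in which two samples fall into the same leaf. The sketch below uses that notion to pick a medoid as a prototype. It is an illustration under assumptions, not the paper's sequential procedure; the `leaf_ids` table is made-up data standing in for the leaf assignments a fitted forest would produce.

```python
def proximity(leaf_ids, i, j):
    """Fraction of trees in which samples i and j land in the same leaf."""
    same = sum(1 for tree in leaf_ids if tree[i] == tree[j])
    return same / len(leaf_ids)

def medoid(leaf_ids, members):
    """Sample in `members` with the highest total proximity to the others."""
    return max(members,
               key=lambda i: sum(proximity(leaf_ids, i, j) for j in members))

# Hypothetical leaf assignments: leaf_ids[t][i] is the leaf that
# sample i reaches in tree t (3 trees, 5 samples).
leaf_ids = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 2],
]

class_members = [0, 1, 2]  # assumed indices of the samples in one class
prototype = medoid(leaf_ids, class_members)
print("prototype sample index:", prototype)
```

A nearest-prototype classifier would then assign a new sample the class of the prototype it is most proximate to.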

artificial intelligence, data mining, machine learning, (13 more...)

1611.07115

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.99)

#artificialintelligence, Nov-19-2016, 11:45:19 GMT

Decision Trees Question

artificial intelligence, machine learning, social media, (12 more...)

Technology:

Information Technology > Communications > Social Media (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.44)

#artificialintelligence, Nov-18-2016, 19:00:24 GMT

How to Implement Random Forest From Scratch in Python - Machine Learning Mastery

Decision trees can suffer from high variance, which makes their results fragile to the specific training data used. Building multiple models from samples of your training data, called bagging, can reduce this variance, but the resulting trees are highly correlated. Random Forest is an extension of bagging that, in addition to building trees on multiple samples of your training data, also constrains the features that can be used to build each tree, forcing the trees to be different. This, in turn, can give a lift in performance. In this tutorial, you will discover how to implement the Random Forest algorithm from scratch in Python.
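The two ingredients described above, bootstrap samples and a restricted feature set, can be sketched in a few lines. This is a simplified illustration and not the tutorial's code: each "tree" is a single-split stump, the feature subset is drawn once per stump, and the function names and toy data are assumptions made for the example.

```python
import random

def bootstrap_sample(rows, rng):
    """Bagging step: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def gini(groups, classes):
    """Weighted Gini impurity of a candidate split (lower is better)."""
    total = sum(len(g) for g in groups)
    score = 0.0
    for group in groups:
        if not group:
            continue
        labels = [row[-1] for row in group]
        impurity = 1.0 - sum((labels.count(c) / len(group)) ** 2 for c in classes)
        score += impurity * len(group) / total
    return score

def majority(group, fallback):
    """Most common label in a group, or `fallback` if the group is empty."""
    labels = [row[-1] for row in group]
    return max(sorted(set(labels)), key=labels.count) if labels else fallback

def fit_stump(rows, n_features, rng):
    """Best single split, searching only a random subset of the features."""
    classes = sorted({row[-1] for row in rows})
    features = rng.sample(range(len(rows[0]) - 1), n_features)
    best = None
    for f in features:
        for pivot in rows:
            left = [r for r in rows if r[f] < pivot[f]]
            right = [r for r in rows if r[f] >= pivot[f]]
            score = gini([left, right], classes)
            if best is None or score < best[0]:
                best = (score, f, pivot[f], left, right)
    _, f, threshold, left, right = best
    overall = majority(rows, classes[0])
    return (f, threshold, majority(left, overall), majority(right, overall))

def fit_forest(rows, n_trees=15, n_features=1, seed=1):
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), n_features, rng)
            for _ in range(n_trees)]

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] < thr else right for f, thr, left, right in forest]
    return max(sorted(set(votes)), key=votes.count)

# Toy data (assumed): two features, class label in the last column.
rows = [
    [1.0, 1.2, 0], [1.5, 0.8, 0], [2.0, 1.9, 0], [1.2, 1.5, 0],
    [6.0, 6.5, 1], [7.0, 5.8, 1], [6.5, 7.1, 1], [7.5, 6.2, 1],
]
forest = fit_forest(rows)
print(predict(forest, [0.5, 0.5]), predict(forest, [9.0, 9.0]))
```

Because each stump sees a different bootstrap sample and a different feature subset, the stumps disagree more than plain bagged stumps would, which is the decorrelation effect the excerpt describes.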

artificial intelligence, decision tree learning, machine learning, (14 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Vandewiele, Gilles, Janssens, Olivier, Ongenae, Femke, De Turck, Filip, Van Hoecke, Sofie

GENESIM: genetic extraction of a single, interpretable model

arXiv.org Machine Learning, Nov-17-2016

Models obtained by decision tree induction techniques excel in being interpretable. However, they can be prone to overfitting, which results in a low predictive performance. Ensemble techniques are able to achieve a higher accuracy. However, this comes at the cost of losing interpretability of the resulting model. This makes ensemble techniques impractical in applications where decision support, instead of decision making, is crucial. To bridge this gap, we present the GENESIM algorithm, which uses a genetic algorithm to transform an ensemble of decision trees into a single decision tree with an enhanced predictive performance. We compared GENESIM to prevalent decision tree induction and ensemble techniques using twelve publicly available data sets. The results show that GENESIM achieves a better predictive performance on most of these data sets than decision tree induction techniques and a predictive performance in the same order of magnitude as the ensemble techniques. Moreover, the resulting model of GENESIM has a very low complexity, making it very interpretable, in contrast to ensemble techniques.

algorithm, artificial intelligence, machine learning, (18 more...)

1611.05722

Country: Europe > Spain (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence, Nov-16-2016, 13:43:00 GMT

The 7 Best Data Science and Machine Learning Podcasts – The Startup

Data science and machine learning have long been interests of mine, but now that I'm working on Fuzzy.io I need to keep on top of all the news in both fields. My preferred way to do this is by listening to podcasts. I've listened to a bunch of machine learning and data science podcasts in the last few months, so I thought I'd share my favorites: Every other week, they release a 10–15 minute episode where hosts Kyle and Linda Polich give a short primer on topics like k-means clustering, natural language processing and decision tree learning, often using analogies related to their pet parrot, Yoshi. This is the only place where you'll learn about k-means clustering via placement of parrot droppings.

artificial intelligence, decision tree learning, machine learning, (8 more...)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

#artificialintelligence, Nov-14-2016, 00:06:10 GMT

How to Implement Random Forest From Scratch in Python - Machine Learning Mastery

artificial intelligence, decision tree learning, machine learning, (14 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Galili, Tal, Meilijson, Isaac

Splitting matters: how monotone transformation of predictor variables may improve the predictions of decision tree models

arXiv.org Machine Learning, Nov-14-2016

It is widely believed that the prediction accuracy of decision tree models is invariant under any strictly monotone transformation of the individual predictor variables. However, this statement may be false when predicting new observations with values that were not seen in the training set and are close to the location of the split point of a tree rule. The sensitivity of the prediction error to the split point interpolation is high when the split point of the tree is estimated based on very few observations, reaching 9% misclassification error when only 10 observations are used for constructing a split, and shrinking to 1% when relying on 100 observations. This study compares the performance of alternative methods for split point interpolation and concludes that the best choice is taking the mid-point between the two closest points to the split point of the tree. Furthermore, if the (continuous) distribution of the predictor variable is known, then using its probability integral for transforming the variable ("quantile transformation") will reduce the model's interpolation error by up to about a half on average. Accordingly, this study provides guidelines for both developers and users of decision tree models (including bagging and random forest).
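A small worked example shows why the invariance can fail for unseen values. Suppose the two training points closest to a split are 1 and 100 (made-up numbers), and the split point is interpolated as their mid-point. The mid-point taken on the raw scale and the mid-point taken after a log transformation correspond to different thresholds, so a new observation between them is routed differently. The helper name `midpoint_split` is an assumption for this sketch.

```python
import math

def midpoint_split(left_max, right_min, transform=lambda v: v):
    """Mid-point between the two training values closest to the split,
    computed on the (possibly transformed) scale."""
    return (transform(left_max) + transform(right_min)) / 2.0

# Closest training values on either side of the split (assumed data).
left_max, right_min = 1.0, 100.0

raw_threshold = midpoint_split(left_max, right_min)              # raw scale
log_threshold = midpoint_split(left_max, right_min, math.log10)  # log10 scale

# A new observation that was not in the training set.
x_new = 20.0
side_raw = "left" if x_new < raw_threshold else "right"
side_log = "left" if math.log10(x_new) < log_threshold else "right"

print("raw scale:", raw_threshold, "->", side_raw)
print("log scale:", log_threshold, "->", side_log)
```

Every training point is still classified identically under both scales; only values falling in the gap between 1 and 100 can change sides, which is why the effect matters most when a split rests on few observations.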

artificial intelligence, estimator, machine learning, (14 more...)

1611.04561

Country:

Asia (0.28)
North America > United States (0.28)
Europe > Austria (0.28)

Genre: Research Report > Experimental Study (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)