AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
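To make the "Boolean function" reading concrete, here is a minimal sketch of a tree evaluated on one example. The tuple encoding, the `decide` helper, and the attributes `hungry` and `raining` are illustrative assumptions, not taken from the book.

```python
# Hypothetical encoding (not from the book): an internal node is a tuple
# (attribute, subtree_if_true, subtree_if_false); a leaf is a plain bool.

def decide(tree, example):
    """Walk from the root to a leaf and return the yes/no decision."""
    while not isinstance(tree, bool):
        attribute, if_true, if_false = tree
        tree = if_true if example[attribute] else if_false
    return tree

# Illustrative tree with made-up attributes: "yes" only if hungry and not raining.
GO_OUT = ("hungry", ("raining", False, True), False)

print(decide(GO_OUT, {"hungry": True, "raining": False}))
```

Under these assumptions the tree computes the Boolean function `hungry and not raining`; each root-to-leaf path corresponds to one conjunction of attribute tests.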

Assessing the Local Interpretability of Machine Learning Models

Friedler, Sorelle A., Roy, Chitradeep Dutta, Scheidegger, Carlos, Slack, Dylan

arXiv.org Machine Learning, Feb-9-2019

The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly indicate the outcome to a model under local changes to the input). Through a user study with 1000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. We find evidence consistent with the common intuition that decision trees and logistic regression models are interpretable and are more interpretable than neural networks. We propose a metric - the runtime operation count on the simulatability task - to indicate the relative interpretability of models and show that as the number of operations increases the users' accuracy on the local interpretability tasks decreases.
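One way to picture a runtime operation count is to tally the elementary steps a person would perform when simulating a model by hand. The sketch below is an assumption-laden illustration: the abstract does not give the authors' counting rules, so the choice to count one comparison per tree node and one multiply plus one add per logistic regression feature is hypothetical, as are both helper names.

```python
def tree_ops(tree, x):
    """Return (prediction, comparisons made) for a threshold tree.

    Hypothetical encoding: an internal node is
    (feature_index, threshold, left_subtree, right_subtree); anything else is a leaf.
    """
    ops = 0
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        ops += 1  # one comparison per visited node
        tree = left if x[feature] <= threshold else right
    return tree, ops


def logistic_ops(weights, bias, x):
    """Return (prediction, operations) for a linear score thresholded at zero."""
    score, ops = bias, 0
    for w, xi in zip(weights, x):
        score += w * xi
        ops += 2  # assumed cost: one multiply and one add per feature
    ops += 1      # final comparison against the decision threshold
    return score > 0, ops


tree = (0, 2.5, "no", (1, 1.0, "no", "yes"))
print(tree_ops(tree, [3.0, 2.0]))
print(logistic_ops([1.0, -1.0], 0.0, [3.0, 2.0]))
```

Under this counting scheme a shallow tree needs fewer operations than a linear model over many features, which is the kind of comparison such a metric is meant to support.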

decision tree, interpretability, neural network, (11 more...)

arXiv:1902.03501

Country:

Europe (0.28)
North America > United States > Arizona > Pima County > Tucson (0.14)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)

A Comprehensive Guide to Decision Tree Learning

#artificialintelligence, Feb-8-2019, 02:15:04 GMT

Decision tree learning is one of the most widely used supervised machine learning algorithms (that is, algorithms trained on a labeled dataset) for inductive inference. It is a method for approximating discrete-valued target functions in which the function learned during training is represented by a decision tree. The learned tree can also be represented as nested if-else rules to improve human readability. Decision tree learning is used for both classification and regression; the resulting trees are called classification trees and regression trees, respectively. The term Classification And Regression Tree (CART) analysis refers to both tasks.
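The nested if-else view can be sketched in a few lines. The tuple encoding, the `to_rules` helper, and the feature names and thresholds below are made up for illustration; they are not taken from the guide.

```python
def to_rules(tree, indent=0):
    """Render a threshold tree as nested if-else lines.

    Hypothetical encoding: an internal node is
    (feature_name, threshold, left_subtree, right_subtree); anything else is a leaf label.
    """
    pad = "    " * indent
    if not isinstance(tree, tuple):
        return [f"{pad}return {tree!r}"]
    feature, threshold, left, right = tree
    lines = [f"{pad}if {feature} <= {threshold}:"]
    lines += to_rules(left, indent + 1)
    lines.append(f"{pad}else:")
    lines += to_rules(right, indent + 1)
    return lines


# Illustrative tree with invented thresholds.
tree = ("petal_length", 2.5, "setosa",
        ("petal_width", 1.7, "versicolor", "virginica"))
print("\n".join(to_rules(tree)))
```

Each leaf becomes a `return` statement and each internal node becomes one `if`/`else` pair, so the depth of the tree equals the deepest level of nesting in the printed rules.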

dataset, decision tree, subset, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Machine learning and chord based feature engineering for genre prediction in popular Brazilian music

Wundervald, Bruna D., Zeviani, Walmes M.

arXiv.org Machine Learning, Feb-8-2019

Music genre can be hard to describe: many factors are involved, such as style, music technique, and historical context. Some genres even have overlapping characteristics. Looking for a better understanding of how music genres are related to musical harmonic structures, we gathered data about the music chords for thousands of popular Brazilian songs. Here, 'popular' does not only refer to the genre named MPB (Brazilian Popular Music) but to nine different genres that were considered particular to the Brazilian case. The main goals of the present work are to extract and engineer harmonically related features from chords data and to use it to classify popular Brazilian music genres towards establishing a connection between harmonic relationships and Brazilian genres. We also emphasize the generalisation of the method for obtaining the data, allowing for the replication and direct extension of this work. Our final model is a combination of multiple classification trees, also known as the random forest model. We found that features extracted from harmonic elements can satisfactorily predict music genre for the Brazilian case, as well as features obtained from the Spotify API. The variables considered in this work also give an intuition about how they relate to the genres.
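The "combination of multiple classification trees" can be illustrated by its aggregation step alone: each tree votes and the most common label wins. This sketch does not train anything and is not the authors' model; the three stump classifiers, the feature names, and the thresholds are invented for the example.

```python
from collections import Counter


def majority_vote(trees, x):
    """Return the label predicted by the largest number of trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical one-split trees over made-up chord features.
trees = [
    lambda x: "samba" if x["seventh_chord_share"] > 0.3 else "sertanejo",
    lambda x: "samba" if x["minor_chord_share"] > 0.4 else "sertanejo",
    lambda x: "samba" if x["distinct_chords"] > 8 else "sertanejo",
]

song = {"seventh_chord_share": 0.5, "minor_chord_share": 0.2, "distinct_chords": 12}
print(majority_vote(trees, song))
```

In a full random forest each tree would additionally be fit on a bootstrap sample with a random subset of features considered at each split; only the voting step is shown here.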

chord, information, music genre, (16 more...)

arXiv:1902.03283

Country:

South America > Brazil (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)

Modeling Heterogeneity in Mode-Switching Behavior Under a Mobility-on-Demand Transit System: An Interpretable Machine Learning Approach

Zhao, Xilei, Yan, Xiang, Van Hentenryck, Pascal

arXiv.org Machine Learning, Feb-7-2019

Recent years have witnessed an increased focus on interpretability and the use of machine learning to inform policy analysis and decision making. This paper applies machine learning to examine travel behavior and, in particular, to model changes in travel modes when individuals are presented with a novel (on-demand) mobility option. It addresses the following question: Can machine learning be applied to model individual taste heterogeneity (preference heterogeneity for travel modes and response heterogeneity to travel attributes) in travel mode choice? This paper first develops a high-accuracy classifier to predict mode-switching behavior under a hypothetical Mobility-on-Demand Transit system (i.e., stated-preference data), which represents the case study underlying this research. We show that this classifier naturally captures individual heterogeneity available in the data. Moreover, the paper derives insights on heterogeneous switching behaviors through the generation of marginal effects and elasticities by current travel mode, partial dependence plots, and individual conditional expectation plots. The paper also proposes two new model-agnostic interpretation tools for machine learning, i.e., conditional partial dependence plots and conditional individual partial dependence plots, specifically designed to examine response heterogeneity. The results on the case study show that the machine-learning classifier, together with model-agnostic interpretation tools, provides valuable insights on travel mode switching behavior for different individuals and population segments. For example, the existing drivers are more sensitive to additional pickups than people using other travel modes, and current transit users are generally willing to share rides but reluctant to take any additional transfers.
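Partial dependence and individual conditional expectation (ICE) curves can be computed for any model that exposes a prediction function. The sketch below shows the standard definitions on a toy model; `switch_probability`, its coefficients, and the two rows are invented for illustration and are not the paper's classifier or data.

```python
def ice_curves(predict, rows, feature, grid):
    """One curve per row: the prediction as `feature` is swept over `grid`."""
    return [[predict({**row, feature: value}) for value in grid] for row in rows]


def partial_dependence(predict, rows, feature, grid):
    """Average of the ICE curves at each grid value."""
    curves = ice_curves(predict, rows, feature, grid)
    return [sum(curve[i] for curve in curves) / len(curves) for i in range(len(grid))]


def switch_probability(row):
    # Toy stand-in model with made-up coefficients: drivers react more
    # strongly to each extra pickup than other travelers do.
    slope = 0.25 if row["drives"] else 0.125
    return 0.8 - slope * row["extra_pickups"]


rows = [{"drives": True, "extra_pickups": 0}, {"drives": False, "extra_pickups": 0}]
grid = [0, 1, 2]
print(ice_curves(switch_probability, rows, "extra_pickups", grid))
print(partial_dependence(switch_probability, rows, "extra_pickups", grid))
```

Passing only the rows of one segment (for example, current drivers) to `partial_dependence` yields a per-segment curve; whether that matches the paper's conditional plots exactly is an assumption.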

heterogeneity, mod transit, travel mode, (13 more...)

arXiv:1902.02904

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Africa > South Africa > Western Cape > Cape Town (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Infrastructure & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

Data Science in 90 Seconds: kNN - DATAVERSITY

#artificialintelligence, Feb-2-2019, 20:46:52 GMT

This is Lesson 11 in the Data Science in 90 Seconds video blog series from host Laura Kahn. The series covers some of the most prominent topics in Data Science, such as Supervised and Unsupervised Learning, K-Means Clustering, Naive Bayes, Decision Trees and Random Forests, Ridge Regression, kNN and more.

data science, decision tree learning, machine learning, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.82)

Machine Learning: Choosing a Machine Learning Model

#artificialintelligence, Jan-31-2019, 13:54:58 GMT

As more companies look to leverage their data using the predictive capabilities of machine learning, they find that there is no one-size-fits-all approach to this exciting technology. The machine learning algorithm you choose depends on the size, quality, and type of data as well as the project timeline and your overall goals. Choosing the proper machine learning algorithm lends context to the insights gained from the resulting predictions. Accuracy: Is the goal of your project to determine the most accurate result or will an approximation satisfy your project needs? Approximating outputs can reduce processing time and keep performance high for large datasets.

artificial intelligence, machine learning, prediction, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

Learning Triggers for Heterogeneous Treatment Effects

Tran, Christopher, Zheleva, Elena

arXiv.org Machine Learning, Jan-31-2019

The causal effect of a treatment can vary from person to person based on their individual characteristics and predispositions. Mining for patterns of individual-level effect differences, a problem known as heterogeneous treatment effect estimation, has many important applications, from precision medicine to recommender systems. In this paper we define and study a variant of this problem in which an individual-level threshold in treatment needs to be reached, in order to trigger an effect. One of the main contributions of our work is that we do not only estimate heterogeneous treatment effects with fixed treatments but can also prescribe individualized treatments. We propose a tree-based learning method to find the heterogeneity in the treatment effects. Our experimental results on multiple datasets show that our approach can learn the triggers better than existing approaches.
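The idea of a threshold that "triggers" an effect can be illustrated with a naive search: try each observed treatment amount as a cut point and keep the one with the largest difference in mean outcome above versus below it. This is only a sketch of the concept, not the authors' tree-based method, and it ignores confounding; `best_trigger` and the sample data are hypothetical.

```python
def best_trigger(samples):
    """samples: list of (treatment_amount, outcome) pairs.

    Returns (threshold, effect) maximizing
    mean(outcome | amount >= threshold) - mean(outcome | amount < threshold),
    or None if fewer than two distinct amounts are observed.
    """
    amounts = sorted({amount for amount, _ in samples})
    best = None
    for threshold in amounts[1:]:  # skipping the minimum keeps both sides non-empty
        above = [y for amount, y in samples if amount >= threshold]
        below = [y for amount, y in samples if amount < threshold]
        effect = sum(above) / len(above) - sum(below) / len(below)
        if best is None or effect > best[1]:
            best = (threshold, effect)
    return best


# Made-up data: the outcome appears only once the amount reaches 3.
samples = [(1, 0.0), (2, 0.0), (3, 1.0), (4, 1.0)]
print(best_trigger(samples))
```

A heterogeneous version would run such a search separately within subgroups of individuals, which is roughly where a tree over individual characteristics would come in.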

dataset, estimation, treatment effect, (15 more...)

arXiv:1902.00087

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Education > Curriculum > Subject-Specific Education (0.41)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

An Evaluation of the Human-Interpretability of Explanation

Lage, Isaac, Chen, Emily, He, Jeffrey, Narayanan, Menaka, Kim, Been, Gershman, Sam, Doshi-Velez, Finale

arXiv.org Machine Learning, Jan-30-2019

Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains poorly understood. This work advances our understanding of what makes explanations interpretable under three specific tasks that users may perform with machine learning systems: simulation of the response, verification of a suggested response, and determining whether the correctness of a suggested response changes under a change to the inputs. Through carefully controlled human-subject experiments, we identify regularizers that can be used to optimize for the interpretability of machine learning systems. Our results show that the type of complexity matters: cognitive chunks (newly defined concepts) affect performance more than variable repetitions, and these trends are consistent across tasks and domains. This suggests that there may exist some common design principles for explanation systems.

experiment, explanation, response time, (15 more...)

arXiv:1902.00006

Country:

South America (0.14)
North America > Canada (0.04)
Oceania > Australia (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
(2 more...)

Classifier Suites for Insider Threat Detection

Noever, David

arXiv.org Machine Learning, Jan-30-2019

Better methods to detect insider threats need new anticipatory analytics to capture risky behavior prior to losing data. In search of the best overall classifier, this work empirically scores 88 machine learning algorithms in 16 major families. We extract risk features from the large CERT dataset, which blends real network behavior with individual threat narratives. We discover the predictive importance of measuring employee sentiment. Among major classifier families tested on CERT, the random forest algorithms offer the best choice, with different implementations scoring over 98% accurate. In contrast to more obscure or black-box alternatives, random forests are ensembles of many decision trees and thus offer a deep but human-readable set of detection rules (>2000 rules). We address performance rankings by penalizing long execution times against higher median accuracies using cross-fold validation. We address the relative rarity of threats as a case of low signal-to-noise (< 0.02% malicious to benign activities), and then train on both under-sampled and over-sampled data which is statistically balanced to identify nefarious actors.
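Balancing a rare class by under-sampling or over-sampling can be sketched with the standard library alone. The abstract does not say which resampling scheme was used, so the random under-sampling and random over-sampling with replacement below are assumptions, as are the helper names and the toy records.

```python
import random


def undersample(majority, minority, rng):
    """Keep all minority rows and an equally sized random subset of majority rows."""
    return rng.sample(majority, len(minority)) + list(minority)


def oversample(majority, minority, rng):
    """Keep all rows and duplicate random minority rows until the classes match.

    Assumes the majority class is at least as large as the minority class.
    """
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return list(majority) + list(minority) + extra


rng = random.Random(0)  # fixed seed so the sketch is repeatable
benign = [("benign", i) for i in range(50)]
malicious = [("malicious", i) for i in range(5)]

print(len(undersample(benign, malicious, rng)))
print(len(oversample(benign, malicious, rng)))
```

Under-sampling discards most benign records, while over-sampling repeats the few malicious ones; either way the classifier then sees the two classes in equal proportion.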

algorithm, dataset, detection, (12 more...)

arXiv:1901.10948

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Fairwashing: the risk of rationalization

Aïvodji, Ulrich, Arai, Hiromi, Fortineau, Olivier, Gambs, Sébastien, Hara, Satoshi, Tapp, Alain

arXiv.org Machine Learning, Jan-28-2019

Black-box explanation is the problem of explaining how a machine learning model -- whose internal logic is hidden to the auditor and generally complex -- produces its outcomes. Current approaches for solving this problem include model explanation, outcome explanation as well as model inspection. While these techniques can be beneficial by providing interpretability, they can be used in a negative manner to perform fairwashing, which we define as promoting the perception that a machine learning model respects some ethical values while it might not be the case. In particular, we demonstrate that it is possible to systematically rationalize decisions taken by an unfair black-box model using the model explanation as well as the outcome explanation approaches with a given fairness metric. Our solution, LaundryML, is based on a regularized rule list enumeration algorithm whose objective is to search for fair rule lists approximating an unfair black-box model. We empirically evaluate our rationalization technique on black-box models trained on real-world datasets and show that one can obtain rule lists with high fidelity to the black-box model while being considerably less unfair at the same time.
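Fidelity, as used here, can be read as the fraction of inputs on which the surrogate rule list agrees with the black box. The sketch below is not LaundryML; the black-box rule, the one-rule list, and the four applicants are made up to show how a surrogate that never mentions a sensitive attribute can still agree with an unfair model most of the time.

```python
def apply_rule_list(rules, default, x):
    """Return the outcome of the first rule whose condition holds, else the default."""
    for condition, outcome in rules:
        if condition(x):
            return outcome
    return default


def fidelity(black_box, surrogate, inputs):
    """Fraction of inputs on which the surrogate matches the black box."""
    agree = sum(black_box(x) == surrogate(x) for x in inputs)
    return agree / len(inputs)


# Hypothetical unfair black box: approves only high earners in group "a".
def black_box(x):
    return x["income"] > 50 and x["group"] == "a"


# Hypothetical surrogate that ignores the group attribute entirely.
rules = [(lambda x: x["income"] > 50, True)]


def surrogate(x):
    return apply_rule_list(rules, False, x)


applicants = [
    {"income": 60, "group": "a"},
    {"income": 60, "group": "b"},
    {"income": 40, "group": "a"},
    {"income": 40, "group": "b"},
]
print(fidelity(black_box, surrogate, applicants))
```

The surrogate looks group-blind yet agrees with the black box on three of the four applicants, which is the kind of gap between apparent fairness and fidelity that the abstract warns about.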

explanation, rationalization, unfairness, (15 more...)

arXiv:1901.09749

Country:

Europe (0.28)
North America > Canada > Quebec > Montreal (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
(4 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)
