AITopics | Decision Tree Learning


Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.


XAI : The 3rd wave of AI- The Machine reveals WHY?

#artificialintelligence, Mar-16-2020, 01:45:16 GMT

KernelExplainer: Kernel SHAP is a method that uses a special weighted linear regression to compute the importance of each feature.
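
To make the "special weighted linear regression" a little more concrete, here is a minimal sketch of the Shapley kernel weight that Kernel SHAP assigns to each feature coalition, following the formula in the SHAP paper by Lundberg and Lee. The function name `shapley_kernel_weight` is a hypothetical helper for illustration, not part of the `shap` library, and the sketch covers only the weighting, not the regression itself.

```python
from math import comb


def shapley_kernel_weight(num_features: int, coalition_size: int) -> float:
    """Shapley kernel weight for a coalition with `coalition_size`
    of `num_features` features switched on (hypothetical helper)."""
    m, s = num_features, coalition_size
    if s == 0 or s == m:
        # The empty and full coalitions get infinite weight in theory;
        # in practice they are usually handled as hard constraints.
        return float("inf")
    return (m - 1) / (comb(m, s) * s * (m - s))


if __name__ == "__main__":
    # Very small and very large coalitions receive the largest weights.
    for size in range(5):
        print(size, shapley_kernel_weight(4, size))
```

The weights are symmetric in the coalition size, so a coalition with one feature present counts as much as a coalition with one feature missing.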

algorithm, lime, prediction, (16 more...)


Country:

North America > United States > California (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)


A Numerical Transform of Random Forest Regressors corrects Systematically-Biased Predictions

Malhotra, Shipra, Karanicolas, John

arXiv.org Machine Learning, Mar-16-2020

Over the past decade, random forest models have become widely used as a robust method for high-dimensional data regression tasks. In part, the popularity of these models arises from the fact that they require little hyperparameter tuning and are not very susceptible to overfitting. Random forest regression models are comprised of an ensemble of decision trees that independently predict the value of a (continuous) dependent variable; predictions from each of the trees are ultimately averaged to yield an overall predicted value from the forest. Using a suite of representative real-world datasets, we find a systematic bias in predictions from random forest models. We find that this bias is recapitulated in simple synthetic datasets, regardless of whether or not they include irreducible error (noise) in the data, but that models employing boosting do not exhibit this bias. Here we demonstrate the basis for this problem, and we use the training data to define a numerical transformation that fully corrects it. Application of this transformation yields improved predictions in every one of the real-world and synthetic datasets evaluated in our study.
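
The averaging step described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' method: depth-1 regression trees ("stumps") stand in for full trees, each is fit on a bootstrap sample, and all function names (`fit_stump`, `fit_forest`, `predict_forest`) are hypothetical. It does not reproduce the paper's analysis or its corrective transformation.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: choose the threshold with the
    lowest squared error and store the mean target on each side."""
    best = None
    for t in sorted(set(xs))[1:]:  # thresholds that leave both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the overall mean
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Train each stump independently on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """The forest prediction is the plain average over all trees."""
    return mean([predict_stump(s, x) for s in forest])


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [2.0 * x for x in xs]  # noiseless synthetic target
    forest = fit_forest(xs, ys)
    print(predict_forest(forest, 0.0), predict_forest(forest, 19.0))
```

Because every leaf stores a mean of training targets, the averaged prediction can never fall outside the range of targets seen in training, even for inputs far from the training data.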

dataset, prediction, random forest model, (13 more...)


2003.07445

Country:

North America > United States > Iowa > Story County > Ames (0.14)
North America > United States > Kansas > Douglas County > Lawrence (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Banking & Finance (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Government > Regional Government > North America Government > United States Government (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)


A Beginner's Guide to Random Forest Hyperparameter Tuning

#artificialintelligence, Mar-15-2020, 18:11:37 GMT


beginner, random forest hyperparameter tuning


Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.82)


Crime Prediction Using Spatio-Temporal Data

Hossain, Sohrab, Abtahee, Ahmed, Kashem, Imran, Hoque, Mohammed Moshiul, Sarker, Iqbal H.

arXiv.org Machine Learning, Mar-11-2020

A crime is a punishable offence that is harmful to an individual and to society. Understanding the patterns of criminal activity is essential to preventing it. Research can help society to prevent and solve criminal activities. Studies show that only 10 percent of offenders commit 50 percent of the total offences. The enforcement team can respond faster if they have early information and prior knowledge about criminal activity at different points in a city. In this paper, a supervised learning technique is used to predict crimes with better accuracy. The proposed system predicts crimes by analyzing a data set that contains records of previously committed crimes and their patterns. The system stands on two main algorithms: i) decision tree, and ii) k-nearest neighbor. The Random Forest algorithm and AdaBoost are used to increase the accuracy of the prediction. Finally, oversampling is used for better accuracy. The proposed system is fed with a criminal-activity data set covering twelve years in the city of San Francisco.

crime, criminal activity, dataset, (16 more...)


2003.09322

Country:

North America > United States > California > San Francisco County > San Francisco (0.26)
Asia > Bangladesh (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.56)


Towards Interpretable Deep Neural Networks: An Exact Transformation to Multi-Class Multivariate Decision Trees

Nguyen, Tung D., Kasmarik, Kathryn E., Abbass, Hussein A.

arXiv.org Machine Learning, Mar-11-2020

Deep neural networks (DNNs) are commonly labelled as black boxes lacking interpretability, which hinders human understanding of DNNs' behaviors. A need exists to generate a meaningful sequential logic for the production of a specific output. Decision trees exhibit better interpretability and expressive power due to their representation language and the existence of efficient algorithms to generate rules. Growing a decision tree based on the available data could produce larger than necessary trees or trees that do not generalise well. In this paper, we introduce two novel multivariate decision tree (MDT) algorithms for rule extraction from a DNN: an Exact-Convertible Decision Tree (EC-DT) and a Deep C-Net algorithm to transform a neural network with Rectified Linear Unit activation functions into a representative tree which can be used to extract multivariate rules for reasoning. While the EC-DT translates the DNN in a layer-wise manner to represent exactly the decision boundaries implicitly learned by the hidden layers of the network, the Deep C-Net inherits the decompositional approach from EC-DT and combines it with a C5 tree learning algorithm to construct the decision rules. The results suggest that while EC-DT is superior in preserving the structure and the accuracy of the DNN, C-Net generates the most compact and highly effective trees from the DNN. Both proposed MDT algorithms generate rules including combinations of multiple attributes for precise interpretation of decision-making processes.

algorithm, constraint, neural network, (15 more...)


2003.04675

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Oceania > Australia > Queensland (0.04)
Oceania > Australia > New South Wales (0.04)
(5 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data

Ramon, Yanou, Martens, David, Evgeniou, Theodoros, Praet, Stiene

arXiv.org Artificial Intelligence, Mar-10-2020

Machine learning using behavioral and text data can result in highly accurate prediction models, but these are often very difficult to interpret. Linear models require investigating thousands of coefficients, while the opaqueness of nonlinear models makes things even worse. Rule-extraction techniques have been proposed to combine the desired predictive behaviour of complex "black-box" models with explainability. However, rule-extraction in the context of ultra-high-dimensional and sparse data can be challenging, and has thus far received scant attention. Because of the sparsity and massive dimensionality, rule-extraction might fail in its primary explainability goal as the black-box model may need to be replaced by many rules, leaving the user again with an incomprehensible model. To address this problem, we develop and test a rule-extraction methodology based on higher-level, less-sparse "metafeatures". We empirically validate the quality of the rules in terms of fidelity, explanation stability and accuracy over a collection of data sets, and benchmark their performance against rules extracted using the original features. Our analysis points to key trade-offs between explainability, fidelity, accuracy, and stability that Machine Learning researchers and practitioners need to consider. Results indicate that the proposed metafeatures approach leads to better trade-offs between these, and is better able to mimic the black-box model. There is an average decrease of the loss in fidelity, accuracy, and stability from using metafeatures instead of the original fine-grained features of 18.08%, 20.15% and 17.73%, respectively, all statistically significant at a 5% significance level. Metafeatures thus improve a key "cost of explainability", which we define as the loss in fidelity when replacing a black-box with an explainable model.

fidelity, metafeature, stability, (17 more...)


2003.04792

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Orange County > Irvine (0.14)
Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Law (1.00)
Information Technology > Services (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


ENTMOOT: A Framework for Optimization over Ensemble Tree Models

Thebelt, Alexander, Kronqvist, Jan, Mistry, Miten, Lee, Robert M., Sudermann-Merx, Nathan, Misener, Ruth

arXiv.org Artificial Intelligence, Mar-10-2020

Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to-optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty-aware tree models, and (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves to be a strong competitor to commonly used frameworks.

entmoot, optimization, uncertainty measure, (15 more...)


2003.04774

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Materials > Chemicals (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


Decision Trees Explained

#artificialintelligence, Mar-9-2020, 06:24:47 GMT

In this post, I will explain Decision Trees in simple terms. It could be considered a "Decision Trees for dummies" post; however, I've never really liked that expression. In the Machine Learning world, Decision Trees are a kind of non-parametric model that can be used for both classification and regression. This means that Decision Trees are flexible models that don't increase their number of parameters as we add more features (if we build them correctly), and they can output either a categorical prediction (like whether a plant is of a certain kind or not) or a numerical prediction (like the price of a house). They are constructed using two kinds of elements: nodes and branches.
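
As a minimal sketch of the "nodes and branches" idea, the snippet below represents a classification tree as nested objects and walks it to reach a prediction. Everything here is an illustrative assumption: the class names `Node` and `Leaf`, the feature names, the thresholds, and the plant labels are made up and do not come from the post.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    prediction: str  # the answer returned when a sample ends up here


@dataclass
class Node:
    feature: str      # property tested at this node
    threshold: float  # samples with value <= threshold take the left branch
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def predict(tree: Tree, sample: Dict[str, float]) -> str:
    """Follow one branch per node until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.left if sample[tree.feature] <= tree.threshold else tree.right
    return tree.prediction


# Hypothetical hand-built tree: two tests, three possible answers.
EXAMPLE_TREE = Node(
    "height_cm", 30.0,
    Leaf("kind A"),
    Node("leaf_width_cm", 2.0, Leaf("kind B"), Leaf("kind C")),
)

if __name__ == "__main__":
    print(predict(EXAMPLE_TREE, {"height_cm": 45.0, "leaf_width_cm": 1.5}))
```

A regression tree would look the same except that each leaf would hold a number, such as a house price, instead of a label.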

decision tree, leave node, node, (13 more...)


Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)


A Human-Centered Review of the Algorithms used within the U.S. Child Welfare System

Saxena, Devansh, Badillo-Urquiola, Karla, Wisniewski, Pamela J., Guha, Shion

arXiv.org Artificial Intelligence, Mar-7-2020

The U.S. Child Welfare System (CWS) is charged with improving outcomes for foster youth, yet it is overburdened and underfunded. To overcome this limitation, several states have turned towards algorithmic decision-making systems to reduce costs and determine better processes for improving CWS outcomes. Using a human-centered algorithmic design approach, we synthesize 50 peer-reviewed publications on computational systems used in CWS to assess how they were being developed, common characteristics of predictors used, as well as the target outcomes. We found that most of the literature has focused on risk assessment models but does not consider theoretical approaches (e.g., child-foster parent matching) nor the perspectives of caseworkers (e.g., case notes). Therefore, future algorithms should strive to be context-aware and theoretically robust by incorporating salient factors identified by past research. We provide the HCI community with research avenues for developing human-centered algorithms that redirect attention towards more equitable outcomes for CWS.

data mining, machine learning, natural language, (21 more...)


2003.03541

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Illinois (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(3 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)


Prediction with Spatio-temporal Point Processes with Self Organizing Decision Trees

Karaahmetoglu, Oguzhan, Kozat, Suleyman Serdar

arXiv.org Machine Learning, Mar-7-2020

We study the spatio-temporal prediction problem, which has attracted the attention of many researchers due to its critical real-life applications. In particular, we introduce a novel approach to this problem. Our approach is based on the Hawkes process, which is a non-stationary and self-exciting point process. We extend the formulations of a standard point process model that can represent time-series data to represent spatio-temporal data. We model the data as nonstationary in time and space. Furthermore, we partition the spatial region we are working on into subregions via an adaptive decision tree and model the source statistics in each subregion with individual but mutually interacting point processes. We also provide a gradient-based joint optimization algorithm for the point process and decision tree parameters. Thus, we introduce a model that can jointly infer the source statistics and an adaptive partitioning of the spatial region. Finally, we provide experimental results on real-life data, which show significant improvement due to space adaptation and joint optimization compared to standard well-known methods in the literature.

intensity, model parameter, point process, (15 more...)


2003.03657

Country:

Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.84)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.82)
