AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

📱Adversarial Attacks on SMS Spam Detectors

#artificialintelligenceAug-19-2020, 11:02:02 GMT

Note: The methodology behind the approach discussed in this post stems from a collaborative publication between myself and Irene Anthi. Spam SMS text messages often show up unexpectedly on our phone screens. That's aggravating enough, but it gets worse. Whoever is sending you a spam text message is usually trying to defraud you. Most spam text messages don't come from another phone.

adversarial sample, machine learning, natural language, (16 more...)

#artificialintelligence

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.32)

Add feedback

The Aha! Moments In 4 Popular Machine Learning Algorithms

#artificialintelligenceAug-19-2020, 07:30:15 GMT

Each step, the Decision Tree algorithm attempts to find a method to build the tree such that the entropy is minimized. Think of entropy more formally as the amount of'disorder' or'confusion' a certain divider (the conditions) has, and its opposite as'information gain' -- how much a divider adds information and insight to the model. Feature splits that have the highest information gain (as well as a lowest entropy) are placed at the top. Note that condition 1 has clean separation, and therefore low entropy and high information gain. The same cannot be said for condition 3, which is why it is placed near the bottom of the Decision Tree.

artificial intelligence, machine learning, neural network, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.64)

Add feedback

Rethinking Default Values: a Low Cost and Efficient Strategy to Define Hyperparameters

Mantovani, Rafael Gomes, Rossi, André Luis Debiaso, Alcobaça, Edesio, Gertrudes, Jadson Castro, Junior, Sylvio Barbon, de Carvalho, André Carlos Ponce de Leon Ferreira

arXiv.org Machine LearningAug-19-2020

Machine Learning (ML) algorithms have been successfully employed by a vast range of practitioners with different backgrounds. One of the reasons for ML popularity is the capability to consistently delivers accurate results, which can be further boosted by adjusting hyperparameters (HP). However, part of practitioners has limited knowledge about the algorithms and does not take advantage of suitable HP settings. In general, HP values are defined by trial and error, tuning, or by using default values. Trial and error is very subjective, time costly and dependent on the user experience. Tuning techniques search for HP values able to maximize the predictive performance of induced models for a given dataset, but with the drawback of a high computational cost and target specificity. To avoid tuning costs, practitioners use default values suggested by the algorithm developer or by tools implementing the algorithm. Although default values usually result in models with acceptable predictive performance, different implementations of the same algorithm can suggest distinct default values. To maintain a balance between tuning and using default values, we propose a strategy to generate new optimized default values. Our approach is grounded on a small set of optimized values able to obtain predictive performance values better than default settings provided by popular tools. The HP candidates are estimated through a pool of promising values tuned from a small and informative set of datasets. After performing a large experiment and a careful analysis of the results, we concluded that our approach delivers better default values. Besides, it leads to competitive solutions when compared with the use of tuned values, being easier to use and having a lower cost.Based on our results, we also extracted simple rules to guide practitioners in deciding whether using our new methodology or a tuning approach.

data mining, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Machine Learning

2008.00025

Country:

South America > Brazil > São Paulo (0.04)
North America > United States > New York (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)
Information Technology > Data Science > Data Mining (0.93)
(3 more...)

Add feedback

Score-Based Explanations in Data Management and Machine Learning

Bertossi, Leopoldo

arXiv.org Artificial IntelligenceAug-18-2020

We describe some approaches to explanations for observed outcomes in data management and machine learning. They are based on the assignment of numerical scores to predefined and potentially relevant inputs. More specifically, we consider explanations for query answers in databases, and for results from classification models. The described approaches are mostly of a causal and counterfactual nature. We argue for the need to bring domain and semantic knowledge into score computations; and suggest some ways to do this.

explanation, machine learning, question answering, (19 more...)

arXiv.org Artificial Intelligence

2007.12799

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.36)

Add feedback

Feature Selection Methods for Cost-Constrained Classification in Random Forests

Jagdhuber, Rudolf, Lang, Michel, Rahnenführer, Jörg

arXiv.org Machine LearningAug-17-2020

Cost-sensitive feature selection describes a feature selection problem, where features raise individual costs for inclusion in a model. These costs allow to incorporate disfavored aspects of features, e.g. failure rates of as measuring device, or patient harm, in the model selection process. Random Forests define a particularly challenging problem for feature selection, as features are generally entangled in an ensemble of multiple trees, which makes a post hoc removal of features infeasible. Feature selection methods therefore often either focus on simple pre-filtering methods, or require many Random Forest evaluations along their optimization path, which drastically increases the computational complexity. To solve both issues, we propose Shallow Tree Selection, a novel fast and multivariate feature selection method that selects features from small tree structures. Additionally, we also adapt three standard feature selection algorithms for cost-sensitive learning by introducing a hyperparameter-controlled benefit-cost ratio criterion (BCR) for each method. In an extensive simulation study, we assess this criterion, and compare the proposed methods to multiple performance-based baseline alternatives on four artificial data settings and seven real-world data settings. We show that all methods using a hyperparameterized BCR criterion outperform the baseline alternatives. In a direct comparison between the proposed methods, each method indicates strengths in certain settings, but no one-fits-all solution exists. On a global average, we could identify preferable choices among our BCR based methods. Nevertheless, we conclude that a practical analysis should never rely on a single method only, but always compare different approaches to obtain the best results.

artificial intelligence, machine learning, selection, (15 more...)

arXiv.org Machine Learning

2008.06298

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > United States > Wisconsin (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.84)

Add feedback

Towards Faithful and Meaningful Interpretable Representations

Sokol, Kacper, Flach, Peter

arXiv.org Artificial IntelligenceAug-16-2020

Interpretable representations are the backbone of many black-box explainers. They translate the low-level data representation necessary for good predictive performance into high-level human-intelligible concepts used to convey the explanation. Notably, the explanation type and its cognitive complexity are directly controlled by the interpretable representation, allowing to target a particular audience and use case. However, many explainers that rely on interpretable representations overlook their merit and fall back on default solutions, which may introduce implicit assumptions, thereby degrading the explanatory power of such techniques. To address this problem, we study properties of interpretable representations that encode presence and absence of human-comprehensible concepts. We show how they are operationalised for tabular, image and text data, discussing their strengths and weaknesses. Finally, we analyse their explanatory properties in the context of tabular data, where a linear model is used to quantify the importance of interpretable concepts.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2008.07007

Country:

North America > United States > Wisconsin (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.46)

Add feedback

How little data do we need for patient-level prediction?

John, Luis H., Kors, Jan A., Reps, Jenna M., Ryan, Patrick B., Rijnbeek, Peter R.

arXiv.org Machine LearningAug-14-2020

Objective: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model complexity as well as computational requirements. Materials and Methods: We empirically assess the effect of sample size on prediction performance and model complexity by generating learning curves for 81 prediction problems in three large observational health databases, requiring training of 17,248 prediction models. The adequate sample size was defined as the sample size for which the performance of a model equalled the maximum model performance minus a small threshold value. Results: The adequate sample size achieves a median reduction of the number of observations between 9.5% and 78.5% for threshold values between 0.001 and 0.02. The median reduction of the number of predictors in the models at the adequate sample size varied between 8.6% and 68.3%, respectively. Discussion: Based on our results a conservative, yet significant, reduction in sample size and model complexity can be estimated for future prediction work. Though, if a researcher is willing to generate a learning curve a much larger reduction of the model complexity may be possible as suggested by a large outcome-dependent variability. Conclusion: Our results suggest that in most cases only a fraction of the available data was sufficient to produce a model close to the performance of one developed on the full data set, but with a substantially reduced model complexity. 1 Background and significance Physicians infer diagnoses, prognoses, and treatment pathways based on the available medical history of their patients and the current clinical guidelines. Clinical prediction models can support this process by providing risk information on disease presence and progression.[1,2]

artificial intelligence, machine learning, modeling & simulation, (18 more...)

arXiv.org Machine Learning

2008.07361

Country:

Europe > Netherlands > South Holland > Rotterdam (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(5 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Add feedback

Earth Engine Tutorial #32: Machine Learning with Earth Engine - Supervised Classification

#artificialintelligenceAug-11-2020, 12:46:08 GMT

This tutorial shows you how to perform supervised classification (e.g., Classification and Regression Trees [CART]) in Earth Engine. The Classifier package handles supervised classification by traditional ML algorithms running in Earth Engine. The training data is a FeatureCollection with a property storing the class label and properties storing predictor variables. Class labels should be consecutive, integers starting from 0. If necessary, use remap() to convert class values to consecutive integers. The predictors should be numeric.

artificial intelligence, decision tree learning, machine learning, (8 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.64)

Add feedback

Generalized and Scalable Optimal Sparse Decision Trees

Lin, Jimmy, Zhong, Chudi, Hu, Diane, Rudin, Cynthia, Seltzer, Margo

arXiv.org Machine LearningAug-10-2020

Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possible to construct sparse decision trees to efficiently optimize a variety of objective functions without relying on greedy splitting and pruning heuristics that often lead to suboptimal solutions. The contribution in this work is to provide a general framework for decision tree optimization that addresses the two significant open problems in the area: treatment of imbalanced data and fully optimizing over continuous variables. We present techniques that produce optimal decision trees over a variety of objectives including F-score, AUC, and partial area under the ROC convex hull. We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables and speeds up decision tree construction by several orders of magnitude relative to the state-of-the art.

artificial intelligence, leaves, machine learning, (15 more...)

arXiv.org Machine Learning

2006.0869

Country:

North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (0.63)
Workflow (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Modeling of time series using random forests: theoretical developments

Davis, Richard A., Nielsen, Mikkel S.

arXiv.org Machine LearningAug-6-2020

Random forests, originally introduced by Breiman [8], constitute an ensemble learning algorithm for classification and regression, which produces predictions by first growing a large number of randomized decision trees [9] and, then, aggregates the results. Since its introduction, the algorithm has been applied in various fields such as object recognition [25], bioinformatics [12], ecology [10, 22] and finance [15, 18], and the evidence is strong: with very little tuning, random forests are able to deliver a flexible tool for prediction which is fully comparable with other state-of-the-art algorithms. In fact, Howard and Bowles [17] claim that random forests have been the most successful general-purpose algorithm in recent times. While many successful applications indicate the wide applicability of random forests, only little theoretical work exists to support this impression.

artificial intelligence, machine learning, random forest, (19 more...)

arXiv.org Machine Learning

2008.02479

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback