AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Crossbreeding in Random Forest

Nadi, Abolfazl, Moradi, Hadi, Taheri, Khalil

arXiv.org Artificial IntelligenceJan-21-2021

Ensemble learning methods are designed to benefit from multiple learning algorithms for better predictive performance. The tradeoff of this improved performance is slower speed and larger size of ensemble learning systems compared to single learning systems. In this paper, we present a novel approach to deal with this problem in Random Forest (RF) as one of the most powerful ensemble methods. The method is based on crossbreeding of the best tree branches to increase the performance of RF in space and speed while keeping the performance in the classification measures. The proposed approach has been tested on a group of synthetic and real datasets and compared to the standard RF approach. Several evaluations have been conducted to determine the effects of the Crossbred RF (CRF) on the accuracy and the number of trees in a forest. The results show better performance of CRF compared to RF.

application, classifier, random forest, (13 more...)

arXiv.org Artificial Intelligence

2101.08585

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Portugal > Braga > Braga (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
(12 more...)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.92)

Add feedback

Dive into Decision Trees and Forests: A Theoretical Demonstration

Zhang, Jinxiong

arXiv.org Machine LearningJan-20-2021

Based on decision trees, many fields have arguably made tremendous progress in recent years. In simple words, decision trees use the strategy of "divide-and-conquer" to divide the complex problem on the dependency between input features and labels into smaller ones. While decision trees have a long history, recent advances have greatly improved their performance in computational advertising, recommender system, information retrieval, etc. We introduce common tree-based models (e.g., Bayesian CART, Bayesian regression splines) and training techniques (e.g., mixed integer programming, alternating optimization, gradient descent). Along the way, we highlight probabilistic characteristics of tree-based models and explain their practical and theoretical benefits. Except machine learning and data mining, we try to show theoretical advances on tree-based models from other fields such as statistics and operation research. We list the reproducible resource at the end of each method.

decision tree and forest, dive, theoretical demonstration

arXiv.org Machine Learning

2101.08656

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

A Survey on the Explainability of Supervised Machine Learning

Burkart, Nadia (Fraunhofer IOSB) | Huber, Marco F. (Fraunhofer IPA, University of Stuttgart)

Journal of Artificial Intelligence ResearchJan-19-2021

Predictions obtained by, e.g., artificial neural networks have a high accuracy but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Particularly understanding the decision making in highly sensitive areas such as healthcare or finance, is of paramount importance. The decision-making behind the black boxes requires it to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions, an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.

arxiv preprint arxiv, explanation, prediction, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12228

AI Access Foundation

12228

Journal of Artificial Intelligence Research

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
(9 more...)

Add feedback

Learning Temporal Causal Sequence Relationships from Real-Time Time-Series

Bruto da Costa, Antonio Anastasio | Dasgupta, Pallab (Indian Institute of Technology Kharagpur)

Journal of Artificial Intelligence ResearchJan-18-2021

We aim to mine temporal causal sequences that explain observed events (consequents) in time-series traces. Causal explanations of key events in a time-series have applications in design debugging, anomaly detection, planning, root-cause analysis and many more. We make use of decision trees and interval arithmetic to mine sequences that explain defining events in the time-series. We propose modified decision tree construction metrics to handle the non-determinism introduced by the temporal dimension. The mined sequences are expressed in a readable temporal logic language that is easy to interpret. The application of the proposed methodology is illustrated through various examples.

constraint, predicate, sequence, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12395

AI Access Foundation

12395

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > India > West Bengal > Kharagpur (0.04)
(3 more...)

Genre: Research Report (0.45)

Industry: Transportation (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.56)

Add feedback

Interpretable Policy Specification and Synthesis through Natural Language and RL

Tambwekar, Pradyumna, Silva, Andrew, Gopalan, Nakul, Gombolay, Matthew

arXiv.org Artificial IntelligenceJan-18-2021

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

arxiv preprint arxiv, decision tree, natural language, (12 more...)

arXiv.org Artificial Intelligence

2101.0714

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

dtControl 2.0: Explainable Strategy Representation via Decision Tree Learning Steered by Experts

Ashok, Pranav, Jackermeier, Mathias, Křetínský, Jan, Weinhuber, Christoph, Weininger, Maximilian, Yadav, Mayank

arXiv.org Artificial IntelligenceJan-15-2021

Recent advances have shown how decision trees are apt data structures for concisely representing strategies (or controllers) satisfying various objectives. Moreover, they also make the strategy more explainable. The recent tool dtControl had provided pipelines with tools supporting strategy synthesis for hybrid systems, such as SCOTS and Uppaal Stratego. We present dtControl 2.0, a new version with several fundamentally novel features. Most importantly, the user can now provide domain knowledge to be exploited in the decision tree learning process and can also interactively steer the process based on the dynamically provided information. To this end, we also provide a graphical user interface. It allows for inspection and re-computation of parts of the result, suggesting as well as receiving advice on predicates, and visual simulation of the decision-making process. Besides, we interface model checkers of probabilistic systems, namely Storm and PRISM and provide dedicated support for categorical enumeration-type state variables. Consequently, the controllers are more explainable and smaller.

controller, impurity measure, predicate, (14 more...)

arXiv.org Artificial Intelligence

2101.07202

Country:

North America > Cuba (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Europe > Germany > Lower Saxony > Oldenburg (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Hyperboost: Hyperparameter Optimization by Gradient Boosting surrogate models

van Hoof, Jeroen, Vanschoren, Joaquin

arXiv.org Machine LearningJan-6-2021

Bayesian Optimization is a popular tool for tuning algorithms in automatic machine learning (AutoML) systems. Current state-of-the-art methods leverage Random Forests or Gaussian processes to build a surrogate model that predicts algorithm performance given a certain set of hyperparameter settings. In this paper, we propose a new surrogate model based on gradient boosting, where we use quantile regression to provide optimistic estimates of the performance of an unobserved hyperparameter setting, and combine this with a distance metric between unobserved and observed hyperparameter settings to help regulate exploration. We demonstrate empirically that the new method is able to outperform some state-of-the art techniques across a reasonable sized set of classification problems.

algorithm, configuration, optimization, (13 more...)

arXiv.org Machine Learning

2101.02289

Country:

North America > Canada > Alberta (0.14)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Pro Machine Learning Algorithms PDF

#artificialintelligenceJan-5-2021, 18:48:05 GMT

Bridge the gap between a high-level understanding of how an algorithm works and knowing the nuts and bolts to tune your models better. This book will give you confidence and skills when developing all the major machine learning models. In Pro Machine Learning Algorithms, you will first develop the algorithm in Excel so that you get a practical understanding of all the levers that can be tuned in a model, before implementing the models in Python/R. You will cover all the major algorithms: supervised and unsupervised learning, which include linear/logistic regression; k-means clustering; PCA; recommender system; decision tree; random forest; GBM; and neural networks. You will also be exposed to the latest in deep learning through CNNs, RNNs, and word2vec for text mining.

algorithm, learning, recommender system

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.62)

Add feedback

Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity

Shen, Jianghao, Wang, Sicheng, Wang, Zhangyang

arXiv.org Machine LearningDec-29-2020

Despite the latest prevailing success of deep neural networks (DNNs), several concerns have been raised against their usage, including the lack of intepretability the gap between DNNs and other well-established machine learning models, and the growingly expensive computational costs. A number of recent works [1], [2], [3] explored the alternative to sequentially stacking decision tree/random forest building blocks in a purely feed-forward way, with no need of back propagation. Since decision trees enjoy inherent reasoning transparency, such deep forest models can also facilitate the understanding of the internaldecision making process. This paper further extends the deep forest idea in several important aspects. Firstly, we employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.Besides enhancing the flexibility, it also enables non-greedy optimization for each tree. Second, we propose an innovative topology learning strategy: every node in the ree now maintains a new learnable hyperparameter indicating the probability that it will be a leaf node. In that way, the tree will jointly optimize both its parameters and the tree topology during training. Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3] , with dramatically reduced model complexity. For example,our model with only 1 layer of 15 trees can perform comparably with the model in [3] with 2 layers of 2000 trees each.

decision tree, node, probability, (13 more...)

arXiv.org Machine Learning

2012.14878

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

Wang, Kaiwen, Dick, Travis, Balcan, Maria-Florina

arXiv.org Machine LearningDec-19-2020

This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting (Balcan et al., 2012). We propose DP-TopDown, a general privacy preserving decision tree learning algorithm, and present two distributed implementations. Our first method NoisyCounts naturally extends the single machine algorithm by using the Laplace mechanism. Our second method LocalRNM significantly reduces communication and added noise by performing local optimization at each data holder. We provide the first utility guarantees for differentially private top-down decision tree learning in both the single machine and distributed settings. These guarantees show that the error of the privately-learned decision tree quickly goes to zero provided that the dataset is sufficiently large. Our extensive experiments on real datasets illustrate the trade-offs of privacy, accuracy and generalization when learning private decision trees in the distributed setting.

algorithm, decision tree, privatesplit, (16 more...)

arXiv.org Machine Learning

2012.10602

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback