AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
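
As a minimal sketch of the quoted definition, the Python snippet below encodes a decision tree as nested branching tests and evaluates it on an object described by a set of properties. The property names (`raining`, `has_umbrella`) and the tuple encoding are hypothetical, chosen only for illustration; they are not taken from the book.

```python
from itertools import product

# Hypothetical encoding: a leaf is a bool; an internal node is a tuple
# (property_name, subtree_if_false, subtree_if_true).
TREE = ("raining",
        True,                            # not raining: decide "yes"
        ("has_umbrella", False, True))   # raining: "yes" only with an umbrella


def decide(tree, example):
    """Walk the tree, testing one property per node, until a leaf is reached."""
    while not isinstance(tree, bool):
        prop, if_false, if_true = tree
        tree = if_true if example[prop] else if_false
    return tree


def truth_table(tree, props):
    """Enumerate every input to expose the Boolean function the tree represents."""
    table = {}
    for values in product([False, True], repeat=len(props)):
        table[values] = decide(tree, dict(zip(props, values)))
    return table


table = truth_table(TREE, ["raining", "has_umbrella"])
```

Enumerating the truth table shows that this particular tree represents the Boolean function `(not raining) or has_umbrella`.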

Mondrian Forests: Efficient Online Random Forests

Lakshminarayanan, Balaji, Roy, Daniel M., Teh, Yee Whye

Neural Information Processing Systems, Feb-14-2020, 11:55:47 GMT

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online methods are now in greater demand. Existing online random forests, however, require more training data than their batch counterparts to achieve comparable predictive performance.

decision tree, efficient online random forest, mondrian forest, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Pruning Random Forests for Prediction on a Budget

Nan, Feng, Wang, Joseph, Saligrama, Venkatesh

Neural Information Processing Systems, Feb-14-2020, 11:12:41 GMT

We propose to prune a random forest (RF) for resource-constrained prediction. We first construct an RF and then prune it to optimize expected feature cost and accuracy. We pose pruning RFs as a novel 0-1 integer program with linear constraints that encourages feature re-use. We establish total unimodularity of the constraint set to prove that the corresponding LP relaxation solves the original integer program. We then exploit connections to combinatorial optimization and develop an efficient primal-dual algorithm, scalable to large datasets.

integer program, prediction, pruning random forest, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)

Efficient Non-greedy Optimization of Decision Trees

Norouzi, Mohammad, Collins, Maxwell, Johnson, Matthew A., Fleet, David J., Kohli, Pushmeet

Neural Information Processing Systems, Feb-14-2020, 09:57:54 GMT

Decision trees and randomized forests are widely used in computer vision and machine learning. Standard induction algorithms build a tree greedily, optimizing the split function at one node at a time, and this greedy procedure often leads to suboptimal trees. In this paper, we present an algorithm for optimizing the split functions at all levels of the tree jointly with the leaf parameters, based on a global objective. We show that the problem of finding optimal linear-combination (oblique) splits for decision trees is related to structured prediction with latent variables, and we formulate a convex-concave upper bound on the tree's empirical loss. Computing the gradient of the proposed surrogate objective with respect to each training exemplar is O(d^2), where d is the tree depth, and thus training deep trees is feasible.

algorithm, decision tree, efficient non-greedy optimization, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

A Communication-Efficient Parallel Algorithm for Decision Tree

Meng, Qi, Ke, Guolin, Wang, Taifeng, Chen, Wei, Ye, Qiwei, Ma, Zhi-Ming, Liu, Tie-Yan

Neural Information Processing Systems, Feb-14-2020, 08:14:29 GMT

Decision trees (and extensions such as Gradient Boosting Decision Trees and Random Forest) are widely used in machine learning because of their practical effectiveness and model interpretability. With the emergence of big data, there is an increasing need to parallelize the training of decision trees. However, most existing attempts along this line suffer from high communication costs. In this paper, we propose a new algorithm, called Parallel Voting Decision Tree (PV-Tree), to tackle this challenge. After partitioning the training data onto a number of machines (e.g., M), this algorithm performs both local voting and global voting in each iteration. For local voting, the top-k attributes are selected from each machine according to its local data.
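
The two voting stages can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the excerpt above describes only the local step, so the global step is assumed here to be a plain majority vote over the local nominations, and the function names, the `n_global` parameter, and the gain values are all hypothetical. The point of the sketch is that each machine sends only k attribute indices rather than its full statistics.

```python
from collections import Counter


def local_vote(gains, k):
    """Return the k attribute indices with the largest local gain on one machine."""
    return sorted(range(len(gains)), key=lambda a: gains[a], reverse=True)[:k]


def global_vote(local_candidates, n_global):
    """Assumed global step: keep the attributes nominated by the most machines."""
    counts = Counter(a for cands in local_candidates for a in cands)
    # Sort by vote count (descending), then by attribute index for determinism.
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    return ranked[:n_global]


# Hypothetical per-machine gains for 5 attributes on M = 3 machines.
machine_gains = [
    [0.10, 0.40, 0.05, 0.30, 0.02],
    [0.12, 0.35, 0.33, 0.01, 0.02],
    [0.30, 0.38, 0.04, 0.31, 0.03],
]
k = 2
candidates = [local_vote(g, k) for g in machine_gains]
selected = global_vote(candidates, n_global=2)
```

With these made-up gains, attribute 1 is nominated by all three machines and attribute 3 by two of them, so `selected` is `[1, 3]`.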

communication cost, communication-efficient parallel algorithm, decision tree, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Alternating optimization of decision trees, with application to learning sparse oblique trees

Carreira-Perpinan, Miguel A., Tavallali, Pooya

Neural Information Processing Systems, Feb-14-2020, 07:42:40 GMT

Learning a decision tree from data is a difficult optimization problem. The most widespread algorithm in practice, dating to the 1980s, is based on a greedy growth of the tree structure by recursively splitting nodes, and possibly pruning back the final tree. The parameters (decision function) of an internal node are approximately estimated by minimizing an impurity measure. We give an algorithm that, given an input tree (its structure and the parameter values at its nodes), produces a new tree with the same or smaller structure but new parameter values that provably lower or leave unchanged the misclassification error. This can be applied to both axis-aligned and oblique trees and our experiments show it consistently outperforms various other algorithms while being highly scalable to large datasets and trees.
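
For context, the greedy step that the abstract contrasts against (choosing a node's axis-aligned split by minimizing an impurity measure) might look like the sketch below. It assumes Gini impurity as the measure and midpoint thresholds; it illustrates the classical baseline only and is not the alternating optimization algorithm the paper proposes. The toy data is hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_axis_aligned_split(X, y):
    """Return (feature, threshold, weighted_impurity) minimizing Gini impurity."""
    n = len(y)
    best = (None, None, gini(y))  # no split unless it lowers the impurity
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [0, 0, 1, 1]
feature, threshold, impurity = best_axis_aligned_split(X, y)
```

A greedy learner would apply this search recursively to the two resulting subsets, never revisiting earlier splits.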

alternating optimization, decision tree, sparse oblique tree, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.65)

Variable Importance Using Decision Trees

Kazemitabar, Jalil, Amini, Arash, Bloniarz, Adam, Talwalkar, Ameet S.

Neural Information Processing Systems, Feb-14-2020, 05:42:35 GMT

Decision trees and random forests are well established models that not only offer good predictive performance, but also provide rich feature importance information. While practitioners often employ variable importance methods that rely on this impurity-based information, these methods remain poorly characterized from a theoretical perspective. We provide novel insights into the performance of these methods by deriving finite sample performance guarantees in a high-dimensional setting under various modeling assumptions. We further demonstrate the effectiveness of these impurity-based methods via an extensive set of simulations.
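
One simple form of impurity-based importance can be sketched as follows: score each feature by the largest impurity reduction a single threshold split on it achieves, then rank features by that score. This is an assumed, simplified scheme for illustration (variance as the impurity, one split per feature, hypothetical toy data); it is not claimed to be the exact method analyzed in the paper.

```python
def variance(values):
    """Population variance, used here as the impurity of a regression node."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def impurity_reduction(x, y):
    """Largest drop in weighted variance from one threshold split on feature x."""
    n = len(y)
    parent = variance(y)
    best = 0.0
    cuts = sorted(set(x))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        left = [y[i] for i in range(n) if x[i] <= t]
        right = [y[i] for i in range(n) if x[i] > t]
        child = (len(left) * variance(left) + len(right) * variance(right)) / n
        best = max(best, parent - child)
    return best


# Hypothetical toy data: the response depends on feature 0 only.
X = [[0, 3], [1, 1], [2, 4], [3, 0], [4, 2], [5, 5]]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
scores = [impurity_reduction([row[j] for row in X], y) for j in range(2)]
ranking = sorted(range(2), key=lambda j: scores[j], reverse=True)
```

On this data the informative feature 0 removes all of the variance with one split, while the noise feature 1 removes only a small part, so feature 0 ranks first.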

decision tree, information, variable importance

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.69)

Learning to rank for uplift modeling

Devriendt, Floris, Guns, Tias, Verbeke, Wouter

arXiv.org Machine Learning, Feb-14-2020

Uplift modeling has been used effectively in fields such as marketing and customer retention to target the customers who are most likely to respond because of the campaign or treatment. Uplift models produce uplift scores, which are then used essentially to create a ranking. We instead investigate learning to rank directly, exploring the potential of learning-to-rank techniques in the context of uplift modeling. We propose a unified formalisation of the different global uplift modeling measures in use today and explore how these can be integrated into the learning-to-rank framework. Additionally, we introduce a new learning-to-rank metric, the promoted cumulative gain (PCG), that focuses on optimizing the area under the uplift curve. We employ the learning-to-rank technique LambdaMART to optimize the ranking according to PCG and show improved results over standard learning-to-rank metrics and equal or improved results compared with state-of-the-art uplift modeling. Finally, we show how learning-to-rank models can learn to optimize a certain targeting depth; however, these results do not generalize to the test set.

cumulative incremental gain, cumulative percentage, uplift curve, (14 more...)

arXiv.org Machine Learning

2002.05897

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Belgium (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Marketing (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization

Garg, Vikas K., Kalai, Adam, Ligett, Katrina, Wu, Zhiwei Steven

arXiv.org Machine Learning, Feb-13-2020

Domain generalization is the problem of machine learning when the training data and the test data come from different data domains. We present a simple theoretical model of learning to generalize across domains in which there is a meta-distribution over data distributions, and those data distributions may even have different supports. In our model, the training data given to a learning algorithm consists of multiple datasets each from a single domain drawn in turn from the meta-distribution. We study this model in three different problem settings: a multi-domain Massart noise setting, a decision tree multi-dataset setting, and a feature selection setting. We find that computationally efficient, polynomial-sample domain generalization is possible in each. Experiments demonstrate that our feature selection algorithm indeed ignores spurious correlations and improves generalization.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2002.0566

Country:

North America > United States > Wyoming (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Texas (0.04)
(5 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Bagging and Random Forest for Imbalanced Classification

#artificialintelligence, Feb-12-2020, 06:05:36 GMT

Bagging is an ensemble algorithm that fits multiple models on different subsets of a training dataset, then combines the predictions from all models. Random forest is an extension of bagging that also randomly selects subsets of features used in each data sample. Both bagging and random forests have proven effective on a wide range of different predictive modeling problems. Although effective, they are not suited to classification problems with a skewed class distribution. Nevertheless, many modifications to the algorithms have been proposed that adapt their behavior and make them better suited to a severe class imbalance. In this tutorial, you will discover how to use bagging and random forest for imbalanced classification.
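
To make the idea concrete, here is a minimal standard-library sketch of bagging adapted to a skewed class distribution. It assumes one particular modification, drawing each bag so that both classes are equally represented (random undersampling of the majority class); that choice, the one-dimensional threshold "stump" used as the base model, and the toy data are all assumptions for illustration and are not code from the tutorial. Label 1 is assumed to be the minority class.

```python
import random


def fit_stump(xs, ys):
    """Pick the threshold t with the fewest training errors for 'x > t -> 1'."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def balanced_bag(xs, ys, rng):
    """Bootstrap the minority class and draw an equal number of majority points."""
    minority = [i for i, y in enumerate(ys) if y == 1]
    majority = [i for i, y in enumerate(ys) if y == 0]
    n = len(minority)
    idx = rng.choices(minority, k=n) + rng.choices(majority, k=n)
    return [xs[i] for i in idx], [ys[i] for i in idx]


def bagged_predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return int(votes * 2 > len(thresholds))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical imbalanced data: 12 majority points (class 0), 3 minority (class 1).
xs = [float(v) for v in range(12)] + [20.0, 21.0, 22.0]
ys = [0] * 12 + [1] * 3
thresholds = [fit_stump(*balanced_bag(xs, ys, rng)) for _ in range(11)]
```

Each bag contains both classes by construction, so every stump places its threshold between the two groups, and `bagged_predict(thresholds, 21.0)` returns 1 while `bagged_predict(thresholds, 5.0)` returns 0.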

algorithm, dataset, majority class, (14 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

dtControl: Decision Tree Learning Algorithms for Controller Representation

Ashok, Pranav, Jackermeier, Mathias, Jagtap, Pushpak, Křetínský, Jan, Weininger, Maximilian, Zamani, Majid

arXiv.org Artificial Intelligence, Feb-12-2020

Decision tree learning is a popular classification technique most commonly used in machine learning applications. Recent work has shown that decision trees can be used to represent provably-correct controllers concisely. Compared to representations using lookup tables or binary decision diagrams, decision trees are smaller and more explainable. We present dtControl, an easily extensible tool for representing memoryless controllers as decision trees. We give a comprehensive evaluation of various decision tree learning algorithms applied to 10 case studies arising out of correct-by-construction controller synthesis. These algorithms include two new techniques: one for using arbitrary linear binary classifiers in decision tree learning, and a novel approach for determinizing controllers during decision tree construction. In particular, the latter turns out to be extremely efficient, yielding decision trees with a single-digit number of decision nodes on 5 of the case studies.

artificial intelligence, controller, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2002.04991

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > United States > Florida > Miami-Dade County > Miami Beach (0.04)
North America > United States > Colorado > Boulder County > Boulder (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)