AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

End-to-end Learning of Deterministic Decision Trees

Hehn, Thomas, Hamprecht, Fred A.

arXiv.org Machine Learning, Dec-7-2017

Conventional decision trees have a number of favorable properties, including interpretability, a small computational footprint and the ability to learn from little training data. However, they lack a key quality that has helped fuel the deep learning revolution: that of being end-to-end trainable and learning from scratch the features that best solve a given supervised learning problem. Recent work (Kontschieder 2015) has addressed this deficit, but at the cost of losing a main attractive trait of decision trees: the fact that each sample is routed along a small subset of tree nodes only. Here we propose a model and an Expectation-Maximization training scheme for decision trees that are fully probabilistic at train time but, after a deterministic annealing process, become deterministic at test time. We also analyze the learned oblique split parameters on image datasets and show that neural networks can be trained at each split node. In summary, we present the first end-to-end learning scheme for deterministic decision trees and present results on par with or superior to published standard oblique decision tree algorithms.

artificial intelligence, decision tree, machine learning, (21 more...)

arXiv.org Machine Learning

1712.02743

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Gini-regularized Optimal Transport with an Application to Spatio-Temporal Forecasting

Roberts, Lucas, Razoumov, Leo, Su, Lin, Wang, Yuyang

arXiv.org Machine Learning, Dec-7-2017

Rapidly growing product lines and services require a finer-granularity forecast that considers geographic locales. However, the open question remains: how should the quality of a spatio-temporal forecast be assessed? In this manuscript we introduce a metric to evaluate spatio-temporal forecasts. This metric is based on an Optimal Transport (OT) problem. The metric we propose is a constrained OT objective function using the Gini impurity function as a regularizer. We demonstrate through computer experiments both the qualitative and the quantitative characteristics of the Gini regularized OT problem. Moreover, we show that the Gini regularized OT problem converges to the classical OT problem when the Gini regularized problem is considered as a function of λ, the regularization parameter. The convergence to the classical OT solution is faster than the state-of-the-art Entropic-regularized OT [Cuturi, 2013] and results in a numerically more stable algorithm.
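For orientation, the two standard ingredients named in this abstract can be written down as follows. This is only a sketch of the textbook definitions; the symbols (cost matrix C, transport plan P, marginals a and b, probability vector p) are assumptions of this note, and the excerpt does not state how the paper combines the Gini term with the transport objective or how it is scaled by λ.

```latex
% Classical (unregularized) optimal transport between marginals a and b:
\min_{P \in U(a,b)} \; \langle C, P \rangle
  = \min_{P \in U(a,b)} \sum_{i,j} C_{ij} P_{ij},
\qquad
U(a,b) = \{ P \ge 0 : P\mathbf{1} = a,\; P^{\top}\mathbf{1} = b \}.

% Gini impurity of a probability vector p (the usual decision tree definition):
G(p) = \sum_{k} p_k (1 - p_k) = 1 - \sum_{k} p_k^{2}.

% Shannon entropy, the quantity used by entropic regularization, for comparison:
H(p) = -\sum_{k} p_k \log p_k .
```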

artificial intelligence, machine learning, modeling & simulation, (20 more...)

arXiv.org Machine Learning

1712.02512

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Kernel clustering: density biases and solutions

Marin, Dmitrii, Tang, Meng, Ayed, Ismail Ben, Boykov, Yuri

arXiv.org Machine Learning, Dec-6-2017

Kernel methods are popular in clustering due to their generality and discriminating power. However, we show that many kernel clustering criteria have density biases theoretically explaining some practically significant artifacts empirically observed in the past. For example, we provide conditions and formally prove the density mode isolation bias in kernel K-means for a common class of kernels. We call it Breiman's bias due to its similarity to the histogram mode isolation previously discovered by Breiman in decision tree learning with Gini impurity. We also extend our analysis to other popular kernel clustering methods, e.g. average/normalized cut or dominant sets, where density biases can take different forms. For example, splitting isolated points by cut-based criteria is essentially the sparsest subset bias, which is the opposite of the density mode bias. Our findings suggest that a principled solution for density biases in kernel clustering should directly address data inhomogeneity. We show that density equalization can be implicitly achieved using either locally adaptive weights or locally adaptive kernels. Moreover, density equalization makes many popular kernel clustering objectives equivalent. Our synthetic and real data experiments illustrate density biases and proposed solutions. We anticipate that theoretical understanding of kernel clustering limitations and their principled solutions will be important for a broad spectrum of data analysis applications across the disciplines.

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2017.2780166

1705.0595

Country: North America > Canada > Quebec (0.28)

Genre: Research Report > New Finding (0.54)

Industry:

Education > Educational Setting > Higher Education (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Decision Tree: Power BI- Part 2

@machinelearnbot, Dec-2-2017, 10:10:17 GMT

In the last part, I talked about the main concepts behind the decision tree. In this post, I will show how to use the decision tree component in Power BI for predictive analysis in a report. A decision tree is able to handle both. There is a "Hello World" dataset in the data science world named "Titanic". This dataset has information about which passengers survived the disaster and which did not.

artificial intelligence, decision tree learning, machine learning, (8 more...)

@machinelearnbot

Industry: Transportation > Passenger (0.39)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Supervised Learning – Using Decision Trees to Classify Data

@machinelearnbot, Dec-1-2017, 03:10:06 GMT

One challenge of neural or deep architectures is that it is difficult to determine what exactly is going on in the machine learning algorithm that makes a classifier decide how to classify inputs. This is a huge problem in deep learning: we can get fantastic classification accuracies, but we don't really know what criteria a classifier uses to make its classification decision. However, decision trees can present us with a graphical representation of how the classifier reaches its decision. We'll be discussing the CART (Classification and Regression Trees) framework, which creates decision trees. First, we'll introduce the concept of decision trees, then we'll discuss each component of the CART framework to better understand how decision trees are generated. Before discussing decision trees, we should first get comfortable with trees, specifically binary trees.
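As a rough illustration of the kind of split search CART performs, the sketch below scores every candidate threshold on every feature by the weighted Gini impurity of the two resulting children and keeps the lowest. It is a minimal sketch, not the linked article's code: the function names `gini` and `best_split` and the toy data are hypothetical, and only numeric features with `<=` thresholds are handled.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Exhaustively search binary splits of the form rows[i][j] <= t.

    Returns (feature_index, threshold, weighted_gini) for the split with
    the lowest weighted child impurity, or None if no valid split exists.
    """
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        # Every observed value of feature j is a candidate threshold.
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Toy data (hypothetical): feature 0 separates the two classes at 3.0.
rows = [[2.0, 1.0], [3.0, 1.0], [10.0, 0.0], [11.0, 0.0]]
labels = ["a", "a", "b", "b"]
print(best_split(rows, labels))
```

A full tree would apply `best_split` recursively to each child until a stopping rule is met (for example, a pure node or a maximum depth); that recursion is omitted here.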

artificial intelligence, decision tree learning, machine learning, (18 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Controlling machine-learning algorithms and their biases

#artificialintelligence, Nov-30-2017, 10:00:30 GMT

Myths aside, artificial intelligence is as prone to bias as the human kind. The good news is that the biases in algorithms can also be diagnosed and treated. Companies are moving quickly to apply machine learning to business decision making. New programs are constantly being launched, setting complex algorithms to work on large, frequently refreshed data sets. The speed at which this is taking place attests to the attractiveness of the technology, but the lack of experience creates real risks. Algorithmic bias is one of the biggest risks because it compromises the very purpose of machine learning. This often-overlooked defect can trigger costly errors and, left unchecked, can pull projects and organizations in entirely wrong directions.

algorithm, artificial intelligence, machine learning, (16 more...)

#artificialintelligence

Industry:

Banking & Finance (1.00)
Health & Medicine > Consumer Health (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

Who wins the Miss Contest for Imputation Methods? Our Vote for Miss BooPF

Ramosaj, Burim, Pauly, Markus

arXiv.org Machine Learning, Nov-30-2017

Missing data is an expected issue when large amounts of data are collected, and several imputation techniques have been proposed to tackle this problem. Besides classical approaches such as MICE, the application of Machine Learning techniques is tempting. Here, the recently proposed missForest imputation method has shown high imputation accuracy under the Missing (Completely) at Random scheme with various missing rates. At its core, it is based on a random forest for classification and regression, respectively. In this paper we study whether this approach can be enhanced further by other methods such as the stochastic gradient tree boosting method, the C5.0 algorithm or modified random forest procedures. In particular, other resampling strategies within the random forest protocol are suggested. In an extensive simulation study, we analyze their performance for continuous, categorical as well as mixed-type data. Therein, MissBooPF, a combination of the stochastic gradient tree boosting method together with the parametrically bootstrapped random forest method, appeared to be promising. Finally, an empirical analysis focusing on credit information and Facebook data is conducted.

artificial intelligence, machine learning, stochastic gradient tree, (16 more...)

arXiv.org Machine Learning

1711.11394

Country: Europe (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

Learning Certifiably Optimal Rule Lists for Categorical Data

Angelino, Elaine, Larus-Stone, Nicholas, Alabi, Daniel, Seltzer, Margo, Rudin, Cynthia

arXiv.org Machine Learning, Nov-30-2017

We present the design and implementation of a custom discrete optimization technique for building rule lists over a categorical feature space. Our algorithm produces rule lists with optimal training performance, according to the regularized empirical risk, with a certificate of optimality. By leveraging algorithmic bounds, efficient data structures, and computational reuse, we achieve several orders of magnitude speedup in time and a massive reduction of memory consumption. We demonstrate that our approach produces optimal rule lists on practical problems in seconds. Our results indicate that it is possible to construct optimal sparse rule lists that are approximately as accurate as the COMPAS proprietary risk prediction tool on data from Broward County, Florida, but that are completely interpretable. This framework is a novel alternative to CART and other decision tree methods for interpretable modeling.
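To make the term concrete, a rule list can be read as an ordered chain of if/else-if rules over categorical features, ending in a default prediction. The sketch below evaluates such a list and scores it with misclassification rate plus a penalty of `lam` per rule. That objective form is an assumption about what "regularized empirical risk" means here, since the excerpt does not define it, and the features, rules, data and function names are all hypothetical. It shows only evaluation, not the paper's optimization technique.

```python
def predict(rule_list, default, x):
    """Return the label of the first rule whose condition matches x."""
    for condition, label in rule_list:
        if condition(x):
            return label
    return default  # no rule fired


def regularized_risk(rule_list, default, samples, lam):
    """Misclassification rate plus lam times the number of rules (assumed form)."""
    errors = sum(predict(rule_list, default, x) != y for x, y in samples)
    return errors / len(samples) + lam * len(rule_list)


# Hypothetical rule list over two categorical features.
rules = [
    (lambda x: x["color"] == "red", 1),   # if color is red, predict 1
    (lambda x: x["size"] == "large", 1),  # else if size is large, predict 1
]
default_label = 0                         # else predict 0

samples = [
    ({"color": "red", "size": "small"}, 1),
    ({"color": "blue", "size": "large"}, 1),
    ({"color": "blue", "size": "small"}, 0),
    ({"color": "green", "size": "small"}, 1),  # misclassified by the list above
]

print(regularized_risk(rules, default_label, samples, lam=0.01))
```

Because the rules are checked in order, the list stays readable as a single chain of conditions, which is the interpretability property the abstract emphasizes.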

artificial intelligence, machine learning, rule list, (21 more...)

arXiv.org Machine Learning

1704.01701

Country:

North America > United States > California (0.45)
North America > United States > Florida > Broward County (0.24)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.45)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government (0.68)
Law > Criminal Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

Forest-based methods and ensemble model output statistics for rainfall ensemble forecasting

Taillardat, Maxime, Fougères, Anne-Laure, Naveau, Philippe, Mestre, Olivier

arXiv.org Machine Learning, Nov-29-2017

Rainfall ensemble forecasts have to be skillful for both low precipitation and extreme events. We present statistical post-processing methods based on Quantile Regression Forests (QRF) and Gradient Forests (GF) with a parametric extension for heavy-tailed distributions. Our goal is to improve ensemble quality for all types of precipitation events, heavy-tailed included, subject to a good overall performance. Our proposed hybrid methods are applied to daily 51-h forecasts of 6-h accumulated precipitation from 2012 to 2015 over France using the Météo-France ensemble prediction system called PEARP. They provide calibrated predictive distributions and compete favourably with state-of-the-art methods like the Analogs method or Ensemble Model Output Statistics. In particular, hybrid forest-based procedures appear to bring an added value to the forecast of heavy rainfall.

artificial intelligence, decision tree learning, machine learning, (14 more...)

arXiv.org Machine Learning

1711.10937

Country: Europe > France (0.55)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Decision Trees -- OpenCV 2.4.13.4 documentation

#artificialintelligence, Nov-26-2017, 23:45:15 GMT

To reach a leaf node and to obtain a response for the input feature vector, the prediction procedure starts with the root node. From each non-leaf node the procedure goes to the left (selects the left child node as the next observed node) or to the right based on the value of a certain variable whose index is stored in the observed node. So, in each node, a pair of entities (variable_index, decision_rule (threshold/subset)) is used. This pair is called a split (split on the variable variable_index). Once a leaf node is reached, the value assigned to this node is used as the output of the prediction procedure.
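The prediction procedure described above can be sketched in a few lines of plain Python. This is not the OpenCV API: the `Node` class and `predict` function are hypothetical, only ordered (threshold) splits are shown, and the convention that values below the threshold go left is an assumption of the sketch. Subset rules for categorical variables are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Leaf nodes carry a value; split nodes carry (variable_index, threshold).
    value: Optional[float] = None
    variable_index: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def predict(root: Node, x: Sequence[float]) -> Optional[float]:
    """Walk from the root to a leaf and return the value stored in that leaf."""
    node = root
    while not node.is_leaf():
        # Assumed convention: go left when the tested variable is below the threshold.
        if x[node.variable_index] < node.threshold:
            node = node.left
        else:
            node = node.right
    return node.value


# A small hand-built tree: split on variable 0, then on variable 1.
tree = Node(
    variable_index=0,
    threshold=5.0,
    left=Node(value=0.0),
    right=Node(
        variable_index=1,
        threshold=2.0,
        left=Node(value=1.0),
        right=Node(value=2.0),
    ),
)

print(predict(tree, [3.0, 9.0]))
print(predict(tree, [7.0, 1.0]))
print(predict(tree, [7.0, 3.0]))
```

Each prediction visits only the nodes on one root-to-leaf path, so its cost grows with the depth of the tree, not with the total number of nodes.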

artificial intelligence, machine learning, node, (6 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.45)