AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Interactive Graphics for Visually Diagnosing Forest Classifiers in R

da Silva, Natalia, Cook, Dianne, Lee, Eun-Kyung

arXiv.org Machine LearningApr-8-2017

This paper describes structuring data and constructing plots to explore forest classification models interactively. A forest classifier is an example of an ensemble, produced by bagging multiple trees. The process of bagging and combining results from multiple trees, produces numerous diagnostics which, with interactive graphics, can provide a lot of insight into class structure in high dimensions. Various aspects are explored in this paper, to assess model complexity, individual model contributions, variable importance and dimension reduction, and uncertainty in prediction associated with individual observations. The ideas are applied to the random forest algorithm, and to the projection pursuit forest, but could be more broadly applied to other bagged ensembles. Interactive graphics are built in R, using the ggplot2, plotly, and shiny packages.

artificial intelligence, machine learning, node, (20 more...)

arXiv.org Machine Learning

1704.02502

Country: North America > United States (0.68)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)

Add feedback

Brief History of Machine Learning – Chatbot News Daily

#artificialintelligenceApr-6-2017, 23:29:38 GMT

Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Res.

artificial intelligence, decision tree learning, machine learning, (19 more...)

#artificialintelligence

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Instructional Material (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Scalable Bayesian Rule Lists

Yang, Hongyu, Rudin, Cynthia, Seltzer, Margo

arXiv.org Artificial IntelligenceApr-3-2017

We present an algorithm for building probabilistic rule lists that is two orders of magnitude faster than previous work. Rule list algorithms are competitors for decision tree algorithms. They are associative classifiers, in that they are built from pre-mined association rules. They have a logical structure that is a sequence of IF-THEN rules, identical to a decision list or one-sided decision tree. Instead of using greedy splitting and pruning like decision tree algorithms, we fully optimize over rule lists, striking a practical balance between accuracy, interpretability, and computational speed. The algorithm presented here uses a mixture of theoretical bounds (tight enough to have practical implications as a screening or bounding procedure), computational reuse, and highly tuned language libraries to achieve computational efficiency. Currently, for many practical problems, this method achieves better accuracy and sparsity than decision trees; further, in many cases, the computational time is practical and often less than that of decision trees. The result is a probabilistic classifier (which estimates P(y = 1|x) for each x) that optimizes the posterior of a Bayesian hierarchical model over rule lists.

artificial intelligence, machine learning, rule list, (17 more...)

arXiv.org Artificial Intelligence

1602.0861

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
(2 more...)

Add feedback

On the Reliable Detection of Concept Drift from Streaming Unlabeled Data

Sethi, Tegjyot Singh, Kantardzic, Mehmed

arXiv.org Machine LearningMar-31-2017

Classifiers deployed in the real world operate in a dynamic environment, where the data distribution can change over time. These changes, referred to as concept drift, can cause the predictive performance of the classifier to drop over time, thereby making it obsolete. To be of any real use, these classifiers need to detect drifts and be able to adapt to them, over time. Detecting drifts has traditionally been approached as a supervised task, with labeled data constantly being used for validating the learned model. Although effective in detecting drifts, these techniques are impractical, as labeling is a difficult, costly and time consuming activity. On the other hand, unsupervised change detection techniques are unreliable, as they produce a large number of false alarms. The inefficacy of the unsupervised techniques stems from the exclusion of the characteristics of the learned classifier, from the detection process. In this paper, we propose the Margin Density Drift Detection (MD3) algorithm, which tracks the number of samples in the uncertainty region of a classifier, as a metric to detect drift. The MD3 algorithm is a distribution independent, application independent, model independent, unsupervised and incremental algorithm for reliably detecting drifts from data streams. Experimental evaluation on 6 drift induced datasets and 4 additional datasets from the cybersecurity domain demonstrates that the MD3 approach can reliably detect drifts, with significantly fewer false alarms compared to unsupervised feature based drift detectors. The reduced false alarms enables the signaling of drifts only when they are most likely to affect classification performance. As such, the MD3 approach leads to a detection scheme which is credible, label efficient and general in its applicability.

artificial intelligence, decision tree learning, machine learning, (19 more...)

arXiv.org Machine Learning

1704.00023

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)

Add feedback

Random Forests for Big Data

Genuer, Robin, Poggi, Jean-Michel, Tuleau-Malot, Christine, Villa-Vialaneix, Nathalie

arXiv.org Machine LearningMar-22-2017

Big Data is one of the major challenges of statistical science and has numerous consequences from algorithmic and theoretical viewpoints. Big Data always involve massive data but they also often include online data and data heterogeneity. Recently some statistical methods have been adapted to process Big Data, like linear regression models, clustering methods and bootstrapping schemes. Based on decision trees combined with aggregation and bootstrap ideas, random forests were introduced by Breiman in 2001. They are a powerful nonparametric statistical method allowing to consider in a single and versatile framework regression problems, as well as two-class and multi-class classification problems. Focusing on classification problems, this paper proposes a selective review of available proposals that deal with scaling random forests to Big Data problems. These proposals rely on parallel environments or on online adaptations of random forests. We also describe how related quantities -- such as out-of-bag error and variable importance -- are addressed in these methods. Then, we formulate various remarks for random forests in the Big Data context. Finally, we experiment five variants on two massive datasets (15 and 120 millions of observations), a simulated one as well as real world data. One variant relies on subsampling while three others are related to parallel implementations of random forests and involve either various adaptations of bootstrap to Big Data or to "divide-and-conquer" approaches. The fifth variant relates on online learning of random forests. These numerical experiments lead to highlight the relative performance of the different variants, as well as some of their limitations.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1511.08327

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

Add feedback

An investigation into machine learning approaches for forecasting spatio-temporal demand in ride-hailing service

Saadi, Ismaïl, Wong, Melvin, Farooq, Bilal, Teller, Jacques, Cools, Mario

arXiv.org Machine LearningMar-7-2017

We propose the spatiotemporal estimation of the demand that is a function of variable effects related to traffic, pricing and weather conditions. With respect to the methodology, a single decision tree, bootstrap-aggregated (bagged) decision trees, random forest, boosted decision trees, and artificial neural network for regression have been adapted and systematically compared using various statistics, e.g. R-square, Root Mean Square Error (RMSE), and slope. To better assess the quality of the models, they have been tested on a real case study using the data of DiDi Chuxing, the main on-demand ride-hailing service provider in China. In the current study, 199,584 time-slots describing the spatiotemporal ride-hailing demand has been extracted with an aggregated-time interval of 10 mins. All the methods are trained and validated on the basis of two independent samples from this dataset. The results revealed that boosted decision trees provide the best prediction accuracy (RMSE 16.41), while avoiding the risk of over-fitting, followed by artificial neural network (20.09), random forest (23.50), bagged decision trees (24.29) and single decision tree (33.55).

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Machine Learning

1703.02433

Country:

North America > Canada (0.28)
Europe > Belgium (0.28)
Asia > China (0.25)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Structural Data Recognition with Graph Model Boosting

Miyazaki, Tomo, Omachi, Shinichiro

arXiv.org Machine LearningMar-7-2017

This paper presents a novel method for structural data recognition using a large number of graph models. In general, prevalent methods for structural data recognition have two shortcomings: 1) Only a single model is used to capture structural variation. 2) Naive recognition methods are used, such as the nearest neighbor method. In this paper, we propose strengthening the recognition performance of these models as well as their ability to capture structural variation. The proposed method constructs a large number of graph models and trains decision trees using the models. This paper makes two main contributions. The first is a novel graph model that can quickly perform calculations, which allows us to construct several models in a feasible amount of time. The second contribution is a novel approach to structural data recognition: graph model boosting. Comprehensive structural variations can be captured with a large number of graph models constructed in a boosting framework, and a sophisticated classifier can be formed by aggregating the decision trees. Consequently, we can carry out structural data recognition with powerful recognition capability in the face of comprehensive structural variation. The experiments shows that the proposed method achieves impressive results and outperforms existing methods on datasets of IAM graph database repository.

artificial intelligence, decision tree learning, machine learning, (15 more...)

arXiv.org Machine Learning

1703.02662

Country: Asia > Japan (0.15)

Genre: Research Report > Promising Solution (0.54)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)

Add feedback

Neural Decision Trees

Balestriero, Randall

arXiv.org Machine LearningMar-6-2017

In this paper we propose a synergistic melting of neural networks and decision trees (DT) we call neural decision trees (NDT). NDT is an architecture a la decision tree where each splitting node is an independent multilayer perceptron allowing oblique decision functions or arbritrary nonlinear decision function if more than one layer is used. This way, each MLP can be seen as a node of the tree. We then show that with the weight sharing asumption among those units, we end up with a Hashing Neural Network (HNN) which is a multilayer perceptron with sigmoid activation function for the last layer as opposed to the standard softmax. The output units then jointly represent the probability to be in a particular region. The proposed framework allows for global optimization as opposed to greedy in DT and differentiability w.r.t. all parameters and the input, allowing easy integration in any learnable pipeline, for example after CNNs for computer vision tasks. We also demonstrate the modeling power of HNN allowing to learn union of disjoint regions for final clustering or classification making it more general and powerful than standard softmax MLP requiring linear separability thus reducing the need on the inner layer to perform complex data transformations. We finally show experiments for supervised, semi-suppervised and unsupervised tasks and compare results with standard DTs and MLPs.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Machine Learning

1702.0736

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Indoor Localization by Fusing a Group of Fingerprints Based on Random Forests

Guo, Xiansheng, Ansari, Nirwan, Li, Huiyong

arXiv.org Machine LearningMar-6-2017

Indoor localization based on SIngle Of Fingerprint (SIOF) is rather susceptible to the changing environment, multipath, and non-line-of-sight (NLOS) propagation. Building SIOF is also a very time-consuming process. Recently, we first proposed a GrOup Of Fingerprints (GOOF) to improve the localization accuracy and reduce the burden of building fingerprints. However, the main drawback is the timeliness. In this paper, we propose a novel localization framework by Fusing A Group Of fingerprinTs (FAGOT) based on random forests. In the offline phase, we first build a GOOF from different transformations of the received signals of multiple antennas. Then, we design multiple GOOF strong classifiers based on Random Forests (GOOF-RF) by training each fingerprint in the GOOF. In the online phase, we input the corresponding transformations of the real measurements into these strong classifiers to obtain multiple independent decisions. Finally, we propose a Sliding Window aIded Mode-based (SWIM) fusion algorithm to balance the localization accuracy and time. Our proposed approaches can work better in an unknown indoor scenario. The burden of building fingerprints can also be reduced drastically. We demonstrate the performance of our algorithms through simulations and real experimental data using two Universal Software Radio Peripheral (USRP) platforms.

artificial intelligence, decision tree learning, machine learning, (20 more...)

arXiv.org Machine Learning

1703.02185

Genre: Research Report (0.82)

Industry: Health & Medicine > Health Care Technology > Telehealth (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.84)

Add feedback

Day05: Recognition with random forest

#artificialintelligenceMar-2-2017, 07:51:22 GMT

Day05 brings another Kaggle competition, "Digit Recognizer". The goal in this competition is to take an image of a handwritten single digit, and determine what that digit is. The Jupyter Notebook for this little project is found here. Today, I think I'll use a random forest classifier. A random forest classifier is like an election where the outcome with the most votes (from the ensemble of decision trees) is the predicted classification.

artificial intelligence, machine learning, random forest, (6 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.95)

Add feedback