AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
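To make the quoted definition concrete, here is a minimal sketch of a decision tree as a Boolean function. The attribute names (`raining`, `have_umbrella`) and the tuple-based tree encoding are assumptions chosen for illustration; they do not come from the book.

```python
# A tree is either a leaf (a bool) or a pair (attribute, {value: subtree}).
# The attributes below are hypothetical and serve only as an illustration.
TREE = ("raining", {
    True: ("have_umbrella", {True: True, False: False}),
    False: True,
})


def decide(example, tree):
    """Follow one branching test per level until a yes/no leaf is reached."""
    while not isinstance(tree, bool):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree


print(decide({"raining": True, "have_umbrella": False}, TREE))
print(decide({"raining": False, "have_umbrella": False}, TREE))
```

Each path from the root to a leaf corresponds to one conjunction of property tests, which is why such a tree represents a Boolean function of its inputs.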

Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale

Abuzaid, Firas, Bradley, Joseph K., Liang, Feynman T., Feng, Andrew, Yang, Lee, Zaharia, Matei, Talwalkar, Ameet S.

Neural Information Processing Systems | Dec-31-2016

Deep distributed decision trees and tree ensembles have grown in importance due to the need to model increasingly large datasets. However, PLANET, the standard distributed tree learning algorithm implemented in systems such as XGBoost and Spark MLlib, scales poorly as data dimensionality and tree depths grow. We present Yggdrasil, a new distributed tree learning method that outperforms existing methods by up to 24x. Unlike PLANET, Yggdrasil is based on vertical partitioning of the data (i.e., partitioning by feature), along with a set of optimized data structures to reduce the CPU and communication costs of training. Yggdrasil (1) trains directly on compressed data for compressible features and labels; (2) introduces efficient data structures for training on uncompressed data; and (3) minimizes communication between nodes by using sparse bitvectors. Moreover, while PLANET approximates split points through feature binning, Yggdrasil does not require binning, and we analytically characterize the impact of this approximation. We evaluate Yggdrasil against the MNIST 8M dataset and a high-dimensional dataset at Yahoo; for both, Yggdrasil is faster by up to an order of magnitude.
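The idea of partitioning by feature can be sketched in a few lines. This is a toy, single-process illustration and not Yggdrasil's implementation: the function names, the Gini criterion, the binary 0/1 labels, and the dense Python list standing in for a sparse bitvector are all assumptions.

```python
def gini(labels):
    """Gini impurity for binary 0/1 labels (an assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_for_feature(column, labels):
    """Worker side: scan every distinct value of one feature, with no binning."""
    n = len(labels)
    best = (float("inf"), None)
    for threshold in sorted(set(column))[:-1]:
        left = [y for x, y in zip(column, labels) if x <= threshold]
        right = [y for x, y in zip(column, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[0]:
            best = (score, threshold)
    return best


def train_one_split(columns, labels):
    """Driver side: gather one candidate per feature and keep the best one."""
    candidates = {name: best_split_for_feature(col, labels)
                  for name, col in columns.items()}
    feature = min(candidates, key=lambda name: candidates[name][0])
    threshold = candidates[feature][1]
    if threshold is None:
        raise ValueError("no feature has two distinct values to split on")
    # Only this row-membership vector needs to be shared with other workers.
    bitvector = [x <= threshold for x in columns[feature]]
    return feature, threshold, bitvector


# Made-up data: each entry of `columns` plays the role of one worker's feature.
columns = {"f0": [1.0, 2.0, 3.0, 4.0], "f1": [5.0, 1.0, 4.0, 2.0]}
labels = [0, 0, 1, 1]
feature, threshold, bitvector = train_one_split(columns, labels)
print(feature, threshold, bitvector)
```

Because each worker sees a whole column, it can evaluate exact thresholds, and the only per-split message that depends on the number of rows is the membership vector of the winning split.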

artificial intelligence, machine learning, yggdrasil, (17 more...)

Country: Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Pruning Random Forests for Prediction on a Budget

Nan, Feng, Wang, Joseph, Saligrama, Venkatesh

Neural Information Processing Systems | Dec-31-2016

We propose to prune a random forest (RF) for resource-constrained prediction. We first construct an RF and then prune it to optimize expected feature cost & accuracy. We pose pruning RFs as a novel 0-1 integer program with linear constraints that encourages feature re-use. We establish total unimodularity of the constraint set to prove that the corresponding LP relaxation solves the original integer program. We then exploit connections to combinatorial optimization and develop an efficient primal-dual algorithm, scalable to large datasets. In contrast to our bottom-up approach, which benefits from good RF initialization, conventional methods are top-down, acquiring features based on their utility value, and are generally intractable, requiring heuristics. Empirically, our pruning algorithm outperforms existing state-of-the-art resource-constrained algorithms.

artificial intelligence, constraint, machine learning, (19 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.85)

A Communication-Efficient Parallel Algorithm for Decision Tree

Meng, Qi, Ke, Guolin, Wang, Taifeng, Chen, Wei, Ye, Qiwei, Ma, Zhi-Ming, Liu, Tie-Yan

Neural Information Processing Systems | Dec-31-2016

Decision tree (and its extensions such as Gradient Boosting Decision Trees and Random Forest) is a widely used machine learning algorithm, due to its practical effectiveness and model interpretability. With the emergence of big data, there is an increasing need to parallelize the training process of decision trees. However, most existing attempts along this line suffer from high communication costs. In this paper, we propose a new algorithm, called Parallel Voting Decision Tree (PV-Tree), to tackle this challenge. After partitioning the training data onto a number of (e.g., M) machines, this algorithm performs both local voting and global voting in each iteration. For local voting, the top-k attributes are selected from each machine according to its local data. Then, the indices of these top attributes are aggregated by a server, and the globally top-2k attributes are determined by a majority voting among these local candidates. Finally, the full-grained histograms of the globally top-2k attributes are collected from local machines in order to identify the best (most informative) attribute and its split point. PV-Tree can achieve a very low communication cost (independent of the total number of attributes) and thus can scale out very well. Furthermore, theoretical analysis shows that this algorithm can learn a near optimal decision tree, since it can find the best attribute with a large probability. Our experiments on real-world datasets show that PV-Tree significantly outperforms the existing parallel decision tree algorithms in the tradeoff between accuracy and efficiency.
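The two voting rounds described in the abstract can be sketched as follows. This is a simplified illustration, not the authors' code: the per-machine informativeness scores are made-up numbers, the function names are hypothetical, and summing local scores is only a stand-in for merging the full histograms of the surviving candidates.

```python
def local_top_k(local_scores, k):
    """Local voting: the k attribute indices that look best on one machine."""
    return sorted(local_scores, key=local_scores.get, reverse=True)[:k]


def global_top_2k(votes_per_machine, k):
    """Global voting: the 2k attributes nominated by the most machines."""
    counts = {}
    for votes in votes_per_machine:
        for attribute in votes:
            counts[attribute] = counts.get(attribute, 0) + 1
    # Ties are broken by attribute index so the result is deterministic.
    return sorted(counts, key=lambda a: (-counts[a], a))[:2 * k]


# Hypothetical scores for 5 attributes on M = 3 machines.
machines = [
    {0: 0.9, 1: 0.2, 2: 0.1, 3: 0.3, 4: 0.0},
    {0: 0.8, 1: 0.1, 2: 0.2, 3: 0.4, 4: 0.1},
    {0: 0.1, 1: 0.2, 2: 0.1, 3: 0.7, 4: 0.3},
]
k = 1
votes = [local_top_k(scores, k) for scores in machines]
candidates = global_top_2k(votes, k)

# Stand-in for collecting full histograms: aggregate only the candidates.
best = max(candidates, key=lambda a: sum(scores[a] for scores in machines))
print(votes, candidates, best)
```

Only attribute indices travel in the voting rounds, and detailed statistics are exchanged for at most 2k attributes, which is where the independence from the total number of attributes comes from.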

algorithm, artificial intelligence, machine learning, (19 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Winning Kaggle 101: Introduction to Stacking

#artificialintelligence | Dec-29-2016, 22:00:42 GMT

Random Forest)
• Used to ensemble a diverse group of strong learners
• Involves training a second-level machine learning algorithm called a "metalearner" to learn the optimal combination of the base learners

5. History of Stacking

Stacked Generalization
• David H. Wolpert, "Stacked Generalization" (1992)
• First formulation of stacking via a metalearner
• Blended Neural Networks

Stacked Regressions
• Leo Breiman, "Stacked Regressions" (1996)
• Modified algorithm to use CV to generate level-one data
• Blended Neural Networks and GLMs (separately)

Super Learning
• Mark van der Laan et al., "Super Learner" (2007)
• Provided the theory to prove that the Super Learner is the asymptotically optimal combination
• First R implementation in 2010

algorithm, artificial intelligence, machine learning, (12 more...)

Country: North America > United States > California > Santa Clara County > Mountain View (0.06)

Industry: Education > Educational Setting (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.38)

Introduction to Classification & Regression Trees (CART)

@machinelearnbot | Dec-29-2016, 11:10:06 GMT

Decision Trees are commonly used in data mining with the objective of creating a model that predicts the value of a target (or dependent) variable based on the values of several input (or independent) variables. In today's post, we discuss the CART decision tree methodology. The CART or Classification & Regression Trees methodology was introduced in 1984 by Leo Breiman, Jerome Friedman, Richard Olshen and Charles Stone as an umbrella term to refer to the following types of decision trees: Classification Trees: where the target variable is categorical and the tree is used to identify the "class" within which a target variable would likely fall. Regression Trees: where the target variable is continuous and the tree is used to predict its value. The CART algorithm is structured as a sequence of questions, the answers to which determine what the next question, if any, should be.
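As an illustration of the regression case, the sketch below fits a one-question tree (a "stump") to a continuous target. It is a simplified example rather than the full CART procedure: the squared-error criterion, the function names, and the data are assumptions, and the input is assumed to contain at least two distinct values of x.

```python
def sse(values):
    """Sum of squared errors around the mean of a group."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(xs, ys):
    """Pick the single question 'x <= t?' that minimizes total squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t, sum(left) / len(left), sum(right) / len(right))
    return best[1:]


def predict(stump, x):
    """Answer the question, then return the mean target of the chosen leaf."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Made-up training data with a continuous target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 12.0, 30.0, 32.0]
stump = fit_stump(xs, ys)
print(stump, predict(stump, 1.5), predict(stump, 3.5))
```

A full tree repeats the same search inside each resulting group, so that every answer determines which question is asked next. A classification tree follows the same pattern with an impurity measure in place of squared error and a class label in each leaf.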

artificial intelligence, decision tree learning, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

5 Machine Learning Research Studies To Understand & Predict Length of Stay in Hospitals

@machinelearnbot | Dec-28-2016, 21:40:02 GMT

Length of Stay (LOS) is a critical factor in managing hospital quality & economic outcomes in healthcare. The metric is calculated by summing the total number of days for all discharges & dividing it by the total number of discharges. Insurance programs such as Medicare are moving to a model where they compensate hospitals the same amount for a specific surgery (e.g., joint replacement) regardless of the number of days spent in the hospital. Therefore, hospitals & the overall healthcare ecosystem are motivated to reduce LOS.
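The LOS calculation described above amounts to a simple average. A minimal sketch, with a hypothetical function name and made-up stay lengths:

```python
def average_length_of_stay(days_per_discharge):
    """Total inpatient days for all discharges divided by the discharge count."""
    if not days_per_discharge:
        raise ValueError("at least one discharge is required")
    return sum(days_per_discharge) / len(days_per_discharge)


# Made-up example: four discharges with stays of 2, 3, 5 and 2 days.
stays = [2, 3, 5, 2]
print(average_length_of_stay(stays))  # 12 days / 4 discharges = 3.0
```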

artificial intelligence, hospital, machine learning, (16 more...)

Country: North America > United States > Pennsylvania (0.05)

Industry: Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.32)

Top Algorithms and Methods Used by Data Scientists

#artificialintelligence | Dec-28-2016, 15:15:26 GMT

The latest KDnuggets poll identifies the top algorithms actually used by Data Scientists and finds surprises, including the most academic and most industry-oriented algorithms. The poll asked: Which methods/algorithms did you use in the past 12 months for an actual Data Science-related application? Here are the results, based on 844 voters. The top 10 algorithms (and methods) and their share of voters are: [Figure 1: Top 10 algorithms & methods used by Data Scientists.] See full table of all algorithms and methods at the end of the post.

artificial intelligence, data scientist, machine learning, (13 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.52)

ggRandomForests: Exploring Random Forest Survival

Ehrlinger, John

arXiv.org Machine Learning | Dec-28-2016

Random forest (Leo Breiman 2001a) (RF) is a non-parametric statistical method requiring no distributional assumptions on covariate relation to the response. RF is a robust, nonlinear technique that optimizes predictive accuracy by fitting an ensemble of trees to stabilize model estimates. Random survival forests (RSF) (Ishwaran and Kogalur 2007; Ishwaran et al. 2008) are an extension of Breiman's RF techniques allowing efficient nonparametric analysis of time to event data. The randomForestSRC package (Ishwaran and Kogalur 2014) is a unified treatment of Breiman's random forest for survival, regression and classification problems. Predictive accuracy makes RF an attractive alternative to parametric models, though complexity and interpretability of the forest hinder wider application of the method. We introduce the ggRandomForests package, tools for visually understanding random forest models grown in R (R Core Team 2014) with the randomForestSRC package. The ggRandomForests package is structured to extract intermediate data objects from randomForestSRC objects and generate figures using the ggplot2 (Wickham 2009) graphics package. This document is structured as a tutorial for building random forests for survival with the randomForestSRC package and using the ggRandomForests package for investigating how the forest is constructed. We analyse the Primary Biliary Cirrhosis of the liver data from a clinical trial at the Mayo Clinic (Fleming and Harrington 1991). Our aim is to demonstrate the strength of using Random Forest methods for both prediction and information retrieval, specifically in time to event data settings.

artificial intelligence, decision tree learning, machine learning, (18 more...)

1612.08974

Country: North America > United States > California (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Nephrology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

Rahnama, Amir Hossein Akhavan

arXiv.org Machine Learning | Dec-27-2016

The big data trend has pushed data-centric systems toward continuous, fast data streams. In recent years, real-time analytics on stream data has formed into a new research field, which aims to answer queries about what-is-happening-now with a negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are utilized. To perform real-time analytics, pre-processing of data should be performed in a way that only a short summary of the stream is stored in main memory. In addition, due to the high speed of arrival, the average processing time for each instance of data should be low enough that incoming instances are not lost without being captured. Lastly, the learner needs to provide high analytical accuracy. Sentinel is a distributed system written in Java that aims to solve this challenge by distributing both the processing and the learning. Sentinel is built on top of Apache Storm, a distributed computing platform. Sentinel's learner, Vertical Hoeffding Tree, is a parallel decision tree learning algorithm based on the VFDT that enables parallel classification in distributed environments. Sentinel also uses SpaceSaving to keep a summary of the data stream and stores its summary in a synopsis data structure. The application of Sentinel to the Twitter Public Stream API is shown and the results are discussed.
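The kind of bounded-memory stream summary mentioned above can be illustrated with a short sketch of the SpaceSaving counting scheme. This is a generic, single-process illustration in Python and not Sentinel's Java code; the function name, the capacity, and the token stream are assumptions.

```python
def space_saving(stream, capacity):
    """Keep at most `capacity` counters as a summary of an unbounded stream."""
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
        else:
            # Evict the item with the smallest counter; the newcomer inherits
            # that count plus one, so stored counts never underestimate.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts


# Made-up stream of tokens; only three counters are kept in memory.
stream = ["a", "a", "b", "a", "c", "a", "d", "a"]
counts = space_saving(stream, capacity=3)
print(counts)
```

Memory use depends only on the chosen capacity, not on the length of the stream, at the price of possibly overestimating the counts of infrequent items.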

data mining, machine learning, real time system, (20 more...)

doi: 10.1109/CoDIT.2014.6996998

1612.08543

Country: Europe > Finland (0.14)

Genre: Research Report (0.51)

Industry:

Information Technology (0.89)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(3 more...)

Machine Learning and the Law

#artificialintelligence | Dec-25-2016, 03:25:15 GMT

Last week I went to the workshops at NIPS (biggest ML conference in the world) and I also attended part of the ML and the Law symposium the day before. I found out a little bit too late about the symposia but I was still able to attend two panels on which there were both lawyers and computer scientists. They were very insightful and informative -- did you know that this Spring, the European Union passed a regulation giving its citizens a "right to an explanation" for decisions made by machine-learning systems? The panel discussions were motivated by the problem of explaining ML-powered decisions which have an important impact on people's lives: We need to be able to test how systems get to their conclusions; if we can't test, we can't contest. Individuals are entitled to know which data about them is being processed, and to explanations of how predictions & decisions work, in terms they can understand.

artificial intelligence, decision tree learning, machine learning, (11 more...)

Industry: Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)
