AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
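To make the quoted idea concrete, here is a minimal sketch of a decision tree used as a Boolean function. The attribute names (`raining`, `has_umbrella`) and the tree shape are hypothetical and chosen only for illustration; they do not come from the book.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Node:
    """Internal node: tests one Boolean property of the input object."""
    attribute: str
    if_true: Union["Node", bool]   # subtree (or yes/no leaf) when the test passes
    if_false: Union["Node", bool]  # subtree (or yes/no leaf) when the test fails


def decide(tree: Union[Node, bool], example: Dict[str, bool]) -> bool:
    """Follow one branching test per level until a yes/no leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.if_true if example[tree.attribute] else tree.if_false
    return tree


# Hypothetical tree encoding the Boolean function: raining AND NOT has_umbrella
stay_home = Node("raining", Node("has_umbrella", False, True), False)

answer = decide(stay_home, {"raining": True, "has_umbrella": False})
```

Each root-to-leaf path corresponds to one conjunction of tests, which is why a tree of this kind can represent any Boolean function of its input properties.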

Random Forests in Python

#artificialintelligence | Dec-4-2016, 21:35:23 GMT

This post originally appeared on the Yhat blog. Yhat is a Brooklyn-based company whose goal is to make data science applicable for developers, data scientists, and businesses alike. Yhat provides a software platform for deploying and managing predictive algorithms as REST APIs, while eliminating the painful engineering obstacles associated with production environments like testing, versioning, scaling and security. Random forest can be used to model the impact of marketing on customer acquisition, retention, and churn, or to predict disease risk and susceptibility in patients. Random forest is capable of both regression and classification. It can handle a large number of features, and it's helpful for estimating which of your variables are important in the underlying data being modeled.

artificial intelligence, machine learning, random forest, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.79)

How to Implement Bagging From Scratch With Python - Machine Learning Mastery

#artificialintelligence | Dec-3-2016, 01:35:14 GMT

Decision trees are a simple and powerful predictive modeling technique, but they suffer from high variance. This means that trees can produce very different results given different training data. A technique that makes decision trees more robust and improves their performance is bootstrap aggregation, or bagging for short. In this tutorial, you will discover how to implement the bagging procedure with decision trees from scratch with Python. You will also learn how to apply bagging to your own predictive modeling problems.
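As a rough illustration of the bagging idea (not the tutorial's own code), the sketch below trains each model on a bootstrap sample and combines predictions by majority vote. To stay short it uses one-split decision stumps rather than full decision trees; the function names and the toy dataset are assumptions made for this example.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-split tree (a stump) minimising training errors.

    Each row is (feature_list, label). Returns a predict function.
    """
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]  # candidate threshold taken from the data
            left = [y for xs, y in rows if xs[f] <= t]
            right = [y for xs, y in rows if xs[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda x: left_label if x[f] <= t else right_label


def bagging_fit(rows, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        models.append(fit_stump(sample))
    return models


def bagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Toy one-feature dataset (illustrative only)
rows = [([1.0], 0), ([2.0], 0), ([3.0], 0), ([7.0], 1), ([8.0], 1), ([9.0], 1)]
models = bagging_fit(rows)
```

Calling `bagging_predict(models, [0.5])` or `bagging_predict(models, [10.0])` then returns the majority vote of the 25 stumps. Averaging over many resampled models is what reduces the variance of any single tree.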

artificial intelligence, decision tree learning, machine learning, (16 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.32)

A New Method for Classification of Datasets for Data Mining

Vijendra, Singh, Parashar, Hemjyotsana, Vasudeva, Nisha

arXiv.org Machine Learning | Dec-1-2016

Humans have been manually extracting patterns from data for centuries, but the increasing volume of data in modern times has called for more automated approaches. Information leads to power and success, and thanks to sophisticated technologies such as computers, satellites, etc., we have been collecting tremendous amounts of information. Initially, with the advent of computers and means for mass digital storage, we started collecting and storing all sorts of data, counting on the power of computers to help sort through this amalgam of information. Unfortunately, these massive collections of data stored on disparate structures very rapidly became overwhelming. A variety of information is collected in digital form in databases and in flat files.

artificial intelligence, decision tree learning, machine learning, (13 more...)

arXiv:1612.00151

Country: Asia > China (0.30)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)

Top Data Mining Algorithms Identified by IEEE & Related Python Resources

@machinelearnbot | Nov-30-2016, 10:15:02 GMT

C4.5 is an algorithm developed by Ross Quinlan that is used to generate a decision tree. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often referred to as a statistical classifier. C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set S = {s_1, s_2, ...} of already classified samples. Each sample s_i consists of a p-dimensional vector (x_{1,i}, x_{2,i}, ..., x_{p,i}), where the x_j represent attributes or features of the sample, as well as the class in which s_i falls.
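The entropy-based splitting idea shared by ID3 and C4.5 can be sketched as follows. This is only the entropy and information-gain computation for categorical attributes, not a full C4.5 implementation; the function names and the four-sample dataset are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(samples, labels, attribute_index):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[x[attribute_index]].append(y)
    n = len(labels)
    # Weighted average entropy of the subsets produced by the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical training set S: each sample is (outlook, temperature)
samples = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(samples, labels, 0)      # outlook separates the classes
gain_temperature = information_gain(samples, labels, 1)  # temperature tells us nothing
```

A tree builder would split on the attribute with the highest score and recurse on each subset. C4.5 itself normalises the gain by the split information (the gain ratio), which is not shown here.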

artificial intelligence, machine learning, top data mining algorithm identified, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Logarithmic Time One-Against-Some

Daume, Hal III, Karampatziakis, Nikos, Langford, John, Mineiro, Paul

arXiv.org Machine Learning | Nov-30-2016

We create a new online reduction of multiclass classification to binary classification for which training and prediction time scale logarithmically with the number of classes. Compared to previous approaches, we obtain substantially better statistical performance for two reasons: First, we prove a tighter and more complete boosting theorem, and second we translate the results more directly into an algorithm. We show that several simple techniques give rise to an algorithm that can compete with one-against-all in both space and predictive power while offering exponential improvements in speed when the number of classes is large.
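To see why a tree-structured reduction can make prediction time logarithmic in the number of classes, consider the generic sketch below. It is not the paper's algorithm: it simply routes an input down a balanced binary tree over the labels, with a fixed threshold comparison standing in for the learned binary classifier at each node. The function name and the toy routing rule are assumptions.

```python
def predict_with_label_tree(x, num_classes):
    """Route x down a balanced binary tree over labels 0..num_classes-1.

    Returns (predicted_label, number_of_binary_decisions).
    """
    lo, hi = 0, num_classes  # the current node covers labels in [lo, hi)
    decisions = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        go_right = x >= mid  # stand-in for a learned binary classifier
        decisions += 1
        if go_right:
            lo = mid
        else:
            hi = mid
    return lo, decisions


label, decisions = predict_with_label_tree(700.3, 1024)
```

With 1024 classes the example makes 10 binary decisions, whereas one-against-all would evaluate 1024 separate scorers for the same input.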

artificial intelligence, machine learning, predictor, (19 more...)

arXiv:1606.04988

Country:

North America > United States (0.28)
Europe > Spain (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

Auditing Black-box Models for Indirect Influence

Adler, Philip, Falk, Casey, Friedler, Sorelle A., Rybeck, Gabriel, Scheidegger, Carlos, Smith, Brandon, Venkatasubramanian, Suresh

arXiv.org Machine Learning | Nov-30-2016

Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior, and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models, or asserting that certain problematic attributes (like race or gender) are not unduly influencing decisions. In this paper, we present a technique for auditing black-box models, which lets us study the extent to which existing models take advantage of particular features in the dataset, without knowing how the models work. Our work focuses on the problem of indirect influence: how some features might indirectly influence outcomes via other, related features. As a result, we can find attribute influences even in cases where, upon further direct examination of the model, the attribute is not referred to by the model at all. Our approach does not require the black-box model to be retrained. This is important if (for example) the model is only accessible via an API, and contrasts our work with other methods that investigate feature influence like feature selection. We present experimental evidence for the effectiveness of our procedure using a variety of publicly available datasets and models. We also validate our procedure using techniques from interpretable learning and feature selection, as well as against other black-box auditing procedures.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv:1602.07043

Country: North America > United States > Arizona (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Machine learning as a service? Might lose sleep over this!

#artificialintelligence | Nov-29-2016, 12:55:15 GMT

This post is 'not' intended to teach people how to use popular predictive modelling APIs for free, although, to your surprise, this isn't a far-fetched possibility. A trained machine learning model is basically a function that maps feature vectors to the output variable. When queried with a test instance, the model predicts an outcome, assigning probability scores to all the possible classes. Google, Amazon, and others provide public-facing APIs for training predictive models on the subscriber's data; the trained model can then be used for prediction.
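The "model as a function" view can be sketched in a few lines. The weights, class names, and two-feature input below are hypothetical; the example uses a softmax over linear scores purely as a stand-in for whatever model a hosted prediction service actually trains.

```python
import math

# Hypothetical learned parameters: one weight vector per class
WEIGHTS = {"spam": [1.5, -0.5], "ham": [-1.0, 0.8]}


def predict_proba(features):
    """Map a feature vector to a probability score for every class."""
    scores = {
        label: sum(w * x for w, x in zip(ws, features))
        for label, ws in WEIGHTS.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


probabilities = predict_proba([2.0, 0.0])
```

A prediction API that returns the full probability vector exposes exactly this mapping to anyone who can send queries, which is why the post treats it as a function that can be studied from the outside.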

artificial intelligence, data mining, machine learning, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)
Information Technology > Data Science > Data Mining > Feature Extraction (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.34)

Titanic: Machine Learning from Disaster

#artificialintelligence | Nov-29-2016, 09:35:20 GMT

If you're new to data science and machine learning, or looking for a simple intro to the Kaggle competitions platform, this is the best place to start. Continue reading below the competition description to discover a number of tutorials, benchmark models, and more. The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.

artificial intelligence, decision tree learning, machine learning, (11 more...)

Genre: Instructional Material (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.33)

When Does Deep Learning Work Better Than SVMs or Random Forests?

@machinelearnbot | Nov-27-2016, 08:05:04 GMT

Guest blog by Sebastian Raschka, originally posted here. If we tackle a supervised learning problem, my advice is to start with the simplest hypothesis space first. I.e., try a linear model such as logistic regression. If this doesn't work "well" (i.e., it doesn't meet our expectation or performance criterion that we defined earlier), I would move on to the next experiment. I would say that random forests are probably THE "worry-free" approach - if such a thing exists in ML: There are no real hyperparameters to tune (maybe except for the number of trees; typically, the more trees we have the better).

artificial intelligence, decision tree learning, machine learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.59)

Machine Learning with Talend - Getting Started

#artificialintelligence | Nov-26-2016, 07:15:42 GMT

Decision trees are used extensively in machine learning because they are easy to use, easy to interpret, and easy to operationalize. KDnuggets, one of the most respected sites for data science and machine learning, recently published an article that identified decision trees as a "top 10" algorithm for machine learning. If you are new to machine learning, some of these concepts may be unfamiliar. The goal of this blog is to provide you with the basics of decision trees using Talend and Apache Spark. If you want to learn more about advanced analytics, please see the references section below. (2)

artificial intelligence, kyphosis, machine learning, (12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
