AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
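
The idea in the quotation can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the tuple encoding of the tree and the attribute names `raining` and `weekend` are assumptions made up for the example.

```python
# A decision tree as nested tuples: (attribute, subtree_if_false, subtree_if_true).
# Leaves are plain booleans. Attribute names are hypothetical.
tree = ("raining",
        ("weekend", False, True),   # not raining: say yes only on weekends
        False)                      # raining: always say no

def decide(node, example):
    """Walk from the root to a leaf, testing one property per step."""
    while not isinstance(node, bool):
        attribute, if_false, if_true = node
        node = if_true if example[attribute] else if_false
    return node

print(decide(tree, {"raining": False, "weekend": True}))   # True
```

Because every leaf is a yes/no value, the tree as a whole computes a Boolean function of the input properties; here it is equivalent to `(not raining) and weekend`.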

Machine Learning in Python has never been easier

@machinelearnbot | Mar-26-2016, 04:55:13 GMT

At BigML we believe that over the next few years automated, data-driven decisions and data-driven applications are going to change the world. In fact, we think it will be the biggest shift in business efficiency since the dawn of the office calculator, when individuals had "Computer" listed as the title on their business cards. We want to help people rapidly and easily create predictive models using their datasets, no matter what size they are. Our easy-to-use, public API is a great step in that direction, but bindings for a few popular languages are obviously a big bonus. Thus, we are very happy to announce an open source Python binding to BigML.io, the BigML REST API. You can find it and fork it on GitHub.

artificial intelligence, decision tree learning, machine learning, (12 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.32)

Lost in a random forest: Using Big Data to study rare events | News & Analysis

#artificialintelligence | Mar-24-2016, 21:45:43 GMT

Sudden, broad-scale shifts in public opinion about social problems are relatively rare. Until recently, social scientists were forced to conduct post-hoc case studies of such unusual events that ignore the broader universe of possible shifts in public opinion that do not materialize. The vast amount of data that has recently become available via social media sites such as Facebook and Twitter, as well as the mass digitization of qualitative archives, provides an unprecedented opportunity for scholars to avoid such selection on the dependent variable. Yet the sheer scale of these new data creates a new set of methodological challenges. Conventional linear models, for example, minimize the influence of rare events as "outliers", especially within analyses of large samples.

artificial intelligence, data mining, machine learning, (15 more...)

#artificialintelligence

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
(2 more...)

How to Bin or Convert Numerical Variables to Categorical Variables with Decision Trees

@machinelearnbot | Mar-23-2016, 19:55:17 GMT

Why would you want to convert a numerical variable into a categorical one? Depending on the situation, it can lead to a better interpretation of the numerical variable, quick segmentation, or just an additional feature for building your predictive model by creating bins for the numerical variable. Binning is a popular feature engineering technique. Suppose your hypothesis is that the age of a customer is correlated with their tendency to interact with a mobile app. The age of the user is plotted on the x-axis and user interaction with the app is plotted on the y-axis.
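
A rough sketch of the binning idea, under stated assumptions: it is not the article's code, the `ages` and `interacts` values are invented, and it finds only a single cut point (the split a depth-one decision tree would choose by Gini impurity). A full tree learner would recurse on each side to produce more bins.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Cut point on one numeric feature that minimizes weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut point fits between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_cut, best_score = (lo + hi) / 2, score
    return best_cut

def to_bin(value, cut):
    """Map a numeric value to a categorical bin label."""
    return f"<= {cut}" if value <= cut else f"> {cut}"

# Hypothetical data: customer age and whether they interacted with the app.
ages = [18, 22, 25, 31, 38, 45, 52, 60]
interacts = [1, 1, 1, 1, 0, 0, 0, 0]

cut = best_threshold(ages, interacts)
print(cut)                              # 34.5
print([to_bin(a, cut) for a in ages])
```

The learned cut point becomes the bin boundary, so the categories are chosen where they best separate the target rather than at arbitrary equal-width intervals.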

artificial intelligence, decision tree learning, machine learning, (15 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.48)

How can decision tree model in Spark (pyspark) be visualized?

#artificialintelligence | Mar-21-2016, 21:56:27 GMT

I am trying to visualize the decision tree structure in pyspark, but all the tools I have found are for visualizing data. I could not find any for visualizing the tree structure. Or is there a way I can visualize it using the rules from toDebugString?
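
One possible approach to the question, sketched without Spark itself: parse the rule text and emit Graphviz DOT. The `SAMPLE` string is hand-written and assumes the indented If/Else/Predict layout that MLlib tree models print. The exact format may differ between Spark versions, and `parse_rules` and `to_dot` are hypothetical helper names, not part of pyspark.

```python
import re

# Hand-written sample in the assumed toDebugString layout.
SAMPLE = """DecisionTreeModel classifier of depth 2 with 5 nodes
  If (feature 0 <= 0.5)
   If (feature 1 <= 1.0)
    Predict: 0.0
   Else (feature 1 > 1.0)
    Predict: 1.0
  Else (feature 0 > 0.5)
   Predict: 1.0
"""

def parse_rules(text):
    """Turn If/Else/Predict lines into a nested dict (header line is skipped)."""
    lines = [ln.strip() for ln in text.splitlines()[1:] if ln.strip()]

    def parse(pos):
        line = lines[pos]
        if line.startswith("Predict:"):
            return {"predict": float(line.split(":")[1])}, pos + 1
        condition = re.match(r"If \((.*)\)$", line).group(1)
        yes, pos = parse(pos + 1)
        # lines[pos] is the matching "Else (...)" line; step over it.
        no, pos = parse(pos + 1)
        return {"test": condition, "yes": yes, "no": no}, pos

    tree, _ = parse(0)
    return tree

def to_dot(tree):
    """Emit Graphviz DOT text for the nested dict."""
    out = ["digraph Tree {"]
    counter = [0]

    def walk(node):
        node_id = counter[0]
        counter[0] += 1
        if "predict" in node:
            value = node["predict"]
            out.append(f'  n{node_id} [label="predict {value}", shape=box];')
        else:
            test = node["test"]
            out.append(f'  n{node_id} [label="{test}"];')
            for branch in ("yes", "no"):
                child_id = walk(node[branch])
                out.append(f'  n{node_id} -> n{child_id} [label="{branch}"];')
        return node_id

    walk(tree)
    out.append("}")
    return "\n".join(out)

print(to_dot(parse_rules(SAMPLE)))
```

The printed DOT text can be saved to a file and rendered with the Graphviz `dot` tool, for example `dot -Tpng tree.dot -o tree.png`.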

decision tree learning, decision tree model, machine learning, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Annotated Decision Trees for Simple Moral Machines

Bendel, Oliver (Northwestern Switzerland School of Business)

AAAI Conferences | Mar-16-2016

Autonomization often follows after the automization on which it is based. More and more machines have to make decisions with moral implications. Machine ethics, which can be seen as an equivalent of human ethics, analyses the chances and limits of moral machines. So far, decision trees have not been commonly used for modelling moral machines. This article proposes an approach for creating annotated decision trees, and specifies their central components. The focus is on simple moral machines. The chances of such models are illustrated with the example of a self-driving car that is friendly to humans and animals. Finally the advantages and disadvantages are discussed and conclusions are drawn.

assumption, decision tree, moral machine, (13 more...)

AAAI Conferences

2016 AAAI Spring Symposium Series

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(4 more...)

Industry:

Transportation > Passenger (1.00)
Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.89)
Information Technology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.86)

Evaluation of Protein Structural Models Using Random Forests

Cao, Renzhi, Jo, Taeho, Cheng, Jianlin

arXiv.org Machine Learning | Feb-12-2016

Protein structure prediction has been a "grand challenge" problem in structural biology over the last few decades. Protein quality assessment plays a very important role in protein structure prediction. In this paper, we propose a new protein quality assessment method that can predict both the local and global quality of protein 3D structural models. Our method uses both multi-model and single-model quality assessment methods for global quality assessment, and uses chemical, physical, and geometrical features together with the global quality score for local quality assessment. CASP9 targets are used to generate the features for local quality assessment. We evaluate the performance of our local quality assessment method on CASP10, where it is comparable with two state-of-the-art QA methods based on the average absolute distance between the real and predicted distance. In addition, we blindly tested our method on CASP11, and the good performance shows that combining single-model and multi-model quality assessment methods could be a good way to improve the accuracy of model quality assessment, and that the random forest technique could be used to train a good local quality assessment model.

artificial intelligence, decision tree learning, machine learning, (15 more...)

arXiv.org Machine Learning

1602.04277

Country:

North America > United States > Missouri (0.29)
North America > United States > Michigan (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

MPBART - Multinomial Probit Bayesian Additive Regression Trees

Kindo, Bereket P., Wang, Hao, Peña, Edsel A.

arXiv.org Machine Learning | Feb-6-2016

The multinomial probit (MNP) model for discrete choice modeling is often used in economics, market research, political science, and transportation. It models the choices made by agents given their demographic characteristics and/or the features of the K 2 available choice alternatives. Examples include the study of consumers' purchasing behavior (e.g., McCulloch et al. (2000); Imai and van Dyk (2005)); voting behavior in multi-party elections (e.g., Quinn et al. (1999)); and the choice among different modes of transportation (e.g., Bolduc (1999)). Details of the MNP model in which choices depend on predictors in a linear fashion are studied in McFadden et al. (1973); McFadden (1989); Keane (1992); McCulloch and Rossi (1994); Nobile (1998); McCulloch et al. (2000); Imai and van Dyk (2005); Train (2009); Burgette and Nordheim (2012), among others. Widely used multinomial choice modeling procedures include the multinomial logit model (e.g., McFadden et al. (1973); Train (2009)) and the multinomial probit model (e.g., McFadden (1989); McCulloch and Rossi (1994); Imai and van Dyk (2005)). The former relies on the assumption that a choice outcome is independent of the removal (or introduction) of an irrelevant choice alternative, while the latter, including MPBART, does not make this restrictive assumption.
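
The restrictive assumption of the multinomial logit model mentioned above can be checked numerically. The sketch below is illustrative only and is not taken from the paper: the alternatives and their utility values are made up, and `logit_probs` is a hypothetical helper implementing the standard logit choice probabilities.

```python
import math

def logit_probs(utilities):
    """Multinomial logit choice probabilities: exp(u_k) / sum_j exp(u_j)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    weights = {k: math.exp(u - m) for k, u in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Made-up utilities for three transportation modes.
utilities = {"car": 1.0, "bus": 0.5, "train": 0.2}

full = logit_probs(utilities)
reduced = logit_probs({k: u for k, u in utilities.items() if k != "train"})

# The car-to-bus odds are the same with or without the train alternative.
ratio_full = full["car"] / full["bus"]
ratio_reduced = reduced["car"] / reduced["bus"]
print(ratio_full, ratio_reduced)
```

Under the logit model the odds between two alternatives depend only on their own utilities, here exp(1.0 - 0.5), so removing a third alternative leaves them unchanged. A probit model with correlated errors does not impose this constraint.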

artificial intelligence, decision tree learning, machine learning, (15 more...)

arXiv.org Machine Learning

1309.7821

Country:

North America > United States > Michigan (0.28)
North America > United States > South Carolina > Richland County > Columbia (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.94)
Government > Voting & Elections (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.84)

Finding structure in data using multivariate tree boosting

Miller, Patrick J., Lubke, Gitta H., McArtor, Daniel B., Bergeman, C. S.

arXiv.org Machine Learning | Jan-21-2016

Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles like random forests (Strobl, Malley, and Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called Gradient Boosted Regression Trees (Friedman, 2001). Our method, multivariate tree boosting, can be used for identifying important predictors, detecting predictors with non-linear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary without parametric assumptions. We provide the R package 'mvtboost' to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package 'gbm' (Ridgeway, 2013) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff and Keyes, 1995). Simulations verify that our approach identifies predictors with non-linear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions.

artificial intelligence, machine learning, predictor, (18 more...)

arXiv.org Machine Learning

doi: 10.1037/met0000087

1511.02025

Country: North America > United States > New York (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

A Framework to Adjust Dependency Measure Estimates for Chance

Romano, Simone, Vinh, Nguyen Xuan, Bailey, James, Verspoor, Karin

arXiv.org Machine Learning | Jan-20-2016

Estimating the strength of dependency between two variables is fundamental for exploratory analysis and many other applications in data mining. For example, non-linear dependencies between two continuous variables can be explored with the Maximal Information Coefficient (MIC), and categorical variables that are dependent on the target class are selected using Gini gain in random forests. Nonetheless, because dependency measures are estimated on finite samples, the interpretability of their quantification and the accuracy when ranking dependencies become challenging. Dependency estimates are not equal to 0 when variables are independent, cannot be compared if computed on different sample sizes, and are inflated by chance on variables with more categories. In this paper, we propose a framework to adjust dependency measure estimates on finite samples. Our adjustments, which are simple and applicable to any dependency measure, are helpful in improving interpretability when quantifying dependency and in improving accuracy on the task of ranking dependencies. In particular, we demonstrate that our approach enhances the interpretability of MIC when used as a proxy for the amount of noise between variables, and improves accuracy when ranking variables during the splitting procedure in random forests.
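
The bias described in the abstract can be reproduced with a small simulation. This is only a sketch of the problem, not the authors' adjustment framework: the sample size, trial count, and seed are arbitrary choices, and the feature and target are drawn independently, so any measured Gini gain is due to chance alone.

```python
import random
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(feature, target):
    """Impurity of target minus size-weighted impurity within each feature value."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    n = len(target)
    within = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(target) - within

def mean_gain_under_independence(n_categories, n_samples=100, trials=200, seed=0):
    """Average Gini gain when feature and binary target are independent."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        feature = [rng.randrange(n_categories) for _ in range(n_samples)]
        target = [rng.randrange(2) for _ in range(n_samples)]
        total += gini_gain(feature, target)
    return total / trials

few = mean_gain_under_independence(2)
many = mean_gain_under_independence(20)
print(few, many)
```

Both averages come out above zero even though the variables are independent, and the 20-category feature scores higher than the 2-category one. An unadjusted split criterion would therefore tend to prefer the variable with more categories.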

artificial intelligence, dependency, machine learning, (18 more...)

arXiv.org Machine Learning

1510.07786

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Directional Decision Lists

Goessling, Marc, Kang, Shan

arXiv.org Machine Learning | Jan-10-2016

In this paper we introduce a novel family of decision lists consisting of highly interpretable models which can be learned efficiently in a greedy manner. The defining property is that all rules are oriented in the same direction. Particular examples of this family are decision lists with monotonically decreasing (or increasing) probabilities. On simulated data we empirically confirm that the proposed model family is easier to train than general decision lists. We exemplify the practical usability of our approach by identifying problem symptoms in a manufacturing process.

artificial intelligence, decision list, machine learning, (18 more...)

arXiv.org Machine Learning

1508.07643

Country: North America > United States (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
(2 more...)
