Goto

Collaborating Authors

 Decision Tree Learning


Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation

arXiv.org Machine Learning

We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time. The predictive power of such models can heavily depend on the structure of the tree, and although past work showed how to learn the tree structure, it expected that the feature vectors remained static. We provide a novel algorithm to simultaneously perform representation learning for the input data and learning of the hierarchi- cal predictor. Our approach optimizes an objec- tive function which favors balanced and easily- separable multi-way node partitions. We theoret- ically analyze this objective, showing that it gives rise to a boosting style property and a bound on classification error. We next show how to extend the algorithm to conditional density estimation. We empirically validate both variants of the al- gorithm on text classification and language mod- eling, respectively, and show that they compare favorably to common baselines in terms of accu- racy and running time.


Decision Trees for Classification

#artificialintelligence

The last post in the Machine Learning algorithm frenzy that I'm currently on was based on Support Vector Machines. This post is going to look at another classification algorithm called a Decision Tree and again see if this can improve on our classification problem of predicting whether lines from Simpson's episodes are said by Bart or Homer. Again, a Decision Tree is a supervised learning algorithm that can be used to classify data based on a model it has built on training data. Like the other models, it tries to split data into two or more sets but the most significant variable that creates the best split is calculated by the algorithm. To distinguish between Homer and Bart from a set of images the decision tree would split the data on each of them and choose which performs best.


Machine Learning: An Introduction to Decision Trees

#artificialintelligence

A decision tree is one of the widely used algorithms for building classification or regression models in data mining and machine learning. A decision tree is so named because the output resulting from it is the form of a tree structure. Consider a sample stock dataset as shown in the table below. The dataset comprises of Open, High, Low, Close Prices and Volume indicators (OHLCV) for the stock. Let us add some technical indicators (RSI, SMA, LMA, ADX) to this dataset.


Strategic Sequences of Arguments for Persuasion Using Decision Trees

AAAI Conferences

Persuasion is an activity that involves one party (the persuader) trying to induce another party (the persuadee) to believe or do something. For this, it can be advantageous forthe persuader to have a model of the persuadee. Recently, some proposals in the field of computational models of argument have been made for probabilistic models of what the persuadee knows about, or believes. However, these developments have not systematically harnessed established notions in decision theory for maximizing the outcome of a dialogue. To address this, we present a general framework for representing persuasion dialogues as a decision tree, and for using decision rules for selecting moves. Furthermore, we provide some empirical results showing how some well-known decision rules perform, and make observations about their general behaviour in the context of dialogues where there is uncertainty about the accuracy of the user model.


Psychological Forest: Predicting Human Behavior

AAAI Conferences

We introduce a synergetic approach incorporating psychological theories and data science in service of predicting human behavior. Our method harnesses psychological theories to extract rigorous features to a data science algorithm. We demonstrate that this approach can be extremely powerful in a fundamental human choice setting. In particular, a random forest algorithm that makes use of psychological features that we derive, dubbed psychological forest, leads to prediction that significantly outperforms best practices in a choice prediction competition. Our results also suggest that this integrative approach is vital for data science tools to perform reasonably well on the data. Finally, we discuss how social scientists can learn from using this approach and conclude that integrating social and data science practices is a highly fruitful path for future research of human behavior.


metboost: Exploratory regression analysis with hierarchically clustered data

arXiv.org Machine Learning

As data collections become larger, exploratory regression analysis becomes more important but more challenging. When observations are hierarchically clustered the problem is even more challenging because model selection with mixed effect models can produce misleading results when nonlinear effects are not included into the model (Bauer and Cai, 2009). A machine learning method called boosted decision trees (Friedman, 2001) is a good approach for exploratory regression analysis in real data sets because it can detect predictors with nonlinear and interaction effects while also accounting for missing data. We propose an extension to boosted decision decision trees called metboost for hierarchically clustered data. It works by constraining the structure of each tree to be the same across groups, but allowing the terminal node means to differ. This allows predictors and split points to lead to different predictions within each group, and approximates nonlinear group specific effects. Importantly, metboost remains computationally feasible for thousands of observations and hundreds of predictors that may contain missing values. We apply the method to predict math performance for 15,240 students from 751 schools in data collected in the Educational Longitudinal Study 2002 (Ingels et al., 2007), allowing 76 predictors to have unique effects for each school. When comparing results to boosted decision trees, metboost has 15% improved prediction performance. Results of a large simulation study show that metboost has up to 70% improved variable selection performance and up to 30% improved prediction performance compared to boosted decision trees when group sizes are small


Smart city transport systems - A*STAR Research

#artificialintelligence

A*STAR researchers have created a program that predicts public transport usage based on land-use and the location of amenities, an essential capability for smart city planning. From schools and shops to hospitals and hotels, a modern city is made of many different parts. Urban planners must take account of where these services are located when designing efficient transit networks. A*STAR researchers have developed a machine-learning program to accurately recreate and predict public transport use, or'ridership', based on the distribution of land-use and amenities in Singapore1. Traditional cities comprise an inner central business district (CBD), where most people work, surrounded by outer residential and industrial zones.


r2VIM: A new variable selection method for random forests in genome-wide association studies

#artificialintelligence

In the last few years, more than one thousand single-nucleotide polymorphisms (SNPs) have been reproducibly associated with more than two hundred phenotypes and quantitative traits in genome-wide association studies (GWAS) [1]. These loci are usually identified by linear or logistic regression analysis which is performed separately for each SNP. The resulting p-values are then used to rank the SNPs and to select those with a p-value smaller than a pre-specified significance level which is adjusted for the large number of statistical tests. In such a scenario, comparable to analyses of other genomic data sets such as gene expression, p-values are not used in a confirmatory setting but rather as a screening tool to identify associated, i.e. important, SNPs while controlling the number of false positive findings. Nonparametric, model-free statistical learning machines provide a promising alternative to classical, model-based statistical methods for the selection of important variables in high dimensional data sets.


Top R Packages for Machine Learning

#artificialintelligence

Much of our curriculum is based on feedback from corporate and government partners about the technologies they are looking to learn. But we wanted to develop a more data-driven approach to what we should be teaching in our data science corporate training and our free fellowship for masters and PhDs looking to enter data science careers in industry. What are the most popular ML packages? Let's look at a ranking based on package downloads and social website activity. The ranking is based on average rank of CRAN (The Comprehensive R Archive Network) downloads and Stack Overflow activity (full ranking here [CSV]).


What developers actually need to know about Machine Learning

#artificialintelligence

Something is wrong in the way ML is being taught to developers. Most ML teachers like to explain how different learning algorithms work and spend tons of time on that. For a beginner who wants to start using ML, being able to choose an algorithm and set parameters looks like the #1 barrier to entry, and knowing how the different techniques work seems to be a key requirement to remove that barrier. Many practitioners argue however that you only need one technique to get started: random forests. Other techniques may sometimes outperform them, but in general, random forests are the most likely to perform best on a variety of problems (see Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?), which makes them more than enough for a developer just getting started with ML.