AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
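As a rough illustration of the quoted idea, the sketch below encodes a small decision tree as nested tests and checks that it computes a Boolean function. The property names (`raining`, `have_umbrella`) and the tree itself are hypothetical, chosen only for this example.

```python
# A hand-built decision tree: each internal node tests one property,
# each leaf holds the yes/no decision. Property names are hypothetical.
TREE = {
    "test": "raining",
    "yes": {"test": "have_umbrella", "yes": True, "no": False},
    "no": True,
}


def decide(tree, example):
    """Walk from the root to a leaf, following the branch each test selects."""
    node = tree
    while isinstance(node, dict):
        branch = "yes" if example[node["test"]] else "no"
        node = node[branch]
    return node


# The tree above represents the Boolean function (not raining) or have_umbrella.
print(decide(TREE, {"raining": True, "have_umbrella": False}))
```

Because every root-to-leaf path is a conjunction of test outcomes, the tree as a whole is a disjunction of those paths, which is why such trees can represent Boolean functions.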

Python Tutorial for Beginners: Learn in 3 Days

#artificialintelligence | Sep-14-2017, 02:00:24 GMT

In the syntax below, we are asking Python to import the numpy and pandas packages. The 'as' keyword is used to give each package a shorter alias.
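The tutorial's own code is not reproduced in this excerpt. A minimal sketch of the import-with-alias pattern it describes might look like the following, where `np` and `pd` are the conventional aliases and the variable names `values` and `frame` are made up for illustration.

```python
# Import each package under a shorter alias using the 'as' keyword.
import numpy as np
import pandas as pd

# After aliasing, the packages are referenced through the short names.
values = np.array([1, 2, 3])
frame = pd.DataFrame({"x": values})

print(values.mean())
print(frame.shape)
```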

data mining, machine learning, python, (21 more...)

#artificialintelligence

Country:

Asia > India > Maharashtra > Mumbai (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.50)
Research Report > Experimental Study (0.47)

Industry: Education (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)
(2 more...)

Random Forests of Interaction Trees for Estimating Individualized Treatment Effects in Randomized Trials

Su, Xiaogang, Peña, Annette T., Liu, Lei, Levine, Richard A.

arXiv.org Machine Learning | Sep-14-2017

Assessing heterogeneous treatment effects has become a growing interest in advancing precision medicine. Individualized treatment effects (ITE) play a critical role in such an endeavor. Concerning experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITE on the basis of interaction trees (Su et al., 2009). To this end, we first propose a smooth sigmoid surrogate (SSS) method, as an alternative to greedy search, to speed up tree construction. RFIT outperforms the traditional 'separate regression' approach in estimating ITE. Furthermore, standard errors for the estimated ITE via RFIT can be obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT via both simulation and the analysis of data from an acupuncture headache trial.
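For orientation only, here is a sketch of the 'separate regression' baseline the abstract compares against. It assumes the term means fitting one outcome model per trial arm and differencing the predictions. This is not RFIT. The simulated data, the linear models, and all names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

# Simulated randomized trial (all numbers here are illustrative assumptions).
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)                 # one baseline covariate
treated = rng.integers(0, 2, size=n).astype(bool)  # randomized assignment
true_ite = 0.5 + 1.5 * x                           # treatment effect varies with x
y = 1.0 + 2.0 * x + treated * true_ite + rng.normal(0.0, 0.1, size=n)


def fit_line(x_arm, y_arm):
    """Ordinary least squares fit of y on (1, x) within one trial arm."""
    design = np.column_stack([np.ones_like(x_arm), x_arm])
    coef, *_ = np.linalg.lstsq(design, y_arm, rcond=None)
    return coef


# Separate regression: one model per arm, then difference the predictions.
coef_treated = fit_line(x[treated], y[treated])
coef_control = fit_line(x[~treated], y[~treated])

design_all = np.column_stack([np.ones_like(x), x])
ite_hat = design_all @ coef_treated - design_all @ coef_control

max_error = np.max(np.abs(ite_hat - true_ite))
print(max_error)
```

In this toy setting the per-arm models are correctly specified, so the baseline recovers the effect well. The abstract's claim concerns settings where a forest of interaction trees does better.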

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Machine Learning

1709.04862

Country: North America > United States > Texas (0.28)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Understanding Boosted Trees Models

#artificialintelligence | Sep-13-2017, 08:55:21 GMT

In the previous post, we learned about tree-based learning methods: the basics of tree-based models and the use of bagging to reduce variance. We also looked at one of the most famous learning algorithms based on the idea of bagging: random forests. In this post, we will look into the details of yet another type of tree-based learning algorithm: boosted trees. Boosting, like bagging, is a general class of learning algorithms in which a set of weak learners is combined to form a strong learner. For classification problems, a weak learner is defined to be a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well-correlated with the true classification. Recall that bagging involves creating multiple copies of the original training data set via bootstrapping, fitting a separate decision tree to each copy, and then combining all of the trees in order to create a single predictive model.
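The bagging recipe recalled above (bootstrap copies, one tree per copy, combine the trees) can be sketched in a few lines. To keep the example short, one-split trees (decision stumps) on a single feature stand in for full decision trees. The data, the function names, and the choice of 25 trees are assumptions made for illustration and are not taken from the post.

```python
import numpy as np


def fit_stump(x, y):
    """Choose the threshold and orientation that minimise training error."""
    best = None
    for threshold in np.unique(x):
        for sign in (1, -1):
            pred = np.where(sign * (x - threshold) >= 0, 1, 0)
            err = np.mean(pred != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best[1], best[2]


def predict_stump(stump, x):
    threshold, sign = stump
    return np.where(sign * (x - threshold) >= 0, 1, 0)


def bagged_stumps(x, y, n_trees, rng):
    """Fit one stump per bootstrap copy of the training data."""
    stumps = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(x), size=len(x))  # sample rows with replacement
        stumps.append(fit_stump(x[idx], y[idx]))
    return stumps


def predict_bagged(stumps, x):
    """Combine the stumps by majority vote."""
    votes = np.mean([predict_stump(s, x) for s in stumps], axis=0)
    return (votes >= 0.5).astype(int)


# Toy data: the clean label is 1 when x > 0.5; 10% of training labels are flipped.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=200)
y = (x > 0.5).astype(int)
y_noisy = np.where(rng.random(200) < 0.1, 1 - y, y)

stumps = bagged_stumps(x, y_noisy, n_trees=25, rng=rng)
accuracy = np.mean(predict_bagged(stumps, x) == y)
print(accuracy)
```

Boosting differs in that the learners are fitted sequentially, each one concentrating on what the earlier ones got wrong, rather than independently on bootstrap copies.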

algorithm, artificial intelligence, machine learning, (19 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

Intro to The Data Science Behind EEG-Based Neurobiofeedback

#artificialintelligence | Sep-10-2017, 17:55:11 GMT

The Neurobiofeedback machine gained popularity for its non-invasive and quantitative approach to behavior regulation, but its legitimacy is still questioned by pediatricians, therapists, and other professionals. In academic-sounding terms, this machine (which I'll be abbreviating as NBF from now on) is built on the concept of feedback therapy, which exploits our ability to exert and/or regain control over physiological aspects of our body. NBF is a type of Brain-Computer Interface (BCI) machine that senses your brain wave activity in different ways (usually involving hardware-software interaction) and rewards you with an auditory or visual stimulus when your brain wave's frequency matches the desired frequency. This comes from the scientific notion that brain rhythms correspond to certain cognitive states. By "mind games", the 'auditory or visual stimulus' I mentioned in the last paragraph usually comes in the form of a game.

artificial intelligence, decision tree learning, machine learning, (17 more...)

#artificialintelligence

Country:

Asia > China > Hong Kong (0.05)
Europe > Switzerland > Basel-City > Basel (0.05)
Europe > Sweden > Skåne County > Malmö (0.05)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)

Industry: Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (0.45)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.33)

Machine learning leveraging genomes from metagenomes identifies influential antibiotic resistance genes in the infant gut microbiome

#artificialintelligence | Sep-8-2017, 11:35:15 GMT

Antibiotic resistance in pathogens is extensively studied, yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leverage genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We find that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants compared to C. difficile lacking this gene.

antibiotic resistance gene, artificial intelligence, machine learning, (10 more...)

#artificialintelligence

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.42)

Crowdsourcing Predictors of Residential Electric Energy Usage

Wagy, Mark D., Bongard, Josh C., Bagrow, James P., Hines, Paul D. H.

arXiv.org Machine Learning | Sep-8-2017

Crowdsourcing has been successfully applied in many domains including astronomy, cryptography and biology. In order to test its potential for useful application in a Smart Grid context, this paper investigates the extent to which a crowd can contribute predictive hypotheses to a model of residential electric energy consumption. In this experiment, the crowd generated hypotheses about factors that make one home different from another in terms of monthly energy usage. To implement this concept, we deployed a web-based system within which 627 residential electricity customers posed 632 questions that they thought predictive of energy usage. While this occurred, the same group provided 110,573 answers to these questions as they accumulated. Thus users both suggested the hypotheses that drive a predictive model and provided the data upon which the model is built. We used the resulting question and answer data to build a predictive model of monthly electric energy consumption, using random forest regression. Because of the sparse nature of the answer data, careful statistical work was needed to ensure that these models are valid. The results indicate that the crowd can generate useful hypotheses, despite the sparse nature of the dataset.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

arXiv.org Machine Learning

1709.02739

Country:

Europe (0.67)
North America > United States > Vermont > Chittenden County > Burlington (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Energy > Power Industry > Utilities (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

Random Subspace with Trees for Feature Selection Under Memory Constraints

Sutera, Antonio, Châtel, Célia, Louppe, Gilles, Wehenkel, Louis, Geurts, Pierre

arXiv.org Machine Learning | Sep-6-2017

Célia Châtel (Aix-Marseille University, France), Pierre Geurts (University of Liège, Belgium)

Dealing with datasets of very high dimension is a major challenge in machine learning. In this paper, we consider the problem of feature selection in applications where the memory is not large enough to contain all features. In this setting, we propose a novel tree-based feature selection approach that builds a sequence of randomized trees on small subsamples of variables, mixing both variables already identified as relevant by previous models and variables randomly selected among the other variables. As our main contribution, we provide an in-depth theoretical analysis of this method in the infinite-sample setting. In particular, we study its soundness with respect to common definitions of feature relevance and its convergence speed under various variable dependence scenarios. We also provide some preliminary empirical results highlighting the potential of the approach.

artificial intelligence, machine learning, relevant variable, (20 more...)

arXiv.org Machine Learning

1709.01177

Country:

Europe > Belgium > Wallonia > Liège Province > Liège (0.25)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.24)

Genre: Research Report > Experimental Study (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)

GIS and Machine Learning for Habitat Protection | GIS Lounge

@machinelearnbot | Sep-3-2017, 03:20:07 GMT

With machine learning now commonly applied alongside GIS, one area of focus has been habitat protection. Habitat managers and conservation specialists have struggled to find ways to protect wildlife threatened by a variety of mostly human-induced factors. Machine learning and GIS have proven to be one way in which new ideas and scenarios can be tested before any plan is carried out, saving time and money and possibly avoiding crucial errors in the habitat plans that are implemented. A recent example of using GIS and machine learning for habitat protection involves the black-necked crane.[1] This type of bird is very particular about where it can breed, and relatively little is known about it.

artificial intelligence, decision tree learning, habitat protection gis lounge, (9 more...)

@machinelearnbot

Country:

North America > United States > Wisconsin (0.06)
North America > Canada > Ontario (0.05)
Asia > Pakistan (0.05)
(2 more...)

Industry: Food & Agriculture (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.30)

Understanding random forests with randomForestExplainer

#artificialintelligence | Sep-1-2017, 12:15:33 GMT

Next, we pass it to the function plot_min_depth_distribution and, under default settings, obtain a plot of the distribution of minimal depth for the top ten variables according to mean minimal depth calculated using top trees (mean_sample = "top_trees"). We could also pass our forest directly to the plotting function, but if we want to make more than one plot of the minimal depth distribution, it is more efficient to pass the min_depth_frame to the plotting function so that it is not recalculated for each plot (this works similarly for other plotting functions of randomForestExplainer). The function plot_min_depth_distribution offers three possibilities for calculating the mean minimal depth, which differ in the way they treat missing values that appear when a variable is not used for splitting in a tree. Note that the depth of a tree is equal to the length of the longest path from root to leaf in this tree. This equals the maximum depth of a variable in this tree plus one, as leaves are by definition not split by any variable.
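The relationship between tree depth and variable depth stated above can be checked on a toy tree. The sketch below is written in Python, is independent of the randomForestExplainer package, and does not use its API. The nested-dict tree, the variable names `x1` to `x3`, and the helper functions are all hypothetical. The root is taken to be at depth 0.

```python
# A toy tree as nested dicts. Internal nodes name their splitting variable;
# leaves are None. Variable names are hypothetical.
TREE = {
    "var": "x1",
    "left": {
        "var": "x2",
        "left": None,
        "right": {"var": "x3", "left": None, "right": None},
    },
    "right": {"var": "x2", "left": None, "right": None},
}


def tree_depth(node):
    """Length (in edges) of the longest path from this node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))


def variable_depths(node, depth=0, found=None):
    """Map each splitting variable to (minimal depth, maximal depth) of its nodes."""
    if found is None:
        found = {}
    if node is not None:
        lo, hi = found.get(node["var"], (depth, depth))
        found[node["var"]] = (min(lo, depth), max(hi, depth))
        variable_depths(node["left"], depth + 1, found)
        variable_depths(node["right"], depth + 1, found)
    return found


depths = variable_depths(TREE)
print(tree_depth(TREE))
print(depths)
```

A variable that never splits in a tree simply does not appear in `depths`. That is the missing-value case the three `mean_sample` options handle differently.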

artificial intelligence, machine learning, minimal depth, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)

A Complete Tutorial on Tree Based Modeling from Scratch (in R & Python)

#artificialintelligence | Aug-29-2017, 12:35:35 GMT

Tree-based learning algorithms are considered to be among the best and most widely used supervised learning methods. Tree-based methods give predictive models high accuracy, stability and ease of interpretation. Unlike linear models, they map non-linear relationships quite well. They can be adapted to either kind of problem at hand (classification or regression). Methods like decision trees, random forests and gradient boosting are widely used in all kinds of data science problems. Hence, for every analyst (freshers included), it's important to learn these algorithms and use them for modeling. This tutorial is meant to help beginners learn tree-based modeling from scratch. After successfully completing this tutorial, one is expected to become proficient at using tree-based algorithms and building predictive models. Note: This tutorial requires no prior knowledge of machine learning.

algorithm, artificial intelligence, machine learning, (18 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
