AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Random forests with random projections of the output space for high dimensional multi-label classification

Joly, Arnaud, Geurts, Pierre, Wehenkel, Louis

arXiv.org Machine LearningSep-29-2014

We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting computational complexity and accuracy of predictions. We also show that random output space projections may be used in order to reach different bias-variance tradeoffs, over a broad panel of benchmark problems, and that this may lead to improved accuracy while reducing significantly the computational burden of the learning stage.

artificial intelligence, machine learning, projection, (20 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-662-44848-9_39

1404.3581

Country: Europe (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.85)

Add feedback

Interpreting Tree Ensembles with inTrees

Deng, Houtao

arXiv.org Machine LearningAug-23-2014

Tree ensembles such as random forests and boosted trees are accurate but difficult to understand, debug and deploy. In this work, we provide the inTrees (interpretable trees) framework that extracts, measures, prunes and selects rules from a tree ensemble, and calculates frequent variable interactions. An rule-based learner, referred to as the simplified tree ensemble learner (STEL), can also be formed and used for future prediction. The inTrees framework can applied to both classification and regression problems, and is applicable to many types of tree ensembles, e.g., random forests, regularized random forests, and boosted trees. We implemented the inTrees algorithms in the "inTrees" R package.

artificial intelligence, machine learning, tree ensemble, (17 more...)

arXiv.org Machine Learning

1408.5456

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

BET: Bayesian Ensemble Trees for Clustering and Prediction in Heterogeneous Data

Duan, Leo L., Clancy, John P., Szczesniak, Rhonda D.

arXiv.org Machine LearningAug-18-2014

We propose a novel "tree-averaging" model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian ensemble trees (BET) and model them as an infinite mixture Dirichlet process. We show that BET adapts to data heterogeneity and accurately estimates each component. Compared with the bootstrap-aggregating approach, BET shows improved prediction performance with fewer trees. We develop an efficient estimating procedure with improved sampling strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations, classification of breast cancer and regression of lung function measurement of cystic fibrosis patients.

artificial intelligence, machine learning, random forest, (18 more...)

arXiv.org Machine Learning

1408.414

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.49)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.37)
Health & Medicine > Therapeutic Area > Genetic Disease (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

An Evasion and Counter-Evasion Study in Malicious Websites Detection

Xu, Li, Zhan, Zhenxin, Xu, Shouhuai, Ye, Keyin

arXiv.org Artificial IntelligenceAug-8-2014

Malicious websites are a major cyber attack vector, and effective detection of them is an important cyber defense task. The main defense paradigm in this regard is that the defender uses some kind of machine learning algorithms to train a detection model, which is then used to classify websites in question. Unlike other settings, the following issue is inherent to the problem of malicious websites detection: the attacker essentially has access to the same data that the defender uses to train its detection models. This 'symmetry' can be exploited by the attacker, at least in principle, to evade the defender's detection models. In this paper, we present a framework for characterizing the evasion and counter-evasion interactions between the attacker and the defender, where the attacker attempts to evade the defender's detection models by taking advantage of this symmetry. Within this framework, we show that an adaptive attacker can make malicious websites evade powerful detection models, but proactive training can be an effective counter-evasion defense mechanism. The framework is geared toward the popular detection model of decision tree, but can be adapted to accommodate other classifiers.

algorithm, attacker, feature vector, (16 more...)

arXiv.org Artificial Intelligence

1408.1993

Country: North America > United States > Texas (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.66)

Add feedback

Decision Trees for Function Evaluation - Simultaneous Optimization of Worst and Expected Cost

Cicalese, Ferdinando, Laber, Eduardo, Saettler, Aline Medeiros

arXiv.org Artificial IntelligenceJul-26-2014

In several applications of automatic diagnosis and active learning a central problem is the evaluation of a discrete function by adaptively querying the values of its variables until the values read uniquely determine the value of the function. In general, the process of reading the value of a variable might involve some cost, computational or even a fee to be paid for the experiment required for obtaining the value. This cost should be taken into account when deciding the next variable to read. The goal is to design a strategy for evaluating the function incurring little cost (in the worst case or in expectation according to a prior distribution on the possible variables' assignments). Our algorithm builds a strategy (decision tree) which attains a logarithmic approxima- tion simultaneously for the expected and worst cost spent. This is best possible under the assumption that $P \neq NP.$

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1309.2796

Country:

South America > Brazil (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Italy (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.64)

Add feedback

Recognizing Blind Spot Check Activity with Car Drivers Based on Decision Tree Classifier Approach

Kedowide, Colombiano (Télé-Université du Québec) | Gouin-Vallerand, Charles (Télé-Université du Québec) | Vallieres, Evelyne (Télé-Université du Québec)

AAAI ConferencesJul-22-2014

Blind spot check is important driving activity that is a good indicator of drivers’ proficiency and vigilance. By recognizing the blind spot check activity with drivers, it is possible to quantify and qualify the proficiency of the drivers, but also to cross validate this information with other data such the fatigue level. Thus, in this paper, we present a blind spot check activity recognition system where decision tree classifiers are modeled for each drivers and are used to automatically recognize the blind spot checks.

car driver, decision tree classifier approach, recognizing blind spot check activity

AAAI Conferences

Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.60)

Add feedback

On Boosting Sparse Parities

Reyzin, Lev (University of Illinois at Chicago)

AAAI ConferencesJul-14-2014

While boosting has been extensively studied, considerablyless attention has been devoted to the task of designing good weaklearning algorithms. In this paper we consider the problem of designing weak learners thatare especially adept to the boosting procedure and specifically the AdaBoost algorithm. First we describe conditions desirable for a weak learning algorithm. We then propose using sparse parity functions as weak learners, which have many of our desired properties, as weak learners in boosting. Our experimental tests show the proposed weak learners tobe competitive with the most widely used ones: decisionstumps and pruned decision trees.

algorithm, learner, weak learner, (14 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

Growing Regression Forests by Classification: Applications to Object Pose Estimation

Hara, Kota, Chellappa, Rama

arXiv.org Machine LearningJul-14-2014

In this work, we propose a novel node splitting method for regression trees and incorporate it into the regression forest framework. Unlike traditional binary splitting, where the splitting rule is selected from a predefined set of binary splitting rules via trial-and-error, the proposed node splitting method first finds clusters of the training data which at least locally minimize the empirical loss without considering the input space. Then splitting rules which preserve the found clusters as much as possible are determined by casting the problem into a classification problem. Consequently, our new node splitting method enjoys more freedom in choosing the splitting rules, resulting in more efficient tree structures. In addition to the Euclidean target space, we present a variant which can naturally deal with a circular target space by the proper use of circular statistics. We apply the regression forest employing our node splitting to head pose estimation (Euclidean target space) and car direction estimation (circular target space) and demonstrate that the proposed method significantly outperforms state-of-the-art methods (38.5% and 22.5% error reduction respectively).

regression forest, splitting rule, target space, (14 more...)

arXiv.org Machine Learning

1312.643

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback

Cross-Lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis

AAAI ConferencesJul-14-2014

Creating knowledge bases based on the crowd-sourced wikis, like Wikipedia, has attracted significant research interest in the field of intelligent Web. However, the derived taxonomies usually contain many mistakenly imported taxonomic relations due to the difference between the user-generated subsumption relations and the semantic taxonomic relations. Current approaches to solving the problem still suffer the following issues: (i) the heuristic-based methods strongly rely on specific language dependent rules. (ii) the corpus-based methods depend on a large-scale high-quality corpus, which is often unavailable. In this paper, we formulate the cross-lingual taxonomy derivation problem as the problem of cross-lingual taxonomic relation prediction. We investigate different linguistic heuristics and language independent features, and propose a cross-lingual knowledge validation based dynamic adaptive boosting model to iteratively reinforce the performance of taxonomic relation prediction. The proposed approach successfully overcome the above issues, and experiments show that our approach significantly outperforms the designed state-of-the-art comparison methods.

knowledge management, machine learning, relation, (18 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Add feedback

Regression Trees for Longitudinal Data

Kundu, Madan Gopal, Harezlak, Jaroslaw

arXiv.org Machine LearningJul-13-2014

While studying response trajectory, often the population of interest may be diverse enough to exist distinct subgroups within it and the longitudinal change in response may not be uniform in these subgroups. That is, the timeslope and/or influence of covariates in longitudinal profile may vary among these different subgroups. For example, Raudenbush (2001) used depression as an example to argue that it is incorrect to assume that all the people in a given population would be experiencing either increasing or decreasing levels of depression. In such cases, traditional linear mixed effects model (assuming common parametric form for covariates and time) is not directly applicable for the entire population as a group-averaged trajectory can mask important subgroup differences. Our aim is to identify and characterize longitudinally homogeneous subgroups based on the combination of baseline covariates in the most parsimonious way. This goal can be achieved via constructing regression tree for longitudinal data using baseline covariates as partitioning variables. We have proposed LongCART algorithm to construct regression tree for the longitudinal data. In each node, the proposed LongCART algorithm determines the need for further splitting (i.e. whether parameter(s) of longitudinal profile is influenced by any baseline attributes) via parameter instability tests and thus the decision of further splitting is type-I error controlled. We have obtained the asymptotic results for the proposed instability test and examined finite sample behavior of the whole algorithm through simulation studies. Finally, we have applied the LongCART algorithm to study the longitudinal changes in choline level among HIV patients.

artificial intelligence, machine learning, subgroup, (15 more...)

arXiv.org Machine Learning

1309.7733

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback