AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Plan-Based Policy-Learning for Autonomous Feature Tracking

Fox, Maria (King's College London) | Long, Derek (King's College London ) | Magazzeni, Daniele (King's College London)

AAAI ConferencesJun-8-2012

Mapping and tracking biological ocean features, such as harmful algal blooms, is an important problem in the environmental sciences. The problem exhibits a high degree of uncertainty, because of both the dynamic ocean context and the challenges of sensing. Plan-based policy learning has been shown to be a powerful technique for obtaining robust intelligent behaviour in the face of uncertainty. In this paper we apply this technique in simulation, to the problem of tracking the outer edge of 2D biological features, such as the surfaces of harmful algal blooms. We show that plan-based policy-learning leads to highly accurate tracking in simulation, even in situations where the uncertainty governing the shape of the patch cannot be directly modelled. We present simulation results that give confidence that the approach could work in practice. We are now collaborating with ocean scientists at MBARI to perform physical tests at sea.

auv, contour, threshold, (17 more...)

AAAI Conferences

Twenty-Second International Conference on Automated Planning and Scheduling

Country:

North America > United States > California > Monterey County > Monterey (0.04)
North America > Mexico (0.04)
Atlantic Ocean > Gulf of Mexico (0.04)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Add feedback

Soil Data Analysis Using Classification Techniques and Soil Attribute Prediction

Gholap, Jay, Ingole, Anurag, Gohil, Jayesh, Gargade, Shailesh, Attar, Vahida

arXiv.org Machine LearningJun-7-2012

Agricultural research has been profited by technical advances such as automation, data mining. Today, data mining is used in a vast areas and many off-the-shelf data mining system products and domain specific data mining application soft wares are available, but data mining in agricultural soil datasets is a relatively a young research field. The large amounts of data that are nowadays virtually harvested along with the crops have to be analyzed and should be used to their full extent. This research aims at analysis of soil dataset using data mining techniques. It focuses on classification of soil using various algorithms available. Another important purpose is to predict untested attributes using regression technique, and implementation of automated soil sample classification.

classification, data mining, machine learning, (14 more...)

arXiv.org Machine Learning

1206.1557

Country:

Asia > India > Maharashtra (0.15)
Oceania > New Zealand > North Island > Waikato (0.14)

Genre: Research Report (0.64)

Industry:

Food & Agriculture > Agriculture (1.00)
Government > Regional Government (0.70)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.76)
(2 more...)

Add feedback

The Artificial Regression Market

Lay, Nathan, Barbu, Adrian

arXiv.org Machine LearningApr-18-2012

The Artificial Prediction Market is a recent machine learning technique for multi-class classification, inspired from the financial markets. It involves a number of trained market participants that bet on the possible outcomes and are rewarded if they predict correctly. This paper generalizes the scope of the Artificial Prediction Markets to regression, where there are uncountably many possible outcomes and the error is usually the MSE. For that, we introduce the reward kernel that rewards each participant based on its prediction error and we derive the price equations. Using two reward kernels we obtain two different learning rules, one of which is approximated using Hermite-Gauss quadrature. The market setting makes it easy to aggregate specialized regressors that only predict when an observation falls into their specialization domain. Experiments show that regression markets based on the two learning rules outperform Random Forest Regression on many UCI datasets and are rarely outperformed.

artificial intelligence, machine learning, prediction market, (14 more...)

arXiv.org Machine Learning

1204.4154

Country:

North America > United States > Florida > Leon County > Tallahassee (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.83)

Industry: Banking & Finance > Trading (0.79)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.36)

Add feedback

Analysis of a Random Forests Model

Biau, Gérard

arXiv.org Machine LearningMar-26-2012

In a series of papers and technical reports, Breiman [9, 10, 11, 12] demonstrated that substantial gains in classification and regression accuracy can be achieved by using ensembles of trees, where each tree in the ensemble is grown in accordance with a random parameter. Final predictions are obtained by aggregating over the ensemble. As the base constituents of the ensemble are tree-structured predictors, and since each of these trees is constructed using an injection of randomness, these procedures are called "random forests". Breiman's ideas were decisively influenced by the early work of Amit and Geman [3] on geometric feature selection, the random subspace method of Ho [27] and the random split selection approach of Dietterich [21]. As highlighted by various empirical studies (see [11, 36, 20, 24, 25] for instance), random forests have emerged as serious competitors to state-of-the-art methods such as boosting (Freund [22]) and support vector machines (Shawe-Taylor and Cristianini [35]).

artificial intelligence, log 2, machine learning, (18 more...)

arXiv.org Machine Learning

1005.0208

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Add feedback

Feature Selection via Regularized Trees

Deng, Houtao, Runger, George

arXiv.org Machine LearningMar-21-2012

We propose a tree regularization framework, which enables many tree models to perform feature selection efficiently. The key idea of the regularization framework is to penalize selecting a new feature for splitting when its gain (e.g. information gain) is similar to the features used in previous splits. The regularization framework is applied on random forest and boosted trees here, and can be easily applied to other tree models. Experimental studies show that the regularized trees can select high-quality feature subsets with regard to both strong and weak classifiers. Because tree models can naturally deal with categorical and numerical variables, missing values, different scales between variables, interactions and nonlinearities etc., the tree regularization framework provides an effective and efficient feature selection solution for many practical problems.

artificial intelligence, machine learning, regularization framework, (18 more...)

arXiv.org Machine Learning

1201.1587

Country: North America > United States (0.68)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Risk Bounds for CART Classifiers under a Margin Condition

Gey, Servane

arXiv.org Machine LearningMar-1-2012

Risk bounds for Classification and Regression Trees (CART, Breiman et. al. 1984) classifiers are obtained under a margin condition in the binary supervised classification framework. These risk bounds are obtained conditionally on the construction of the maximal deep binary tree and permit to prove that the linear penalty used in the CART pruning algorithm is valid under a margin condition. It is also shown that, conditionally on the construction of the maximal tree, the final selection by test sample does not alter dramatically the estimation accuracy of the Bayes classifier. In the two-class classification framework, the risk bounds that are proved, obtained by using penalized model selection, validate the CART algorithm which is used in many data mining applications such as Biology, Medicine or Image Coding.

algorithm, artificial intelligence, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1016/j.patcog.2012.02.021

0902.3130

Country: Europe (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

Information Forests

Yi, Zhao, Soatto, Stefano, Dewan, Maneesh, Zhan, Yiqiang

arXiv.org Machine LearningFeb-7-2012

We describe Information Forests, an approach to classification that generalizes Random Forests by replacing the splitting criterion of non-leaf nodes from a discriminative one -- based on the entropy of the label distribution -- to a generative one -- based on maximizing the information divergence between the class-conditional distributions in the resulting partitions. The basic idea consists of deferring classification until a measure of "classification confidence" is sufficiently high, and instead breaking down the data so as to maximize this measure. In an alternative interpretation, Information Forests attempt to partition the data into subsets that are "as informative as possible" for the purpose of the task, which is to classify the data. Classification confidence, or informative content of the subsets, is quantified by the Information Divergence. Our approach relates to active learning, semi-supervised learning, mixed generative/discriminative learning.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Machine Learning

1202.1523

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.52)

Add feedback

Random Forests for Metric Learning with Implicit Pairwise Position Dependence

Xiong, Caiming, Johnson, David, Xu, Ran, Corso, Jason J.

arXiv.org Machine LearningJan-3-2012

Metric learning makes it plausible to learn distances for complex distributions of data from labeled data. However, to date, most metric learning methods are based on a single Mahalanobis metric, which cannot handle heterogeneous data well. Those that learn multiple metrics throughout the space have demonstrated superior accuracy, but at the cost of computational efficiency. Here, we take a new angle to the metric learning problem and learn a single metric that is able to implicitly adapt its distance function throughout the feature space. This metric adaptation is accomplished by using a random forest-based classifier to underpin the distance function and incorporate both absolute pairwise position and standard relative position into the representation. We have implemented and tested our method against state of the art global and multi-metric methods on a variety of data sets. Overall, the proposed method outperforms both types of methods in terms of accuracy (consistently ranked first) and is an order of magnitude faster than state of the art multi-metric methods (16x faster in the worst case).

artificial intelligence, machine learning, metric, (14 more...)

arXiv.org Machine Learning

1201.061

Genre: Research Report > New Finding (0.46)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.64)

Add feedback

Variance Penalizing AdaBoost

Shivaswamy, Pannagadatta K., Jebara, Tony

Neural Information Processing SystemsDec-31-2011

This paper proposes a novel boosting algorithm called VadaBoost which is motivated by recent empirical Bernstein bounds. VadaBoost iteratively minimizes a cost function that balances the sample mean and the sample variance of the exponential loss. Each step of the proposed algorithm minimizes the cost efficiently by providing weighted data to a weak learner rather than requiring a brute force evaluation of all possible weak learners. Thus, the proposed algorithm solves a key limitation of previous empirical Bernstein boosting methods which required brute force enumeration of all possible weak learners. Experimental results confirm that the new algorithm achieves the performance improvements of EBBoost yet goes beyond decision stumps to handle any weak learner. Significant performance gains are obtained over AdaBoost for arbitrary weak learners including decision trees (CART).

artificial intelligence, decision tree learning, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.88)

Add feedback

Applying Fuzzy ID3 Decision Tree for Software Effort Estimation

Elyassami, Sanaa, Idri, Ali

arXiv.org Artificial IntelligenceNov-1-2011

Web Effort Estimation is a process of predicting the efforts and cost in terms of money, schedule and staff for any software project system. Many estimation models have been proposed over the last three decades and it is believed that it is a must for the purpose of: Budgeting, risk analysis, project planning and control, and project improvement investment analysis. In this paper, we investigate the use of Fuzzy ID3 decision tree for software cost estimation; it is designed by integrating the principles of ID3 decision tree and the fuzzy set-theoretic concepts, enabling the model to handle uncertain and imprecise data when describing the software projects, which can improve greatly the accuracy of obtained estimates. MMRE and Pred are used as measures of prediction accuracy for this study. A series of experiments is reported using two different software projects datasets namely, Tukutuku and COCOMO'81 datasets. The results are compared with those produced by the crisp version of the ID3 decision tree.

artificial intelligence, fuzzy logic, machine learning, (13 more...)

arXiv.org Artificial Intelligence

1111.0158

Country:

Africa > Middle East > Morocco > Rabat-Salé-Kénitra Region > Rabat (0.05)
North America > United States > New York (0.04)
North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.52)

Add feedback