AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Random Forests: some methodological insights

Genuer, Robin, Poggi, Jean-Michel, Tuleau, Christine

arXiv.org Machine LearningNov-21-2008

This paper examines from an experimental perspective random forests, the increasingly used statistical method for classification and regression problems introduced by Leo Breiman in 2001. It first aims at confirming, known but sparse, advice for using random forests and at proposing some complementary remarks for both standard problems as well as high dimensional ones for which the number of variables hugely exceeds the sample size. But the main contribution of this paper is twofold: to provide some insights about the behavior of the variable importance index based on random forests and in addition, to propose to investigate two classical issues of variable selection. The first one is to find important variables for interpretation and the second one is more restrictive and try to design a good prediction model. The strategy involves a ranking of explanatory variables using the random forests score of importance and a stepwise ascending variable introduction strategy.

artificial intelligence, machine learning, procedure, (18 more...)

arXiv.org Machine Learning

0811.3619

Country:

North America > United States (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Anytime Induction of Low-cost, Low-error Classifiers: a Sampling-based Approach

Esmeir, S., Markovitch, S.

Journal of Artificial Intelligence ResearchSep-17-2008

Machine learning techniques are gaining prevalence in the production of a wide range of classifiers for complex real-world applications with nonuniform testing and misclassification costs. The increasing complexity of these applications poses a real challenge to resource management during learning and classification. In this work we introduce ACT (anytime cost-sensitive tree learner), a novel framework for operating in such complex environments. ACT is an anytime algorithm that allows learning time to be increased in return for lower classification costs. It builds a tree top-down and exploits additional time resources to obtain better estimations for the utility of the different candidate splits. Using sampling techniques, ACT approximates the cost of the subtree under each candidate split and favors the one with a minimal cost. As a stochastic algorithm, ACT is expected to be able to escape local minima, into which greedy methods may be trapped. Experiments with a variety of datasets were conducted to compare ACT to the state-of-the-art cost-sensitive tree learners. The results show that for the majority of domains ACT produces significantly less costly trees. ACT also exhibits good anytime behavior with diminishing returns.

algorithm, dataset, misclassification cost, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2602

AI Access Foundation

10567

Journal of Artificial Intelligence Research

Country:

North America > United States > Washington > King County > Seattle (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(19 more...)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

A Too-Clever Ranking Method

Oates, Tim (University of Maryland Baltimore County)

AI MagazineJun-15-2008

I developed what I scored, and those with the lowest scores could be removed before running C4.5 to build a decision tree with the remainder. I ran an experiment in which I removed the bottom 10 percent of the instances in a University of California, Irvine (UCI) data set. The resulting tree was smaller and more accurate (as measured by 10-fold CV) than the tree built on the full data set. Then I removed the bottom 20 percent of the instances and got a tree that was smaller than the last one and just as accurate. At that point I had the feeling that this was going to make a great paper for the International Conference on Machine Learning (ICML).

artificial intelligence, machine learning, too-clever ranking method, (6 more...)

AI Magazine

Country: North America > United States > California > Orange County > Irvine (0.26)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.60)

Add feedback

Spectrum of Variable-Random Trees

Liu, F. T., Ting, K. M., Yu, Y., Zhou, Z. H.

Journal of Artificial Intelligence ResearchMay-29-2008

In this paper, we show that a continuous spectrum of randomisation exists, in which most existing tree randomisations are only operating around the two ends of the spectrum. That leaves a huge part of the spectrum largely unexplored. We propose a base learner VR-Tree which generates trees with variable-randomness. VR-Trees are able to span from the conventional deterministic trees to the complete-random trees using a probabilistic parameter. Using VR-Trees as the base models, we explore the entire spectrum of randomised ensembles, together with Bagging and Random Subspace. We discover that the two halves of the spectrum have their distinct characteristics; and the understanding of which allows us to propose a new approach in building better decision tree ensembles. We name this approach Coalescence, which coalesces a number of points in the random-half of the spectrum. Coalescence acts as a committee of ``experts'' to cater for unforeseeable conditions presented in training data. Coalescence is found to perform better than any single operating point in the spectrum, without the need to tune to a specific level of randomness. In our empirical study, Coalescence ranks top among the benchmarking ensemble methods including Random Forests, Random Subspace and C5 Boosting; and only Coalescence is significantly better than Bagging and Max-Diverse Ensemble among all the methods in the comparison. Although Coalescence is not significantly better than Random Forests, we have identified conditions under which one will perform better than the other.

ensemble, random forest, vr-tree, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2470

AI Access Foundation

10550

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
(8 more...)

Genre:

Research Report (0.68)
Overview (0.46)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Fast Discriminative Visual Codebooks using Randomized Clustering Forests

Moosmann, Frank, Triggs, Bill, Jurie, Frederic

Neural Information Processing SystemsDec-31-2007

Large numbers of descriptors and large codebooks are needed for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests - ensembles of randomly created clustering trees - and show that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks.

classification, descriptor, randomized clustering forest, (15 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Styria > Graz (0.05)
North America > United States > Wisconsin (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > France (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Fast Discriminative Visual Codebooks using Randomized Clustering Forests

Moosmann, Frank, Triggs, Bill, Jurie, Frederic

Neural Information Processing SystemsDec-31-2007

classification, descriptor, randomized clustering forest, (15 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Styria > Graz (0.05)
North America > United States > Wisconsin (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > France (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Fast Discriminative Visual Codebooks using Randomized Clustering Forests

Moosmann, Frank, Triggs, Bill, Jurie, Frederic

Neural Information Processing SystemsDec-31-2007

artificial intelligence, descriptor, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Using Linguistic Cues for the Automatic Recognition of Personality in Conversation and Text

Mairesse, F., Walker, M. A., Mehl, M. R., Moore, R. K.

Journal of Artificial Intelligence ResearchNov-28-2007

It is well known that utterances convey a great deal of information about the speaker in addition to their semantic content. One such type of information consists of cues to the speaker's personality traits, the most fundamental dimension of variation between humans. Recent work explores the automatic detection of other types of pragmatic variation in text and conversation, such as emotion, deception, speaker charisma, dominance, point of view, subjectivity, opinion and sentiment. Personality affects these other aspects of linguistic production, and thus personality recognition may be useful for these tasks, in addition to many other potential applications. However, to date, there is little work on the automatic recognition of personality traits. This article reports experimental results for recognition of all Big Five personality traits, in both conversation and text, utilising both self and observer ratings of personality. While other work reports classification results, we experiment with classification, regression and ranking models. For each model, we analyse the effect of different feature sets on accuracy. Results show that for some traits, any type of statistical model performs significantly better than the baseline, but ranking models perform best overall. We also present an experiment suggesting that ranking models are more accurate than multi-class classifiers for modelling personality. In addition, recognition models trained on observed personality perform better than models trained using self-reports, and the optimal feature set depends on the personality trait. A qualitative analysis of the learned models confirms previous findings linking language and personality, while revealing many new linguistic markers.

corpus, extraversion, personality, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2349

AI Access Foundation

10520

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
(2 more...)

Add feedback

A Comparison of Different Machine Transliteration Models

Oh, J., Choi, K., Isahara, H.

Journal of Artificial Intelligence ResearchOct-18-2006

Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models -- grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model -- have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same framework, 2) comparing them under the same conditions, and 3) developing a way to improve machine transliteration through this comparison. Our comparison showed that the hybrid and correspondence-based models were the most effective and that the four models can be used in a complementary manner to improve machine transliteration performance.

grapheme, transliteration, transliteration model, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1999

AI Access Foundation

10468

Journal of Artificial Intelligence Research

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

Add feedback

Using Random Forests in the Structured Language Model

Xu, Peng, Jelinek, Frederick

Neural Information Processing SystemsDec-31-2005

In this paper, we explore the use of Random Forests (RFs) in the structured language model (SLM), which uses rich syntactic information in predicting the next word based on words already seen. The goal in this work is to construct RFs by randomly growing Decision Trees (DTs) using syntactic information and investigate the performance of the SLM modeled by the RFs in automatic speech recognition. RFs, which were originally developed as classifiers, are a combination of decision tree classifiers. Each tree is grown based on random training data sampled independently and with the same distribution for all trees in the forest, and a random selection of possible questions at each node of the decision tree. Our approach extends the original idea of RFs to deal with the data sparseness problem encountered in language modeling. RFs have been studied in the context of n-gram language modeling and have been shown to generalize well to unseen data. We show in this paper that RFs using syntactic information can also achieve better performance in both perplexity (PPL) and word error rate (WER) in a large vocabulary speech recognition system, compared to a baseline that uses Kneser-Ney smoothing.

language model, probability, random forest, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback