AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Decision Forest: A Nonparametric Approach to Modeling Irrational Choice

Chen, Yi-Chun, Mišić, Velibor V.

arXiv.org Machine LearningApr-25-2019

Customer behavior is often assumed to follow weak rationality, which implies that adding a product to an assortment will not increase the choice probability of another product in that assortment. However, an increasing amount of research has revealed that customers are not necessarily rational when making decisions. In this paper, we study a new nonparametric choice model that relaxes this assumption and can model a wider range of customer behavior, such as decoy effects between products. In this model, each customer type is associated with a binary decision tree, which represents a decision process for making a purchase based on checking for the existence of specific products in the assortment. Together with a probability distribution over customer types, we show that the resulting model -- a decision forest -- is able to represent any customer choice model, including models that are inconsistent with weak rationality. We theoretically characterize the depth of the forest needed to fit a data set of historical assortments and prove that asymptotically, a forest whose depth scales logarithmically in the number of assortments is sufficient to fit most data sets. We also propose an efficient algorithm for estimating such models from data, based on combining randomization and optimization. Using synthetic data and real transaction data exhibiting non-rational behavior, we show that the model outperforms the multinomial logit and ranking-based models in out-of-sample predictive ability.

artificial intelligence, assortment, machine learning, (14 more...)

arXiv.org Machine Learning

1904.11532

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Regression-Enhanced Random Forests

Zhang, Haozhe, Nettleton, Dan, Zhu, Zhengyuan

arXiv.org Machine LearningApr-23-2019

In the last few years, there have been many methodological and theoretical advances in the random forests approach. Some methodological developments and extensions include case-specific random forests [19], multivariate random forests [16], quantile regression forests [13], random survival forests [11], enriched random forests for microarry data [1] and predictor augmentation in random forests [18] among others. For theoretical developments, the statistical and asymptotic properties of random forests have been intensively investigated. Advances have been made in the areas such as consistency [2] [15], variable selection [8] and the construction of confidence intervals [17]. Although RF methodology has proven itself to be a reliable predictive approach in many application areas [3][10], there are some cases where random forests may suffer. First, as a fully nonparametric predictive algorithm, random forests may not efficiently incorporate known relationships between the response and the predictors. Second, random forests may fail in extrapolation problems where predictions are required at points out of the domain of the training dataset. For regression problems, a random forest prediction is an average of the predictions produced by the trees in the forest.

artificial intelligence, machine learning, random forest, (18 more...)

arXiv.org Machine Learning

1904.10416

Country: North America > United States > Iowa (0.17)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Add feedback

Random Pairwise Shapelets Forest

Shi, Mohan, Wang, Zhihai, Yuan, Jodong, Liu, Haiyang

arXiv.org Machine LearningApr-22-2019

Shapelet is a discriminative subsequence of time series. An advanced shapelet-based method is to embed shapelet into accurate and fast random forest. However, it shows several limitations. First, random shapelet forest requires a large training cost for split threshold searching. Second, a single shapelet provides limited information for only one branch of the decision tree, resulting in insufficient accuracy and interpretability. Third, randomized ensemble causes interpretability declining. For that, this paper presents Random Pairwise Shapelets Forest (RPSF). RPSF combines a pair of shapelets from different classes to construct random forest. It omits threshold searching to be more efficient, includes more information for each node of the forest to be more effective. Moreover, a discriminability metric, Decomposed Mean Decrease Impurity (DMDI), is proposed to identify influential region for every class. Extensive experiments show RPSF improves the accuracy and training speed of shapelet-based forest. Case studies demonstrate the interpretability of our method.

decision tree learning, machine learning, random pairwise shapelet forest, (1 more...)

arXiv.org Machine Learning

1903.07799

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)

Add feedback

Integrating Association Rules with Decision Trees in Object-Relational Databases

Ayyagari, Maruthi Rohit

arXiv.org Artificial IntelligenceApr-21-2019

Research has provided evidence that associative classification produces more accurate results compared to other classification models. The Classification Based on Association (CBA) is one of the famous Associative Classification algorithms that generates accurate classifiers. However, current association classification algorithms reside external to databases, which reduces the flexibility of enterprise analytics systems. This paper implements the CBA in Oracle database using two variant models: hardcoding the CBA in Oracle Data Mining (ODM) package and Integrating Oracle Apriori model with the Oracle Decision tree model. We compared the proposed model performance with Naive Bayes, Support Vector Machine, Random Forests, and Decision Tree over 18 datasets from UCI. Results showed that our models outperformed the original CBA model with 1 percent and is competitive to chosen classification models over benchmark datasets.

artificial intelligence, association rule, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.14445/22312803/IJCTT-V67I3P120

1904.09654

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.55)

Industry: Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Continuous-Time Birth-Death MCMC for Bayesian Regression Tree Models

Mohammadi, Reza, Pratola, Matthew, Kaptein, Maurits

arXiv.org Machine LearningApr-19-2019

Decision trees are flexible models that are well suited for many statistical regression problems. In a Bayesian framework for regression trees, Markov Chain Monte Carlo (MCMC) search algorithms are required to generate samples of tree models according to their posterior probabilities. The critical component of such an MCMC algorithm is to construct good Metropolis-Hastings steps for updating the tree topology. However, such algorithms frequently suffering from local mode stickiness and poor mixing. As a result, the algorithms are slow to converge. Hitherto, authors have primarily used discrete-time birth/death mechanisms for Bayesian (sums of) regression tree models to explore the model space. These algorithms are efficient only if the acceptance rate is high which is not always the case. Here we overcome this issue by developing a new search algorithm which is based on a continuous-time birth-death Markov process. This search algorithm explores the model space by jumping between parameter spaces corresponding to different tree structures. In the proposed algorithm, the moves between models are always accepted which can dramatically improve the convergence and mixing properties of the MCMC algorithm. We provide theoretical support of the algorithm for Bayesian regression tree models and demonstrate its performance.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

1904.09339

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
(2 more...)

Add feedback

Visualizing the decision-making process in deep neural decision forest

Li, Shichao, Cheng, Kwang-Ting

arXiv.org Artificial IntelligenceApr-19-2019

Deep neural decision forest (NDF) achieved remarkable performance on various vision tasks via combining decision tree and deep representation learning. In this work, we first trace the decision-making process of this model and visualize saliency maps to understand which portion of the input influence it more for both classification and regression problems. We then apply NDF on a multi-task coordinate regression problem and demonstrate the distribution of routing probabilities, which is vital for interpreting NDF yet not shown for regression problems. The pre-trained model and code for visualization will be available at https://github.com/Nicholasli1995/VisualizingNDF

artificial intelligence, decision-making process, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1904.09201

Country: Asia > China > Hong Kong (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Add feedback

Scalable and Efficient Hypothesis Testing with Random Forests

Coleman, Tim, Peng, Wei, Mentch, Lucas

arXiv.org Machine LearningApr-16-2019

Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods. While their black-box nature has made their mathematical analysis difficult, recent work has established important statistical properties like consistency and asymptotic normality by considering subsampling in lieu of bootstrapping. Though such results open the door to traditional inference procedures, all formal methods suggested thus far place severe restrictions on the testing framework and their computational overhead precludes their practical scientific use. Here we propose a permutation-style testing approach to formally assess feature significance. We establish asymptotic validity of the test via exchangeability arguments and show that the test maintains high power with orders of magnitude fewer computations. As importantly, the procedure scales easily to big data settings where large training and testing sets may be employed without the need to construct additional models. Simulations and applications to ecological data where random forests have recently shown promise are provided.

artificial intelligence, machine learning, random forest, (19 more...)

arXiv.org Machine Learning

1904.0783

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

Learning Optimal Decision Trees from Large Datasets

Avellaneda, Florent

arXiv.org Machine LearningApr-12-2019

Inferring a decision tree from a given dataset is one of the classic problems in machine learning. This problem consists of buildings, from a labelled dataset, a tree such that each node corresponds to a class and a path between the tree root and a leaf corresponds to a conjunction of features to be satisfied in this class. Following the principle of parsimony, we want to infer a minimal tree consistent with the dataset. Unfortunately, inferring an optimal decision tree is known to be NP-complete for several definitions of optimality. Hence, the majority of existing approaches relies on heuristics, and as for the few exact inference approaches, they do not work on large data sets. In this paper, we propose a novel approach for inferring a decision tree of a minimum depth based on the incremental generation of Boolean formula. The experimental results indicate that it scales sufficiently well and the time it takes to run grows slowly with the size of dataset.

artificial intelligence, decision tree, machine learning, (16 more...)

arXiv.org Machine Learning

1904.06314

Country:

North America > Canada (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Quantifying Interpretability of Arbitrary Machine Learning Models Through Functional Decomposition

Molnar, Christoph, Casalicchio, Giuseppe, Bischl, Bernd

arXiv.org Machine LearningApr-8-2019

To obtain interpretable machine learning models, either interpretable models are constructed from the outset - e.g. shallow decision trees, rule lists, or sparse generalized linear models - or post-hoc interpretation methods - e.g. partial dependence or ALE plots - are employed. Both approaches have disadvantages. While the former can restrict the hypothesis space too conservatively, leading to potentially suboptimal solutions, the latter can produce too verbose or misleading results if the resulting model is too complex, especially w.r.t. feature interactions. We propose to make the compromise between predictive power and interpretability explicit by quantifying the complexity / interpretability of machine learning models. Based on functional decomposition, we propose measures of number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures becomes more reliable and compact. Furthermore, we demonstrate the application of such measures in a multi-objective optimization approach which considers predictive power and interpretability at the same time.

artificial intelligence, interpretability, machine learning, (17 more...)

arXiv.org Machine Learning

1904.03867

Country:

Europe > Germany (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Unveiling phase transitions with machine learning

Canabarro, Askery, Fanchini, Felipe Fernandes, Malvezzi, André Luiz, Pereira, Rodrigo, Chaves, Rafael

arXiv.org Machine LearningApr-2-2019

The classification of phase transitions is a central and challenging task in condensed matter physics. Typically, it relies on the identification of order parameters and the analysis of singularities in the free energy and its derivatives. Here, we propose an alternative framework to identify quantum phase transitions, employing both unsupervised and supervised machine learning techniques. Using the axial next-nearest neighbor Ising (ANNNI) model as a benchmark, we show how unsupervised learning can detect three phases (ferromagnetic, paramagnetic, and a cluster of the antiphase with the floating phase) as well as two distinct regions within the paramagnetic phase. Employing supervised learning we show that transfer learning becomes possible: a machine trained only with nearest-neighbour interactions can learn to identify a new type of phase occurring when next-nearest-neighbour interactions are introduced. All our results rely on few and low dimensional input data (up to twelve lattice sites), thus providing a computational friendly and general framework for the study of phase transitions in many-body systems.

artificial intelligence, machine learning, transition, (19 more...)

arXiv.org Machine Learning

1904.01486

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)
(2 more...)

Add feedback