AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
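
To make the quoted idea concrete, here is a minimal sketch of a decision tree read as a Boolean function. The attribute names (`raining`, `weekend`, `has_umbrella`) and the tuple encoding are hypothetical and chosen only for illustration; they do not come from the book.

```python
from itertools import product

# Hypothetical tree. Internal nodes are (attribute, subtree_if_false, subtree_if_true);
# leaves are plain booleans (the yes/no decision).
TREE = ("raining", ("weekend", False, True), ("has_umbrella", False, True))


def decide(tree, example):
    """Follow the branching tests from the root until a leaf is reached."""
    while not isinstance(tree, bool):
        attribute, if_false, if_true = tree
        tree = if_true if example[attribute] else if_false
    return tree


def truth_table(tree, attributes):
    """Enumerate every assignment, showing the Boolean function the tree represents."""
    table = {}
    for values in product([False, True], repeat=len(attributes)):
        table[values] = decide(tree, dict(zip(attributes, values)))
    return table


if __name__ == "__main__":
    for row, answer in truth_table(TREE, ["raining", "weekend", "has_umbrella"]).items():
        print(row, "->", answer)
```

Under this encoding the tree computes (not raining and weekend) or (raining and has_umbrella); any Boolean function over the attributes can be written as such a tree, although possibly a large one.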

Stability of Random Forests and Coverage of Random-Forest Prediction Intervals

Neural Information Processing Systems | Jan-18-2025, 21:36:37 GMT

We establish stability of random forests under the mild condition that the squared response ($Y^2$) does not have a heavy tail. In particular, our analysis holds for the practical version of random forests that is implemented in popular packages like randomForest in R. Empirical results show that stability may persist even beyond our assumption and hold for heavy-tailed $Y^2$. Using the stability property, we prove a non-asymptotic lower bound for the coverage probability of prediction intervals constructed from the out-of-bag error of random forests. With another mild condition that is typically satisfied when $Y$ is continuous, we also establish a complementary upper bound, which can be similarly established for the jackknife prediction interval constructed from an arbitrary stable algorithm.
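
As a rough illustration of the jackknife prediction interval mentioned at the end of the abstract, the sketch below uses the common textbook form: the full-data prediction plus or minus an empirical quantile of the absolute leave-one-out residuals. The function names, the one-dimensional least-squares learner standing in for "an arbitrary algorithm", and the quantile index are assumptions made for this example, not the paper's construction.

```python
import math


def fit_line(xs, ys):
    """Least-squares line in one dimension; a stand-in for any regression algorithm."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def jackknife_interval(xs, ys, x_new, alpha=0.1, fit=fit_line):
    """Return (lower, upper) for the response at x_new from leave-one-out residuals."""
    n = len(xs)
    # Index of the order statistic used as the interval half-width (assumed convention).
    k = math.ceil((1 - alpha) * (n + 1))
    if k > n:
        # Too few points for the requested level: the interval is unbounded.
        return (-math.inf, math.inf)
    residuals = []
    for i in range(n):
        # Refit without point i, then score the held-out point.
        model_i = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        residuals.append(abs(ys[i] - model_i(xs[i])))
    residuals.sort()
    half_width = residuals[k - 1]
    center = fit(xs, ys)(x_new)
    return (center - half_width, center + half_width)
```

Here `xs` and `ys` are plain Python lists. For a random forest, the out-of-bag residuals play a role similar to the leave-one-out residuals here, without refitting the model n times.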

coverage probability, random forest and coverage, random-forest prediction interval, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Depth is More Powerful than Width with Prediction Concatenation in Deep Forest

Neural Information Processing Systems | Jan-18-2025, 18:24:17 GMT

Random Forest (RF) is an ensemble learning algorithm proposed by Breiman (2001) that constructs a large number of randomized decision trees individually and aggregates their predictions by naive averaging. The prediction concatenation (PreConc) operation is crucial for the multi-layer feature transformation in deep forest (DF), though little has been known about its theoretical properties. In this paper, we analyze the influence of PreConc on the consistency of deep forest. Especially when the individual tree is inconsistent (as in practice, the individual tree is often set to be fully grown, i.e., there is only one sample at each leaf node), we find that the convergence rate of two-layer DF w.r.t. the number of trees $M$ can reach $\mathcal{O}(1/M^2)$ under some mild conditions, while the convergence rate of RF is $\mathcal{O}(1/M)$. Therefore, with the help of PreConc, DF with deeper layers will be more powerful than DF with shallower layers.

deep forest, multi-layer feature transformation, prediction concatenation, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

On Computing Probabilistic Explanations for Decision Trees

Neural Information Processing Systems | Jan-18-2025, 16:09:57 GMT

Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models. Inside formal XAI, one of the most studied cases is that of explaining the choices taken by decision trees, as they are traditionally deemed as one of the most interpretable classes of models. Recent work has focused on studying the computation of sufficient reasons, a kind of explanation in which given a decision tree $T$ and an instance $x$, one explains the decision $T(x)$ by providing a subset $y$ of the features of $x$ such that for any other instance $z$ compatible with $y$, it holds that $T(z) = T(x)$, intuitively meaning that the features in $y$ are already enough to fully justify the classification of $x$ by $T$. It has been argued, however, that sufficient reasons constitute a restrictive notion of explanation. For such a reason, the community has started to study their probabilistic counterpart, in which one requires that the probability of $T(z) = T(x)$ must be at least some value $\delta \in (0, 1]$, where $z$ is a random instance that is compatible with $y$.
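
The definitions in this abstract can be checked by brute force on a small tree. The sketch below is only an illustration under stated assumptions: features are binary, the random instance z is drawn uniformly over the features left free, and the tuple encoding and function names are hypothetical. It enumerates all completions, so it is exponential in the number of free features and says nothing about the algorithms studied in the paper.

```python
from itertools import product

# Hypothetical tree over three binary features. Leaves are class labels (0 or 1);
# internal nodes are (feature_index, subtree_if_0, subtree_if_1).
# This one computes T(x) = x[0] or x[1]; feature 2 is never tested.
TREE = (0, (1, 0, 1), 1)


def classify(tree, x):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        feature, if_zero, if_one = tree
        tree = if_one if x[feature] else if_zero
    return tree


def agreement_probability(tree, x, kept):
    """P[T(z) = T(x)] for z uniform among instances agreeing with x on `kept`."""
    free = [i for i in range(len(x)) if i not in kept]
    target = classify(tree, x)
    hits = 0
    for values in product([0, 1], repeat=len(free)):
        z = list(x)
        for i, v in zip(free, values):
            z[i] = v
        hits += classify(tree, z) == target
    return hits / 2 ** len(free)


def is_delta_sufficient(tree, x, kept, delta=1.0):
    """delta = 1 recovers the (non-probabilistic) sufficient reason."""
    return agreement_probability(tree, x, kept) >= delta
```

For `x = (1, 0, 0)`, keeping only feature 0 is a sufficient reason (probability 1), while keeping no features at all still agrees with T(x) on 6 of the 8 instances, so the empty set is a 0.75-sufficient reason under the uniform assumption.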

computing probabilistic explanation, decision tree, delta-sufficient-reason, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)

SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems

Neural Information Processing Systems | Jan-18-2025, 07:38:20 GMT

Gradient Boosted Decision Tree (GBDT) is a widely-used machine learning algorithm that has been shown to achieve state-of-the-art results on many standard data science problems. We are interested in its application to multioutput problems when the output is highly multidimensional. Although there are highly effective GBDT implementations, their scalability to such problems is still unsatisfactory. In this paper, we propose novel methods aiming to accelerate the training process of GBDT in the multioutput scenario. The idea behind these methods lies in the approximate computation of a scoring function used to find the best split of decision trees.

fast gradient boosted decision tree, gradient boosted decision tree, sketchboost, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.66)

Positive-Unlabeled Learning using Random Forests via Recursive Greedy Risk Minimization

Neural Information Processing Systems | Jan-18-2025, 01:36:30 GMT

The need to learn from positive and unlabeled data, or PU learning, arises in many applications and has attracted increasing interest. While random forests are known to perform well on many tasks with positive and negative data, recent PU algorithms are generally based on deep neural networks, and the potential of tree-based PU learning is under-explored. In this paper, we propose new random forest algorithms for PU learning. Key to our approach is a new interpretation of decision tree algorithms for positive and negative data as recursive greedy risk minimization algorithms. We extend this perspective to the PU setting to develop new decision tree learning algorithms that directly minimize PU-data based estimators for the expected risk.

algorithm, positive-unlabeled learning, recursive greedy risk minimization, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Comparing hundreds of machine learning classifiers and discrete choice models in predicting travel behavior: an empirical benchmark

Wang, Shenhao, Mo, Baichuan, Zheng, Yunhan, Hess, Stephane, Zhao, Jinhua

arXiv.org Artificial Intelligence | Jan-17-2025

Numerous studies have compared machine learning (ML) and discrete choice models (DCMs) in predicting travel demand. However, these studies often lack generalizability as they compare models deterministically without considering contextual variations. To address this limitation, our study develops an empirical benchmark by designing a tournament model, thus efficiently summarizing a large number of experiments, quantifying the randomness in model comparisons, and using formal statistical tests to differentiate between the model and contextual effects. This benchmark study compares two large-scale data sources: a database compiled from literature review summarizing 136 experiments from 35 studies, and our own experiment data, encompassing a total of 6,970 experiments from 105 models and 12 model families. This benchmark study yields two key findings. Firstly, many ML models, particularly the ensemble methods and deep learning, statistically outperform the DCM family (i.e., multinomial, nested, and mixed logit models). However, this study also highlights the crucial role of the contextual factors (i.e., data sources, inputs and choice categories), which can explain models' predictive performance more effectively than the differences in model types alone. Model performance varies significantly with data sources, improving with larger sample sizes and lower dimensional alternative sets. After controlling all the model and contextual factors, significant randomness still remains, implying inherent uncertainty in such model comparisons. Overall, we suggest that future researchers shift more focus from context-specific model comparisons towards examining model transferability across contexts and characterizing the inherent uncertainty in ML, thus creating more robust and generalizable next-generation travel demand models.

artificial intelligence, machine learning, model family, (19 more...)

arXiv.org Artificial Intelligence

2102.0113

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

Model Class Reliance for Random Forests

Neural Information Processing Systems | Jan-16-2025, 22:58:14 GMT

Variable Importance (VI) has traditionally been cast as the process of estimating each variable's contribution to a predictive model's overall performance. Recent research has sought to address this concern via analysis of Rashomon sets: sets of alternative model instances that exhibit equivalent predictive performance to some reference model, but which take different functional forms. Measures such as Model Class Reliance (MCR), which are computed against Rashomon sets, have been proposed in order to ascertain how much a variable must be relied on to make robust predictions, or whether alternatives exist. If the MCR range is tight, we have no choice but to use a variable; if the range is high, then there exist competing, perhaps fairer, models that provide alternative explanations of the phenomena being examined. Applications are wide, from enabling construction of 'fairer' models in areas such as recidivism, health analytics and ethical marketing.

estimation, model class reliance, random forest, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.44)

SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules

Lamprinakou, Stamatina, Sang, Huiyan, Konomi, Bledar A., Lu, Ligang

arXiv.org Machine Learning | Jan-16-2025

Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive data splits at each decision node, using deterministic decision rules based on a single univariate feature. This approach limits their ability to effectively capture complex decision boundaries, particularly in scenarios involving multiple features, such as spatial domains, or when transitions are either sharp or smoothly varying. In this paper, we introduce a novel probabilistic additive decision tree model that employs a soft split rule. This method enables highly flexible splits that leverage both univariate and multivariate features, while also respecting the geometric properties of the feature domain. Notably, the probabilistic split rule adapts dynamically across decision nodes, allowing the model to account for varying levels of smoothness in the regression function. We demonstrate the utility of the proposed model through comparisons with existing tree-based models on synthetic datasets and a New York City education dataset.

artificial intelligence, conditional distribution, machine learning, (17 more...)

arXiv.org Machine Learning

2501.099

Country:

North America > United States > Texas (0.28)
North America > United States > New York (0.24)

Genre: Research Report (0.81)

Industry:

Education > Educational Setting (0.68)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Utilizing AI Language Models to Identify Prognostic Factors for Coronary Artery Disease: A Study in Mashhad Residents

Zahra, Bami, Nasser, Behnampour, Hassan, Doosti, Majid, Ghayour Mobarhan

arXiv.org Artificial Intelligence | Jan-16-2025

Abstract: Background: Understanding the risk factors of cardiovascular artery disease, the leading global cause of mortality, is crucial for influencing its etiology, prevalence, and treatment. This study aims to evaluate prognostic markers for coronary artery disease in Mashhad using Naive Bayes, REP Tree, J48, CART, and CHAID algorithms. Methods: Using data from the 2009 MASHAD STUDY, prognostic factors for coronary artery disease were determined with Naive Bayes, REP Tree, J48, CART, CHAID, and Random Forest algorithms using R 3.5.3 and WEKA 3.9.4. Model efficiency was compared by sensitivity, specificity, and accuracy. Cases were patients with coronary artery disease; each had three controls (940 in total). Results: Prognostic factors for coronary artery disease in Mashhad residents varied by algorithm. CHAID identified age, myocardial infarction history, and hypertension. CART included depression score and physical activity. REP added education level and anxiety score. NB included diabetes and family history. J48 highlighted father's heart disease and weight loss. CHAID had the highest accuracy (0.80). Conclusion: Key prognostic factors for coronary artery disease in CART and CHAID models include age, myocardial infarction history, hypertension, depression score, physical activity, and BMI. NB, REP Tree, and J48 identified numerous factors. CHAID had the highest accuracy, sensitivity, and specificity. CART offers simpler interpretation, aiding physician and paramedic model selection based on specific. Keywords: RF, Naïve Bayes, REP, J48 algorithms, Coronary Artery Disease (CAD).

algorithm, coronary artery disease, history, (13 more...)

arXiv.org Artificial Intelligence

2501.0948

Country:

Asia > Middle East > Iran > Razavi Khorasan Province > Mashhad (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.49)
Research Report > New Finding (0.30)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

Shape-Based Single Object Classification Using Ensemble Method Classifiers

Kamarudin, Nur Shazwani, Makhtar, Mokhairi, Shamsuddin, Syadiah Nor Wan, Fadzli, Syed Abdullah

arXiv.org Artificial Intelligence | Jan-16-2025

Nowadays, more and more images are available. Annotation and retrieval of the images pose classification problems, where each class is defined as the group of database images labelled with a common semantic label. Various systems have been proposed for content-based retrieval, as well as for image classification and indexing. In this paper, a hierarchical classification framework has been proposed for bridging the semantic gap effectively and achieving multi-category image classification. A well-known pre-processing and post-processing method was used and applied to three problems: image segmentation, object identification and image classification. The method was applied to classify single object images from Amazon and Google datasets. The classification was tested for four different classifiers: BayesNetwork (BN), Random Forest (RF), Bagging and Vote. The estimated classification accuracies ranged from 20% to 99% (using 10-fold cross validation). The Bagging classifier presents the best performance, followed by the Random Forest classifier.

classification, classifier, dataset, (13 more...)

arXiv.org Artificial Intelligence

2501.09311

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > New Zealand > North Island > Waikato (0.05)
North America > United States > New Jersey (0.04)
Asia > Malaysia (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.57)
