AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
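To make the quoted definition concrete, here is a minimal sketch of a decision tree read as a Boolean function. The nested-tuple representation, the attribute names (`raining`, `have_umbrella`), and the helper functions are all hypothetical illustrations, not taken from the textbook.

```python
from itertools import product

# Internal node: (attribute, subtree_if_false, subtree_if_true).
# Leaf: a plain bool, the yes/no decision.
# Hypothetical tree for "go outside?": yes unless it rains and there is no umbrella.
GO_OUTSIDE = ("raining", True, ("have_umbrella", False, True))

def classify(tree, example):
    """Walk from the root to a leaf, testing one attribute per internal node."""
    while not isinstance(tree, bool):
        attribute, if_false, if_true = tree
        tree = if_true if example[attribute] else if_false
    return tree

def truth_table(tree, attributes):
    """Evaluate the tree on every assignment, exposing the Boolean function it encodes."""
    return {
        values: classify(tree, dict(zip(attributes, values)))
        for values in product([False, True], repeat=len(attributes))
    }

for values, decision in truth_table(GO_OUTSIDE, ["raining", "have_umbrella"]).items():
    print(values, decision)
```

Enumerating the truth table shows that this particular tree computes `(not raining) or have_umbrella`; any tree with Boolean tests and yes/no leaves can be read off the same way.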

Neural Networks as Decision Trees

#artificialintelligence, Apr-3-2023, 18:25:32 GMT

The recent boom in AI has clearly shown the power of deep neural networks in various tasks, especially in the field of classification problems where the data is high-dimensional and has complex, non-linear relationships with the target variables. However, explaining the decisions of any neural classifier is an incredibly hard problem. While many post-hoc methods such as DeepLift [2] and Layer-Wise Relevance Propagation [3] can help with explaining individual decisions, explaining the global decision mechanisms (or what the model generally looks for) is much more difficult. Because of this, many practitioners in high-stakes fields instead opt for more interpretable models like basic Decision Trees since the decision hierarchy can be clearly visualized and understood by stakeholders. However, basic trees by themselves often do not provide enough accuracy for the task at hand and often ensemble methods like Bagging or Boosting are used to improve the model's performance.

decision tree, node, probability, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.98)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.74)

From Conception to Deployment: Intelligent Stroke Prediction Framework using Machine Learning and Performance Evaluation

Ismail, Leila, Materwala, Huned

arXiv.org Artificial Intelligence, Apr-1-2023

Stroke is the second leading cause of death worldwide. Machine learning classification algorithms have been widely adopted for stroke prediction. However, these algorithms were evaluated using different datasets and evaluation metrics. Moreover, there is no comprehensive framework for stroke data analytics. This paper proposes an intelligent stroke prediction framework based on a critical examination of machine learning prediction algorithms in the literature. The five most used machine learning algorithms for stroke prediction are evaluated using a unified setup for objective comparison. Comparative analysis and numerical results reveal that the Random Forest algorithm is best suited for stroke prediction.

algorithm, artificial intelligence, machine learning, (16 more...)

doi: 10.1109/COINS54846.2022.9854961

arXiv: 2304.00249

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > Thailand (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.94)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

Four Factors to consider when choosing b/w Decision Tree and Random Forest

#artificialintelligence, Mar-30-2023, 04:05:14 GMT

The choice between Random Forest and Decision Tree models depends on the complexity of the problem, the size of the dataset, the interpretability of the model, and the trade-off between accuracy and computational efficiency. Complexity of the problem: Decision trees are simpler and easier to interpret, making them a good choice for smaller and less complex problems. For larger and more complex problems, however, Random Forest models can provide better accuracy because they combine multiple decision trees. Size of the dataset: Decision trees can be sensitive to noise and outliers, and may overfit if the dataset is too small. Random Forest models tend to be more robust to noise and overfitting, making them a good choice for smaller datasets.
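One way to see why combining trees can raise accuracy is a back-of-the-envelope majority-vote calculation. The sketch below assumes every tree is correct with the same probability `p` and that trees err independently. Both are idealizations: trees in a real Random Forest are correlated, so the gain is smaller in practice. The function name and the value `p = 0.6` are illustrative only.

```python
from math import comb

def majority_vote_accuracy(p, n_trees):
    """Probability that more than half of n_trees independent voters are correct.

    Assumes each voter is right with probability p, independently of the others.
    """
    if n_trees % 2 == 0:
        raise ValueError("use an odd number of trees to avoid ties")
    needed = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * p**k * (1 - p) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )

for n in (1, 5, 25, 101):
    print(n, round(majority_vote_accuracy(0.6, n), 3))
```

Under these assumptions accuracy grows with the number of trees as long as `p` exceeds 0.5, but each added tree also costs training and prediction time and makes the model harder to read, which is the trade-off the excerpt describes.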

artificial intelligence, machine learning, random forest model, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Lifting uniform learners via distributional decomposition

Blanc, Guy, Lange, Jane, Malik, Ali, Tan, Li-Yang

arXiv.org Artificial Intelligence, Mar-29-2023

We show how any PAC learning algorithm that works under the uniform distribution can be transformed, in a blackbox fashion, into one that works under an arbitrary and unknown distribution $\mathcal{D}$. The efficiency of our transformation scales with the inherent complexity of $\mathcal{D}$, running in $\mathrm{poly}(n, (md)^d)$ time for distributions over $\{\pm 1\}^n$ whose pmfs are computed by depth-$d$ decision trees, where $m$ is the sample complexity of the original algorithm. For monotone distributions our transformation uses only samples from $\mathcal{D}$, and for general ones it uses subcube conditioning samples. A key technical ingredient is an algorithm which, given the aforementioned access to $\mathcal{D}$, produces an optimal decision tree decomposition of $\mathcal{D}$: an approximation of $\mathcal{D}$ as a mixture of uniform distributions over disjoint subcubes. With this decomposition in hand, we run the uniform-distribution learner on each subcube and combine the hypotheses using the decision tree. This algorithmic decomposition lemma also yields new algorithms for learning decision tree distributions with runtimes that exponentially improve on the prior state of the art -- results of independent interest in distribution learning.
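The object at the centre of this abstract, a distribution that is a mixture of uniform distributions over disjoint subcubes, can be sketched in a few lines. The example below only illustrates that representation. It does not implement the paper's algorithm for finding a decomposition, and the dimension `N`, the leaf weights, and the function names are hypothetical.

```python
import random

N = 4  # number of ±1 coordinates (hypothetical)

# Each leaf of the tree: (mixture weight, coordinates fixed on the root-to-leaf path).
# These three paths come from a depth-2 tree that queries x0, then x1 on the -1 branch.
LEAVES = [
    (0.5, {0: +1}),
    (0.3, {0: -1, 1: +1}),
    (0.2, {0: -1, 1: -1}),
]

def pmf(x):
    """Probability of x: the matching leaf's weight spread uniformly over its subcube."""
    for weight, fixed in LEAVES:
        if all(x[i] == v for i, v in fixed.items()):
            return weight / 2 ** (N - len(fixed))
    raise ValueError("leaves must cover the whole cube")

def sample(rng):
    """Pick a leaf by its weight, then fill the free coordinates uniformly at random."""
    _, fixed = rng.choices(LEAVES, weights=[w for w, _ in LEAVES], k=1)[0]
    return tuple(fixed[i] if i in fixed else rng.choice((-1, +1)) for i in range(N))

rng = random.Random(0)
print([sample(rng) for _ in range(3)])
```

Because the pmf is constant on each leaf's subcube, the distribution restricted to a single leaf is uniform, which is what lets a uniform-distribution learner be run leaf by leaf.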

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv: 2303.16208

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany (0.04)
Africa > Sudan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Local Interpretability of Random Forests for Multi-Target Regression

Bardos, Avraam, Mylonas, Nikolaos, Mollas, Ioannis, Tsoumakas, Grigorios

arXiv.org Artificial Intelligence, Mar-29-2023

Multi-target regression is useful in a plethora of applications. Although random forest models perform well in these tasks, they are often difficult to interpret. Interpretability is crucial in machine learning, especially when it can directly impact human well-being. Although model-agnostic techniques exist for multi-target regression, specific techniques tailored to random forest models are not available. To address this issue, we propose a technique that provides rule-based interpretations for instances made by a random forest model for multi-target regression, influenced by a recent model-specific technique for random forest interpretability. The proposed technique was evaluated through extensive experiments and shown to offer competitive interpretations compared to state-of-the-art techniques.

artificial intelligence, machine learning, multi-target regression, (18 more...)

arXiv: 2303.16506

Country: Europe > Greece > Central Macedonia > Thessaloniki (0.05)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Three-way causal attribute partial order structure analysis

Zaifa, Xue, Huibin, Lu, Tao, Zhang, Tao, Li, Xin, Lu

arXiv.org Artificial Intelligence, Mar-28-2023

As an emerging concept-cognitive learning model, partial order formal structure analysis (POFSA) has been widely used in the field of knowledge processing. In this paper, we propose a method named three-way causal attribute partial order structure (3WCAPOS), which evolves POFSA from set coverage to causal coverage in order to increase the interpretability and classification performance of the model. First, the concept of causal factor (CF) is proposed to evaluate the causal correlation between attributes and decision attributes in the formal decision context. Then, by combining CF with the attribute partial order structure, the concept of causal attribute partial order structure is defined, which makes set coverage evolve into causal coverage. Finally, combined with the idea of three-way decision, 3WCAPOS is formed, which makes the purity of nodes in the structure clearer and the changes between levels more obvious. In addition, experiments on six datasets evaluate the classification ability and the interpretability of the structure. From these experiments, it is concluded that the accuracy of 3WCAPOS is 1% to 9% higher than that of the classification and regression tree, and that the structure is more interpretable and processes knowledge more reasonably than the attribute partial order structure.

Keywords: Formal concept analysis, Three-way decision, Attribute partial order structure, Causal inference, Causal factor

1. Introduction. Attribute partial order structure analysis (APOSA) is an important method in the field of concept-cognitive learning (CCL) [4, 31, 32, 19], which explores the relationship between attributes from the perspective of human cognition.

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv: 2303.17482

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hebei Province (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Using Connected Vehicle Trajectory Data to Evaluate the Effects of Speeding

Ugan, Jorge, Abdel-Aty, Mohamed, Islam, Zubayer

arXiv.org Artificial Intelligence, Mar-28-2023

Speeding has been and continues to be a major contributing factor to traffic fatalities. Various transportation agencies have proposed speed management strategies to reduce the amount of speeding on arterials. While various studies have analyzed speeding proportions above the speed limit, few have considered the effect on the individual's journey. Many studies utilized speed data from detectors, which is limited in that it carries no information on the route that the driver took. This study aims to explore the effects of the various roadway features an individual experiences on a given journey on speeding proportions. Connected vehicle trajectory data was utilized to identify the path that a driver took, along with the vehicle-related variables. The level of speeding proportion is predicted using multiple learning models. The model with the best performance, Extreme Gradient Boosting, achieved an accuracy of 0.756. The proposed model can be used to understand how the environment and the vehicle's path affect drivers' speeding behavior, as well as to predict the areas with high levels of speeding proportions. The results suggested that features related to an individual driver's trip, i.e., total travel time, contribute significantly to speeding. Features related to the environment of the individual driver's trip, i.e., proportion of residential area, also had a significant effect on reducing speeding proportions. It is expected that the findings could help inform transportation agencies about the factors related to speeding for an individual driver's trip.

artificial intelligence, journey, machine learning, (18 more...)

arXiv: 2303.16396

Country:

North America > United States > Florida > Orange County > Orlando (0.28)
Europe > Italy (0.14)
North America > United States > New Jersey (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.94)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Synthetic Combinations: A Causal Inference Framework for Combinatorial Interventions

Agarwal, Abhineet, Agarwal, Anish, Vijaykumar, Suhas

arXiv.org Artificial Intelligence, Mar-24-2023

We consider a setting with $N$ heterogeneous units and $p$ interventions. Our goal is to learn unit-specific potential outcomes for any combination of these $p$ interventions, i.e., $N \times 2^p$ causal parameters. Choosing combinations of interventions is a problem that naturally arises in many applications such as factorial design experiments, recommendation engines (e.g., showing a set of movies that maximizes engagement for users), combination therapies in medicine, selecting important features for ML models, etc. Running $N \times 2^p$ experiments to estimate the various parameters is infeasible as $N$ and $p$ grow. Further, with observational data there is likely confounding, i.e., whether or not a unit is seen under a combination is correlated with its potential outcome under that combination. To address these challenges, we propose a novel model that imposes latent structure across both units and combinations. We assume latent similarity across units (i.e., the potential outcomes matrix is rank $r$) and regularity in how combinations interact (i.e., the coefficients in the Fourier expansion of the potential outcomes are $s$-sparse). We establish identification for all causal parameters despite unobserved confounding. We propose an estimation procedure, Synthetic Combinations, and establish finite-sample consistency under precise conditions on the observation pattern. Our results imply Synthetic Combinations consistently estimates unit-specific potential outcomes given $\text{poly}(r) \times (N + s^2p)$ observations. In comparison, previous methods that do not exploit structure across both units and combinations have sample complexity scaling as $\min(N \times s^2p, \ \ r \times (N + 2^p))$. We use Synthetic Combinations to propose a data-efficient experimental design mechanism for combinatorial causal inference. We corroborate our theoretical findings with numerical simulations.
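A quick count shows why running one experiment per parameter is infeasible. The sketch below simply evaluates the abstract's $N \times 2^p$ expression. The function name and the choice of 100 units are hypothetical.

```python
def num_causal_parameters(n_units, n_interventions):
    """One potential outcome per unit and per subset of interventions: N * 2^p."""
    return n_units * 2 ** n_interventions

# With a fixed number of units, the count doubles for every extra intervention.
for p in (5, 10, 20, 30):
    print(p, num_causal_parameters(100, p))
```

The count doubles with each added intervention, which is the motivation for assuming low-rank and sparse structure rather than estimating every parameter separately.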

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv: 2303.14226

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Counterfactually Fair Regression with Double Machine Learning

Rehill, Patrick

arXiv.org Artificial Intelligence, Mar-20-2023

Counterfactual fairness is an approach to AI fairness that tries to make decisions based on the outcomes that an individual with some kind of sensitive status would have had without this status. This paper proposes Double Machine Learning (DML) Fairness which analogises this problem of counterfactual fairness in regression problems to that of estimating counterfactual outcomes in causal inference under the Potential Outcomes framework. It uses arbitrary machine learning methods to partial out the effect of sensitive variables on nonsensitive variables and outcomes. Assuming that the effects of the two sets of variables are additively separable, outcomes will be approximately equalised and individual-level outcomes will be counterfactually fair. This paper demonstrates the approach in a simulation study pertaining to discrimination in workplace hiring and an application on real data estimating the GPAs of law school students. It then discusses when it is appropriate to apply such a method to problems of real-world discrimination where constructs are conceptually complex and finally, whether DML Fairness can achieve justice in these settings.
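The "partialling out" step mentioned in the abstract can be sketched with ordinary least squares standing in for the arbitrary machine learning methods. This is an illustration only: the synthetic data, variable names, and coefficients are hypothetical, there is a single nonsensitive feature, and the cross-fitting normally used in double machine learning is omitted.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y ≈ a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residualize(target, sensitive):
    """Remove the part of target that is linearly predictable from sensitive."""
    a, b = fit_line(sensitive, target)
    return [t - (a + b * s) for t, s in zip(target, sensitive)]

def covariance(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / n

# Synthetic data (hypothetical): the sensitive status shifts both feature and outcome.
rng = random.Random(0)
sensitive = [rng.choice((0.0, 1.0)) for _ in range(500)]
feature = [2.0 * s + rng.gauss(0, 1) for s in sensitive]
outcome = [1.5 * f + 3.0 * s + rng.gauss(0, 1) for f, s in zip(feature, sensitive)]

# Stage 1: partial the sensitive variable out of the feature and the outcome.
feature_res = residualize(feature, sensitive)
outcome_res = residualize(outcome, sensitive)

# Stage 2: fit the predictive model on residuals only.
intercept, slope = fit_line(feature_res, outcome_res)
predictions = [intercept + slope * f for f in feature_res]

print("slope on residuals:", round(slope, 3))
print("cov(predictions, sensitive):", covariance(predictions, sensitive))
```

In this linear stand-in the predictions are uncorrelated with the sensitive variable by construction. Whether that amounts to counterfactual fairness depends on the additive-separability assumption stated in the abstract.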

artificial intelligence, fairness, machine learning, (15 more...)

arXiv: 2303.11529

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Education > Curriculum > Subject-Specific Education (0.69)
Education > Educational Setting > Higher Education (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

QUBO Decision Tree: Annealing Machine Extends Decision Tree Splitting

Yawata, Koichiro, Osakabe, Yoshihiro, Okuyama, Takuya, Asahara, Akinori

arXiv.org Artificial Intelligence, Mar-17-2023

This paper proposes an extension of regression trees by quadratic unconstrained binary optimization (QUBO). Regression trees are very popular prediction models that are trainable with tabular datasets, but their accuracy is insufficient because the decision rules are too simple. The proposed method extends the decision rules in decision trees to multi-dimensional boundaries. Such an extension is generally unimplementable because of computational limitations; however, the proposed method transforms the training process into a QUBO problem, which an annealing machine can solve.
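For readers unfamiliar with the term, a QUBO problem asks for the binary vector $x$ that minimizes $x^\top Q x$. The sketch below shows only that problem form, with exhaustive search standing in for an annealing machine. The matrix `Q` is a hypothetical toy instance and has nothing to do with the paper's encoding of tree training, which is not given in this abstract.

```python
from itertools import product

def qubo_energy(Q, x):
    """Objective x^T Q x for a binary vector x and a square matrix Q (list of lists)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def solve_qubo_brute_force(Q):
    """Try all 2^n binary vectors; only feasible for very small n."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))

# Toy instance (hypothetical): each selected variable earns -1 on the diagonal,
# while selecting the neighbouring pairs (0, 1) or (1, 2) together costs +2.
Q = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]

best = solve_qubo_brute_force(Q)
print(best, qubo_energy(Q, best))
```

Exhaustive search scales as $2^n$, which is why problems of practical size are handed to annealing hardware or heuristic solvers instead.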

artificial intelligence, constraint, machine learning, (17 more...)

doi: 10.1109/ICKG55886.2022.00052

arXiv: 2303.09772

Country:

Asia > Japan (0.04)
North America > United States > New York (0.04)
North America > United States > Iowa > Story County > Ames (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
