AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
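
The quoted definition can be illustrated with a minimal sketch, assuming a tree stored as nested dictionaries. The attribute names (`raining`, `weekend`), the tree itself, and the `decide` helper are hypothetical and are not taken from the textbook; the point is only that walking the branching tests maps an object's properties to a yes/no answer, so the tree behaves as a Boolean function.

```python
# Hypothetical example: the attributes and the tree are invented for illustration.
def decide(tree, example):
    """Follow the branching tests until a Boolean leaf is reached."""
    node = tree
    while not isinstance(node, bool):
        attribute = node["test"]                    # property tested at this node
        node = node["branches"][example[attribute]]  # follow the matching branch
    return node


# Internal nodes test one property; leaves are the yes/no decision.
tree = {
    "test": "raining",
    "branches": {
        True: False,
        False: {"test": "weekend", "branches": {True: True, False: False}},
    },
}

go_outside = decide(tree, {"raining": False, "weekend": True})
```

This particular tree computes the Boolean expression `(not raining) and weekend`.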

Tokyo Paralympics: Sarah Storey wins record-breaking 17th gold in women's C4-5 road race

BBC News, Sep-2-2021, 07:54:40 GMT

Sarah Storey wins her 17th Paralympic gold by defending her women's C4-5 road race title to become Great Britain's most successful Paralympian of all time.

sarah storey win, storey win record-breaking 17th gold, tokyo paralympic, (2 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.99)
Europe > United Kingdom (0.24)

Industry: Leisure & Entertainment > Sports > Olympic Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

When are Deep Networks really better than Random Forests at small sample sizes?

Xu, Haoyin, Ainsworth, Michael, Peng, Yu-Chung, Kusmanov, Madi, Panda, Sambit, Vogelstein, Joshua T.

arXiv.org Artificial Intelligence, Aug-31-2021

Random forests (RF) and deep networks (DN) are two of the most popular machine learning methods in the current scientific literature and yield differing levels of performance on different data modalities. We wish to further explore and establish the conditions and domains in which each approach excels, particularly in the context of sample size and feature dimension. To address these issues, we tested the performance of these approaches across tabular, image, and audio settings using varying model parameters and architectures. Our focus is on datasets with at most 10,000 samples, which represent a large fraction of scientific and biomedical datasets. In general, we found RF to excel at tabular and structured data (image and audio) with small sample sizes, whereas DN performed better on structured data with larger sample sizes. Although we plan to continue updating this technical report in the coming months, we believe the current preliminary results may be of interest to others.

forest and network, sample size, (12 more...)

arXiv:2108.13637

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Estonia > Harju County > Tallinn (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.85)

Decision Tree-Based Predictive Models for Academic Achievement Using College Students' Support Networks

Frazier, Anthony, Silva, Joethi, Meilak, Rachel, Sahoo, Indranil, Chan, David, Broda, Michael

arXiv.org Machine Learning, Aug-31-2021

In this study, we examine a set of primary data collected from 484 students enrolled in a large public university in the Mid-Atlantic United States region during the early stages of the COVID-19 pandemic. The data, called Ties data, included students' demographic and support network information. The support network data comprised information that highlighted the type of support (i.e., emotional or educational; routine or intense). Using this data set, models for predicting students' academic achievement, quantified by their self-reported GPA, were created using Chi-Square Automatic Interaction Detection (CHAID), a decision tree algorithm, and cforest, a random forest algorithm that uses conditional inference trees. We compare the methods' accuracy and variation in the set of important variables suggested by each algorithm. Each algorithm found different variables important for different student demographics, with some overlap. For White students, different types of educational support were important in predicting academic achievement, while for non-White students, different types of emotional support were important in predicting academic achievement. The presence of differing types of routine support was important in predicting academic achievement for cisgender women, while differing types of intense support were important in predicting academic achievement for cisgender men.

academic achievement, cisgender woman, student, (16 more...)

arXiv:2108.13947

Country:

Asia > China (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Iran (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Survival Prediction of Heart Failure Patients using Stacked Ensemble Machine Learning Algorithm

Zaman, S. M Mehedi, Qureshi, Wasay Mahmood, Raihan, Md. Mohsin Sarker, Monjur, Ocean, Shams, Abdullah Bin

arXiv.org Machine Learning, Aug-30-2021

Cardiovascular disease, especially heart failure, is one of the major health hazards of our time and is a leading cause of death worldwide. Advancement in data mining techniques using machine learning (ML) models is paving the way for promising prediction approaches. Data mining is the process of converting massive volumes of raw data created by healthcare institutions into meaningful information that can aid in making predictions and crucial decisions. The key aim of this study is to collect various follow-up data from patients who have had heart failure, analyze those data, and use several ML models to predict the survival possibility of cardiovascular patients. Due to the imbalance of the classes in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) has been implemented. Two unsupervised models (K-Means and Fuzzy C-Means clustering) and three supervised classifiers (Random Forest, XGBoost and Decision Tree) have been used in our study. After thorough investigation, our results demonstrate a superior performance of the supervised ML algorithms over unsupervised models. Moreover, we designed and propose a supervised stacked ensemble learning model that can achieve an accuracy, precision, recall and F1 score of 99.98%. Our study shows that only certain attributes collected from the patients are imperative to successfully predict the possibility of survival after heart failure using supervised ML algorithms.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv:2108.13367

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Representation of binary classification trees with binary features by quantum circuits

Heese, Raoul, Bickert, Patricia, Niederle, Astrid Elisa

arXiv.org Machine Learning, Aug-30-2021

We propose a quantum representation of binary classification trees with binary features based on a probabilistic approach. By using the quantum computer as a processor for probability distributions, a probabilistic traversal of the decision tree can be realized via measurements of a quantum circuit. We describe how tree inductions and the prediction of class labels of query data can be integrated into this framework. An on-demand sampling method enables predictions with a constant number of classical memory slots, independent of the tree depth. We experimentally study our approach using both a quantum computing simulator and actual IBM quantum hardware. To our knowledge, this is the first realization of a decision tree classifier on a quantum device.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

arXiv:2108.13207

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Information Technology (0.48)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

To tune or not to tune? An Approach for Recommending Important Hyperparameters

Bahmani, Mohamadjavad, Shawi, Radwa El, Potikyan, Nshan, Sakr, Sherif

arXiv.org Artificial Intelligence, Aug-30-2021

Novel technologies in automated machine learning ease the complexity of algorithm selection and hyperparameter optimization. Hyperparameters are important for machine learning models because they significantly influence model performance. Many optimization techniques have achieved notable success in hyperparameter tuning and surpassed the performance of human experts. However, relying on such techniques as black-box algorithms can leave machine learning practitioners without insight into the relative importance of different hyperparameters. In this paper, we consider building the relationship between the performance of the machine learning models and their hyperparameters to discover the trend and gain insights, with empirical results based on six classifiers and 200 datasets. Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy, to focus on the most important hyperparameters, and to choose adequate hyperparameter spaces for tuning. The results of our experiments show that gradient boosting and AdaBoost outperform other classifiers across 200 problems. However, they need tuning to boost their performance. Overall, the results obtained from this study provide a quantitative basis to focus efforts toward guided automated hyperparameter optimization and contribute toward the development of better automated machine learning frameworks.

algorithm, dataset, hyperparameter, (17 more...)

arXiv:2108.13066

Country:

Europe > Estonia > Tartu County > Tartu (0.05)
Asia > Singapore (0.05)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.89)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Decision Tree Algorithm - A Complete Guide - Analytics Vidhya

#artificialintelligence, Aug-29-2021, 20:01:13 GMT

Until now we have learned about linear regression and logistic regression, and they were pretty hard to understand. Let's now start with decision trees, and I assure you this is probably the easiest algorithm in Machine Learning. There's not much mathematics involved here. Since it is very easy to use and interpret, it is one of the most widely used and practical methods in Machine Learning. Root node: the node at the beginning of a decision tree; from this node the population starts dividing according to various features.

decision tree, entropy, node, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.98)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
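
The guide's description of the root node, from which the population starts dividing according to features, can be made concrete with a small sketch. It assumes the splitting feature is chosen by entropy-based information gain; the excerpt does not state the criterion (entropy appears only among the entry's keywords), so that choice, the toy weather data, and all function names below are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_root_feature(rows, labels):
    """Pick the feature whose split at the root yields the largest gain."""
    return max(rows[0], key=lambda f: information_gain(rows, labels, f))


# Toy data (hypothetical): "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

root = best_root_feature(rows, labels)
```

Applying the same selection recursively to each resulting group is one common way to grow the rest of the tree.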

Identification of the Resting Position Based on EGG, ECG, Respiration Rate and SpO2 Using Stacked Ensemble Learning

Raihan, Md. Mohsin Sarker, Islam, Muhammad Muinul, Fairoz, Fariha, Shams, Abdullah Bin

arXiv.org Machine Learning, Aug-26-2021

Rest is essential for high-level physiological and psychological performance. It is also necessary for the muscles to repair, rebuild, and strengthen. There is a significant correlation between the quality of rest and the resting posture. Therefore, identification of the resting position is of paramount importance to maintain a healthy life. Resting postures can be classified into four basic categories: lying on the back (supine), facing the left or right side, and the free-fall position. The latter position is already unequivocally considered an unhealthy posture by researchers and hence can be eliminated. In this paper, we analyzed the other three states of resting position based on the data collected from the physiological parameters: Electrogastrogram (EGG), Electrocardiogram (ECG), Respiration Rate, Heart Rate, and Oxygen Saturation (SpO2). Based on these parameters, the resting position is classified using a hybrid stacked ensemble machine learning model designed using the Decision Tree, Random Forest, and XGBoost algorithms. Our study demonstrates a 100% accurate prediction of the resting position using the hybrid model. The proposed method of identifying the resting position based on physiological parameters has the potential to be integrated into wearable devices. This is a low-cost, highly accurate and autonomous technique to monitor the body posture while maintaining user privacy by eliminating the RGB camera conventionally used to conduct polysomnography (sleep monitoring) or resting position studies.

accuracy, algorithm, resting position, (14 more...)

arXiv:2108.11604

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.89)
Health & Medicine > Diagnostic Medicine > Vital Signs (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

FARF: A Fair and Adaptive Random Forests Classifier

Zhang, Wenbin, Bifet, Albert, Zhang, Xiangliang, Weiss, Jeremy C., Nejdl, Wolfgang

arXiv.org Artificial Intelligence, Aug-21-2021

As Artificial Intelligence (AI) is used in more applications, the need to consider and mitigate biases from the learned models has followed. Most works in developing fair learning algorithms focus on the offline setting. However, in many real-world applications data comes in an online fashion and needs to be processed on the fly. Moreover, in practical applications, there is a trade-off between accuracy and fairness that needs to be accounted for, but current methods often have multiple hyperparameters with non-trivial interaction to achieve fairness. In this paper, we propose a flexible ensemble algorithm for fair decision-making in the more challenging context of evolving online settings. This algorithm, called FARF (Fair and Adaptive Random Forests), is based on using online component classifiers and updating them according to the current distribution, while also accounting for fairness, with a single hyperparameter that alters the fairness-accuracy balance. Experiments on real-world discriminated data streams demonstrate the utility of FARF.

discrimination, fairness, farf, (15 more...)

arXiv:2108.07403

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > Maryland > Baltimore County (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.74)

Gradient Boosted Decision Trees explained with a real-life example and some Python code

#artificialintelligence, Aug-19-2021, 06:10:37 GMT

Gradient Boosting algorithms tackle one of the biggest problems in Machine Learning: bias. Decision Trees are a simple and flexible algorithm. An underfit Decision Tree has low depth, meaning it splits the dataset only a few times in an attempt to separate the data. Because it doesn't separate the dataset into more and more distinct observations, it can't capture the true patterns in it. When it comes to tree-based algorithms, Random Forests was revolutionary because it used Bagging to reduce the overall variance of the model with an ensemble of random trees.

algorithm, decision tree, weak learner, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.92)
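
The excerpt's point that a shallow, underfit tree is biased, and that boosting addresses bias, can be illustrated with a minimal sketch. This is not the article's code: it assumes squared-error loss, a single numeric feature, depth-1 trees (stumps) as the weak learner, and toy data following y = x². All names and parameter values are hypothetical.

```python
def fit_stump(xs, targets):
    """Depth-1 regression tree: the single threshold minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds, learning_rate):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = list(range(8))
ys = [x ** 2 for x in xs]  # toy target, assumed for illustration

single_stump = gradient_boost(xs, ys, n_rounds=1, learning_rate=1.0)
boosted = gradient_boost(xs, ys, n_rounds=50, learning_rate=0.5)

stump_error = mse(single_stump, xs, ys)
boosted_error = mse(boosted, xs, ys)
```

On this toy data the single stump can only output two values, so its training error stays high, while the boosted sequence of stumps fits the curve more closely. Only training error is compared here; the sketch does not say anything about generalization.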
