AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Yet Another Representation of Binary Decision Trees: A Mathematical Demonstration

Zhang, Jinxiong

arXiv.org Artificial IntelligenceMay-10-2024

A decision tree looks like a simple computational graph without cycles, where only the leaf nodes specify the output values and the non-terminals specify their tests or split conditions. From the numerical perspective, we express decision trees in the language of computational graph. We explicitly parameterize the test phase, traversal phase and prediction phase of decision trees based on the bitvectors of non-terminal nodes. As shown later, the decision tree is a shallow binary network in some sense. Especially, we introduce the bitvector matrix to implement the tree traversal in numerical approach, where the core is to convert the logical `AND' operation to arithmetic operations. And we apply this numerical representation to extend and unify diverse decision trees in concept.

decision tree, node, representation, (14 more...)

arXiv.org Artificial Intelligence

2101.07077

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Privacy-Preserving Edge Federated Learning for Intelligent Mobile-Health Systems

Aminifar, Amin, Shokri, Matin, Aminifar, Amir

arXiv.org Artificial IntelligenceMay-9-2024

Machine Learning (ML) algorithms are generally designed for scenarios in which all data is stored in one data center, where the training is performed. However, in many applications, e.g., in the healthcare domain, the training data is distributed among several entities, e.g., different hospitals or patients' mobile devices/sensors. At the same time, transferring the data to a central location for learning is certainly not an option, due to privacy concerns and legal issues, and in certain cases, because of the communication and computation overheads. Federated Learning (FL) is the state-of-the-art collaborative ML approach for training an ML model across multiple parties holding local data samples, without sharing them. However, enabling learning from distributed data over such edge Internet of Things (IoT) systems (e.g., mobile-health and wearable technologies, involving sensitive personal/medical data) in a privacy-preserving fashion presents a major challenge mainly due to their stringent resource constraints, i.e., limited computing capacity, communication bandwidth, memory storage, and battery lifetime. In this paper, we propose a privacy-preserving edge FL framework for resource-constrained mobile-health and wearable technologies over the IoT infrastructure. We evaluate our proposed framework extensively and provide the implementation of our technique on Amazon's AWS cloud platform based on the seizure detection application in epilepsy monitoring using wearable technologies.

architecture, local update, privacy, (17 more...)

arXiv.org Artificial Intelligence

2405.05611

Country:

Europe > Germany (0.04)
Europe > Sweden (0.04)
Europe > France (0.04)
(16 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)

Add feedback

Aligning Tutor Discourse Supporting Rigorous Thinking with Tutee Content Mastery for Predicting Math Achievement

Abdelshiheed, Mark, Jacobs, Jennifer K., D'Mello, Sidney K.

arXiv.org Artificial IntelligenceMay-9-2024

This work investigates how tutoring discourse interacts with students' proximal knowledge to explain and predict students' learning outcomes. Our work is conducted in the context of high-dosage human tutoring where 9th-grade students (N = 1080) attended small group tutorials and individually practiced problems on an Intelligent Tutoring System (ITS). We analyzed whether tutors' talk moves and students' performance on the ITS predicted scores on math learning assessments. We trained Random Forest Classifiers (RFCs) to distinguish high and low assessment scores based on tutor talk moves, student's ITS performance metrics, and their combination. A decision tree was extracted from each RFC to yield an interpretable model. We found AUCs of 0.63 for talk moves, 0.66 for ITS, and 0.77 for their combination, suggesting interactivity among the two feature sources. Specifically, the best decision tree emerged from combining the tutor talk moves that encouraged rigorous thinking and students' ITS mastery. In essence, tutor talk that encouraged mathematical reasoning predicted achievement for students who demonstrated high mastery on the ITS, whereas tutors' revoicing of students' mathematical ideas and contributions was predictive for students with low ITS mastery. Implications for practice are discussed.

assessment score, student, talk move, (16 more...)

arXiv.org Artificial Intelligence

2405.06218

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)
North America > United States > Virginia > Fairfax County > Reston (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre:

Research Report > New Finding (0.93)
Instructional Material > Course Syllabus & Notes (0.88)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)
Education > Assessment & Standards (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.80)

Add feedback

Tree-based Ensemble Learning for Out-of-distribution Detection

Shen, Zhaiming, Wang, Menglun, Cheng, Guang, Lai, Ming-Jun, Mu, Lin, Huang, Ruihao, Liu, Qi, Zhu, Hao

arXiv.org Artificial IntelligenceMay-5-2024

Being able to successfully determine whether the testing samples has similar distribution as the training samples is a fundamental question to address before we can safely deploy most of the machine learning models into practice. In this paper, we propose TOOD detection, a simple yet effective tree-based out-of-distribution (TOOD) detection mechanism to determine if a set of unseen samples will have similar distribution as of the training samples. The TOOD detection mechanism is based on computing pairwise hamming distance of testing samples' tree embeddings, which are obtained by fitting a tree-based ensemble model through in-distribution training samples. Our approach is interpretable and robust for its tree-based nature. Furthermore, our approach is efficient, flexible to various machine learning tasks, and can be easily generalized to unsupervised setting. Extensive experiments are conducted to show the proposed method outperforms other state-of-the-art out-of-distribution detection methods in distinguishing the in-distribution from out-of-distribution on various tabular, image, and text data.

dataset, detection result, tood detection result, (16 more...)

arXiv.org Artificial Intelligence

2405.0306

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > Maryland > Montgomery County > Silver Spring (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.69)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Towards a theory of model distillation

Boix-Adsera, Enric

arXiv.org Artificial IntelligenceMay-4-2024

Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06,HVD15]. Despite many practical applications, basic questions about the extent to which models can be distilled, and the runtime and amount of data needed to distill, remain largely open. To study these questions, we initiate a general theory of distillation, defining PAC-distillation in an analogous way to PAC-learning [Val84]. As applications of this theory: (1) we propose new algorithms to extract the knowledge stored in the trained weights of neural networks -- we show how to efficiently distill neural networks into succinct, explicit decision tree representations when possible by using the ``linear representation hypothesis''; and (2) we prove that distillation can be much cheaper than learning from scratch, and make progress on characterizing its complexity.

decision tree, distillation, neural network, (11 more...)

arXiv.org Artificial Intelligence

2403.09053

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Colorado (0.04)
(5 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Add feedback

Mathematics of statistical sequential decision-making: concentration, risk-awareness and modelling in stochastic bandits, with applications to bariatric surgery

Saux, Patrick

arXiv.org Machine LearningMay-3-2024

This thesis aims to study some of the mathematical challenges that arise in the analysis of statistical sequential decision-making algorithms for postoperative patients follow-up. Stochastic bandits (multiarmed, contextual) model the learning of a sequence of actions (policy) by an agent in an uncertain environment in order to maximise observed rewards. To learn optimal policies, bandit algorithms have to balance the exploitation of current knowledge and the exploration of uncertain actions. Such algorithms have largely been studied and deployed in industrial applications with large datasets, low-risk decisions and clear modelling assumptions, such as clickthrough rate maximisation in online advertising. By contrast, digital health recommendations call for a whole new paradigm of small samples, risk-averse agents and complex, nonparametric modelling. To this end, we developed new safe, anytime-valid concentration bounds, (Bregman, empirical Chernoff), introduced a new framework for risk-aware contextual bandits (with elicitable risk measures) and analysed a novel class of nonparametric bandit algorithms under weak assumptions (Dirichlet sampling). In addition to the theoretical guarantees, these results are supported by in-depth empirical evidence. Finally, as a first step towards personalised postoperative follow-up recommendations, we developed with medical doctors and surgeons an interpretable machine learning model to predict the long-term weight trajectories of patients after bariatric surgery.

gastric bypass and sleeve gastrectomy, individual preoperative prediction, time-uniform bregman concentration, (17 more...)

arXiv.org Machine Learning

2405.01994

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Brazil > São Paulo (0.04)
(20 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Nutrition and Weight Loss (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
(4 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(5 more...)

Add feedback

Quantum Machine Learning: Quantum Kernel Methods

Naguleswaran, Sanjeev

arXiv.org Artificial IntelligenceMay-2-2024

Quantum algorithms based on quantum kernel methods have been investigated previously [1]. A quantum advantage is derived from the fact that it is possible to construct a family of datasets for which, only quantum processing can recognise the intrinsic labelling patterns, while for classical computers the dataset looks like noise. This is due to the algorithm leveraging inherent efficiencies in the computation of logarithms in a cyclic group. The discrete log problem.is a well-known advantage of quantum vs classical computation: where it is possible to generate all the members of the group using a single mathematical operation. Kernel methods are a powerful and popular technique in classical Machine Learning. The use of a quantum feature space that can only be calculated efficiently on a quantum computer potentially allows for deriving a quantum advantage. In this paper, we intend to first describe the application of such a kernel method to a Quantum version of the classical Support Vector Machine (SVM) algorithm to identify conditions under which, a quantum advantage is realised. A data dependent projected quantum kernel was shown to provide significant advantage over classical kernels. Further, we present results of investigations and ideas pertaining to extending the use of quantum kernels as a feature extraction layer in a Convolutional Neural Networks (CNN) that is a widely used architecture in deep-learning applications.

algorithm, kernel, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2405.0178

Country: North America > United States > California > Monterey County > Monterey (0.04)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (1.00)
(2 more...)

Add feedback

Estimate the building height at a 10-meter resolution based on Sentinel data

Yan, Xin

arXiv.org Artificial IntelligenceMay-1-2024

Building height is an important indicator for scientific research and practical application. However, building height products with a high spatial resolution (10m) are still very scarce. To meet the needs of high-resolution building height estimation models, this study established a set of spatial-spectral-temporal feature databases, combining SAR data provided by Sentinel-1, optical data provided by Sentinel-2, and shape data provided by building footprints. The statistical indicators on the time scale are extracted to form a rich database of 160 features. This study combined with permutation feature importance, Shapley Additive Explanations, and Random Forest variable importance, and the final stable features are obtained through an expert scoring system. This study took 12 large, medium, and small cities in the United States as the training data. It used moving windows to aggregate the pixels to solve the impact of SAR image displacement and building shadows. This study built a building height model based on a random forest model and compared three model ensemble methods of bagging, boosting, and stacking. To evaluate the accuracy of the prediction results, this study collected Lidar data in the test area, and the evaluation results showed that its R-Square reached 0.78, which can prove that the building height can be obtained effectively. The fast production of high-resolution building height data can support large-scale scientific research and application in many fields.

building height, prediction, resolution, (12 more...)

arXiv.org Artificial Intelligence

2405.00989

Country:

North America > United States > Iowa > Pottawattamie County (0.14)
North America > United States > Iowa > Black Hawk County (0.14)
North America > United States > Iowa > Dubuque County (0.14)
(13 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Energy (0.70)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

On the Impact of Data Heterogeneity in Federated Learning Environments with Application to Healthcare Networks

Barbieri, Usevalad Milasheuski. Luca, Tedeschini, Bernardo Camajori, Nicoli, Monica, Savazzi, Stefano

arXiv.org Artificial IntelligenceMay-1-2024

Federated Learning (FL) allows multiple privacy-sensitive applications to leverage their dataset for a global model construction without any disclosure of the information. One of those domains is healthcare, where groups of silos collaborate in order to generate a global predictor with improved accuracy and generalization. However, the inherent challenge lies in the high heterogeneity of medical data, necessitating sophisticated techniques for assessment and compensation. This paper presents a comprehensive exploration of the mathematical formalization and taxonomy of heterogeneity within FL environments, focusing on the intricacies of medical data. In particular, we address the evaluation and comparison of the most popular FL algorithms with respect to their ability to cope with quantity-based, feature and label distribution-based heterogeneity. The goal is to provide a quantitative evaluation of the impact of data heterogeneity in FL systems for healthcare networks as well as a guideline on FL algorithm selection. Our research extends beyond existing studies by benchmarking seven of the most common FL algorithms against the unique challenges posed by medical data use cases. The paper targets the prediction of the risk of stroke recurrence through a set of tabular clinical reports collected by different federated hospital silos: data heterogeneity frequently encountered in this scenario and its impact on FL performance are discussed.

algorithm, dataset, heterogeneity, (14 more...)

arXiv.org Artificial Intelligence

2404.18519

Country: Europe > Italy > Lombardy > Milan (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Add feedback

Semi-Supervised Hierarchical Multi-Label Classifier Based on Local Information

Serrano-Pérez, Jonathan, Sucar, L. Enrique

arXiv.org Artificial IntelligenceApr-30-2024

Scarcity of labeled data is a common problem in supervised classification, since hand-labeling can be time consuming, expensive or hard to label; on the other hand, large amounts of unlabeled information can be found. The problem of scarcity of labeled data is even more notorious in hierarchical classification, because the data of a node is split among its children, which results in few instances associated to the deepest nodes of the hierarchy. In this work it is proposed the semi-supervised hierarchical multi-label classifier based on local information (SSHMC-BLI) which can be trained with labeled and unlabeled data to perform hierarchical classification tasks. The method can be applied to any type of hierarchical problem, here we focus on the most difficult case: hierarchies of DAG type, where the instances can be associated to multiple paths of labels which can finish in an internal node. SSHMC-BLI builds pseudo-labels for each unlabeled instance from the paths of labels of its labeled neighbors, while it considers whether the unlabeled instance is similar to its neighbors. Experiments on 12 challenging datasets from functional genomics show that making use of unlabeled along with labeled data can help to improve the performance of a supervised hierarchical classifier trained only on labeled data, even with statistical significance.

classification, classifier, hierarchy, (16 more...)

arXiv.org Artificial Intelligence

2405.00184

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Add feedback