AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

Li, Boyang, Lan, Zhiling, Papka, Michael E.

arXiv.org Artificial IntelligenceMar-24-2024

In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promising outcomes. However, a significant challenge arises from the lack of interpretability in deep neural networks (DNN), rendering them as black-box models to system managers. This lack of model interpretability hinders the practical deployment of DRL scheduling. In this work, we present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling. The core idea is to interpret DNN (i.e., the DRL policy) as a decision tree by utilizing imitation learning. Unlike DNN, decision tree models are non-parametric and easily comprehensible to humans. To extract an effective and efficient decision tree, IRL incorporates the Dataset Aggregation (DAgger) algorithm and introduces the notion of critical state to prune the derived decision tree. Through trace-based experiments, we demonstrate that IRL is capable of converting a black-box DNN policy into an interpretable rulebased decision tree while maintaining comparable scheduling performance. Additionally, IRL can contribute to the setting of rewards in DRL scheduling.

decision tree, irl, scheduling, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/MASCOTS59514.2023.10387651

2403.16293

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Supervised Learning via Ensembles of Diverse Functional Representations: the Functional Voting Classifier

Riccio, Donato, Maturo, Fabrizio, Romano, Elvira

arXiv.org Machine LearningMar-23-2024

Many conventional statistical and machine learning methods face challenges when applied directly to high dimensional temporal observations. In recent decades, Functional Data Analysis (FDA) has gained widespread popularity as a framework for modeling and analyzing data that are, by their nature, functions in the domain of time. Although supervised classification has been extensively explored in recent decades within the FDA literature, ensemble learning of functional classifiers has only recently emerged as a topic of significant interest. Thus, the latter subject presents unexplored facets and challenges from various statistical perspectives. The focal point of this paper lies in the realm of ensemble learning for functional data and aims to show how different functional data representations can be used to train ensemble members and how base model predictions can be combined through majority voting. The so-called Functional Voting Classifier (FVC) is proposed to demonstrate how different functional representations leading to augmented diversity can increase predictive accuracy. Many real-world datasets from several domains are used to display that the FVC can significantly enhance performance compared to individual models. The framework presented provides a foundation for voting ensembles with functional data and can stimulate a highly encouraging line of research in the FDA context.

diversity, representation, springer nature 2021, (13 more...)

arXiv.org Machine Learning

2403.15778

Country:

Europe > Italy > Campania (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.70)
(2 more...)

Add feedback

Utilizing the LightGBM Algorithm for Operator User Credit Assessment Research

Li, Shaojie, Dong, Xinqi, Ma, Danqing, Dang, Bo, Zang, Hengyi, Gong, Yulu

arXiv.org Artificial IntelligenceMar-21-2024

Mobile Internet user credit assessment is an important way for communication operators to establish decisions and formulate measures, and it is also a guarantee for operators to obtain expected benefits. However, credit evaluation methods have long been monopolized by financial industries such as banks and credit. As supporters and providers of platform network technology and network resources, communication operators are also builders and maintainers of communication networks. Internet data improves the user's credit evaluation strategy. This paper uses the massive data provided by communication operators to carry out research on the operator's user credit evaluation model based on the fusion LightGBM algorithm. First, for the massive data related to user evaluation provided by operators, key features are extracted by data preprocessing and feature engineering methods, and a multi-dimensional feature set with statistical significance is constructed; then, linear regression, decision tree, LightGBM, and other machine learning algorithms build multiple basic models to find the best basic model; finally, integrates Averaging, Voting, Blending, Stacking and other integrated algorithms to refine multiple fusion models, and finally establish the most suitable fusion model for operator user evaluation.

algorithm, basic model, lightgbm, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.54254/2755-2721/75/20240503

2403.14483

Country:

Asia > China > Fujian Province > Fuzhou (0.04)
Asia > China > Beijing > Beijing (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Banking & Finance > Credit (0.69)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)

Add feedback

Distill2Explain: Differentiable decision trees for explainable reinforcement learning in energy application controllers

Gokhale, Gargya, Madahi, Seyed Soroush Karimi, Claessens, Bert, Develder, Chris

arXiv.org Artificial IntelligenceMar-18-2024

Demand-side flexibility is gaining importance as a crucial element in the energy transition process. Accounting for about 25% of final energy consumption globally, the residential sector is an important (potential) source of energy flexibility. However, unlocking this flexibility requires developing a control framework that (1) easily scales across different houses, (2) is easy to maintain, and (3) is simple to understand for end-users. A potential control framework for such a task is data-driven control, specifically model-free reinforcement learning (RL). Such RL-based controllers learn a good control policy by interacting with their environment, learning purely based on data and with minimal human intervention. Yet, they lack explainability, which hampers user acceptance. Moreover, limited hardware capabilities of residential assets forms a hurdle (e.g., using deep neural networks). To overcome both those challenges, we propose a novel method to obtain explainable RL policies by using differentiable decision trees. Using a policy distillation approach, we train these differentiable decision trees to mimic standard RL-based controllers, leading to a decision tree-based control policy that is data-driven and easy to explain. As a proof-of-concept, we examine the performance and explainability of our proposed approach in a battery-based home energy management system to reduce energy costs. For this use case, we show that our proposed approach can outperform baseline rule-based policies by about 20-25%, while providing simple, explainable control policies. We further compare these explainable policies with standard RL policies and examine the performance trade-offs associated with this increased explainability.

controller, decision tree, explainability, (15 more...)

arXiv.org Artificial Intelligence

2403.11907

Country:

Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Portugal (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry:

Energy > Renewable (1.00)
Energy > Power Industry (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(2 more...)

Add feedback

Smooth Sensitivity for Learning Differentially-Private yet Accurate Rule Lists

Ly, Timothée, Ferry, Julien, Huguet, Marie-José, Gambs, Sébastien, Aivodji, Ulrich

arXiv.org Artificial IntelligenceMar-18-2024

Differentially-private (DP) mechanisms can be embedded into the design of a machine learningalgorithm to protect the resulting model against privacy leakage, although this often comes with asignificant loss of accuracy. In this paper, we aim at improving this trade-off for rule lists modelsby establishing the smooth sensitivity of the Gini impurity and leveraging it to propose a DP greedyrule list algorithm. In particular, our theoretical analysis and experimental results demonstrate thatthe DP rule lists models integrating smooth sensitivity have higher accuracy that those using otherDP frameworks based on global sensitivity.

global sensitivity, sensitivity, smooth sensitivity, (12 more...)

arXiv.org Artificial Intelligence

2403.13848

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)

Add feedback

Light Curve Classification with DistClassiPy: a new distance-based classifier

Chaini, Siddharth, Mahabal, Ashish, Kembhavi, Ajit, Bianco, Federica B.

arXiv.org Artificial IntelligenceMar-18-2024

The rise of synoptic sky surveys has ushered in an era of big data in time-domain astronomy, making data science and machine learning essential tools for studying celestial objects. Tree-based (e.g. Random Forests) and deep learning models represent the current standard in the field. We explore the use of different distance metrics to aid in the classification of objects. For this, we developed a new distance metric based classifier called DistClassiPy. The direct use of distance metrics is an approach that has not been explored in time-domain astronomy, but distance-based methods can aid in increasing the interpretability of the classification result and decrease the computational costs. In particular, we classify light curves of variable stars by comparing the distances between objects of different classes. Using 18 distance metrics applied to a catalog of 6,000 variable stars in 10 classes, we demonstrate classification and dimensionality reduction. We show that this classifier meets state-of-the-art performance but has lower computational requirements and improved interpretability. We have made DistClassiPy open-source and accessible at https://pypi.org/project/distclassipy/ with the goal of broadening its applications to other classification scenarios within and beyond astronomy.

classification, distance metric, metric, (13 more...)

arXiv.org Artificial Intelligence

2403.1212

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.07)
(12 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

A New Random Forest Ensemble of Intuitionistic Fuzzy Decision Trees

Ren, Yingtao, Zhu, Xiaomin, Bai, Kaiyuan, Zhang, Runtong

arXiv.org Artificial IntelligenceMar-17-2024

Classification is essential to the applications in the field of data mining, artificial intelligence, and fault detection. There exists a strong need in developing accurate, suitable, and efficient classification methods and algorithms with broad applicability. Random forest is a general algorithm that is often used for classification under complex conditions. Although it has been widely adopted, its combination with diverse fuzzy theory is still worth exploring. In this paper, we propose the intuitionistic fuzzy random forest (IFRF), a new random forest ensemble of intuitionistic fuzzy decision trees (IFDT). Such trees in forest use intuitionistic fuzzy information gain to select features and consider hesitation in information transmission. The proposed method enjoys the power of the randomness from bootstrapped sampling and feature selection, the flexibility of fuzzy logic and fuzzy sets, and the robustness of multiple classifier systems. Extensive experiments demonstrate that the IFRF has competitative and superior performance compared to other state-of-the-art fuzzy and ensemble algorithms. IFDT is more suitable for ensemble learning with outstanding classification accuracy. This study is the first to propose a random forest ensemble based on the intuitionistic fuzzy theory.

algorithm, classifier, decision tree, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TFUZZ.2022.3215725

2403.07363

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Wisconsin (0.04)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

DTOR: Decision Tree Outlier Regressor to explain anomalies

Crupi, Riccardo, Sabatino, Alessandro Damiano, Marano, Immacolata, Brinis, Massimiliano, Albertazzi, Luca, Cirillo, Andrea, Cosentini, Andrea Claudio

arXiv.org Machine LearningMar-16-2024

Explaining outliers occurrence and mechanism of their occurrence can be extremely important in a variety of domains. Malfunctions, frauds, threats, in addition to being correctly identified, oftentimes need a valid explanation in order to effectively perform actionable counteracts. The ever more widespread use of sophisticated Machine Learning approach to identify anomalies make such explanations more challenging. We present the Decision Tree Outlier Regressor (DTOR), a technique for producing rule-based explanations for individual data points by estimating anomaly scores generated by an anomaly detection model. This is accomplished by first applying a Decision Tree Regressor, which computes the estimation score, and then extracting the relative path associated with the data point score. Our results demonstrate the robustness of DTOR even in datasets with a large number of features. Additionally, in contrast to other rule-based approaches, the generated rules are consistently satisfied by the points to be explained. Furthermore, our evaluation metrics indicate comparable performance to Anchors in outlier explanation tasks, with reduced execution time.

anomaly, decision tree outlier regressor, dtor

arXiv.org Machine Learning

2403.10903

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.80)

Add feedback

Variance Penalizing AdaBoost

Neural Information Processing SystemsMar-15-2024, 01:44:26 GMT

This paper proposes a novel boosting algorithm called VadaBoost which is motivated by recent empirical Bernstein bounds. VadaBoost iteratively minimizes a cost function that balances the sample mean and the sample variance of the exponential loss. Each step of the proposed algorithm minimizes the cost efficiently by providing weighted data to a weak learner rather than requiring a brute force evaluation of all possible weak learners. Thus, the proposed algorithm solves a key limitation of previous empirical Bernstein boosting methods which required brute force enumeration of all possible weak learners. Experimental results confirm that the new algorithm achieves the performance improvements of EBBoost yet goes beyond decision stumps to handle any weak learner. Significant performance gains are obtained over AdaBoost for arbitrary weak learners including decision trees (CART).

adaboost, algorithm, weak learner, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

2 Partial Models from a finite set A and the environment (stochastically) emits an observation o

Neural Information Processing SystemsMar-14-2024, 06:11:45 GMT

This paper introduces timeline trees, which are partial models of partially observable environments. Timeline trees are given some specific predictions to make and learn a decision tree over history. The main idea of timeline trees is to use temporally abstract features to identify and split on features of key events, spread arbitrarily far apart in the past (whereas previous decision-tree-based methods have been limited to a finite suffix of history). Experiments demonstrate that timeline trees can learn to make high quality predictions in complex, partially observable environments with high-dimensional observations (e.g. an arcade game).

prediction, prediction profile model, timestamp, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania > Lancaster County > Lancaster (0.04)

Industry: Leisure & Entertainment > Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback