AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
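As a rough illustration of the quoted definition, the sketch below treats a decision tree as a nested structure of attribute tests that ends in a yes/no leaf. The `Leaf`, `Node`, and `classify` names and the weather-style attributes in `TREE` are hypothetical, chosen only for this example; they do not come from the textbook.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    decision: bool  # the yes/no output of the tree


@dataclass
class Node:
    attribute: str                # property tested at this node
    branches: Dict[str, "Tree"]   # one subtree per attribute value


Tree = Union[Leaf, Node]


def classify(tree: Tree, example: Dict[str, str]) -> bool:
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[example[tree.attribute]]
    return tree.decision


# Hypothetical tree: "should we play outside?"
TREE = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf(False), "normal": Leaf(True)}),
    "overcast": Leaf(True),
    "rain": Node("windy", {"yes": Leaf(False), "no": Leaf(True)}),
})

print(classify(TREE, {"outlook": "sunny", "humidity": "normal", "windy": "no"}))
```

Because every path ends in `True` or `False`, the tree computes a Boolean function of the input properties; replacing the leaf type with a class label or a number would give the larger output ranges the quote mentions.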

An Open-Source Tool for Classification Models in Resource-Constrained Hardware

da Silva, Lucas Tsutsui, Souza, Vinicius M. A., Batista, Gustavo E. A. P. A.

arXiv.org Artificial Intelligence | May-12-2021

Applications that need to sense, measure, and gather real-time information from the environment frequently face three main restrictions: power consumption, cost, and lack of infrastructure. Most of the challenges imposed by these limitations can be better addressed by embedding Machine Learning (ML) classifiers in the hardware that senses the environment, creating smart sensors able to interpret the low-level data stream. However, for this approach to be cost-effective, we need highly efficient classifiers suitable to execute in unresourceful hardware, such as low-power microcontrollers. In this paper, we present an open-source tool named EmbML (Embedded Machine Learning) that implements a pipeline to develop classifiers for resource-constrained hardware. We describe its implementation details and provide a comprehensive analysis of its classifiers considering accuracy, classification time, and memory usage. Moreover, we compare the performance of its classifiers with classifiers produced by related tools to demonstrate that our tool provides a diverse set of classification algorithms that are both compact and accurate. Therefore, these smart sensors are more power-efficient since they eliminate the need for communicating all the raw data. [...] of interest, e.g., a dry soil crop area that needs watering or the capture of a disease-vector mosquito.

classifier, embml, microcontroller, (16 more...)

arXiv.org Artificial Intelligence

2105.05983

Country:

North America > Canada (0.14)
South America > Brazil > São Paulo (0.04)
Oceania > Australia > New South Wales (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.67)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
(3 more...)

Comparing interpretability and explainability for feature selection

Dunn, Jack, Mingardi, Luca, Zhuo, Ying Daisy

arXiv.org Machine Learning | May-11-2021

A common approach for feature selection is to examine the variable importance scores for a machine learning model, as a way to understand which features are the most relevant for making predictions. Given the significance of feature selection, it is crucial for the calculated importance scores to reflect reality. Falsely overestimating the importance of irrelevant features can lead to false discoveries, while underestimating the importance of relevant features may lead us to discard important features, resulting in poor model performance. Additionally, black-box models like XGBoost provide state-of-the-art predictive performance, but cannot be easily understood by humans, and thus we rely on variable importance scores or methods for explainability like SHAP to offer insight into their behavior. In this paper, we investigate the performance of variable importance as a feature selection method across various black-box and interpretable machine learning methods. We compare the ability of CART, Optimal Trees, XGBoost and SHAP to correctly identify the relevant subset of variables across a number of experiments. The results show that regardless of whether we use the native variable importance method or SHAP, XGBoost fails to clearly distinguish between relevant and irrelevant features. On the other hand, the interpretable methods are able to correctly and efficiently identify irrelevant features, and thus offer significantly better performance for feature selection.
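To make the idea of a variable importance score concrete, here is a minimal sketch of permutation importance, which scores a feature by how much accuracy drops when that feature's column is shuffled. This is a swapped-in, model-agnostic technique used purely for illustration; it is not the native importance of CART, Optimal Trees, or XGBoost, and it is not SHAP. The function names and the `seed` parameter are assumptions of this example.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]


def accuracy(predict: Callable[[Row], bool], rows: List[Row], labels: List[bool]) -> float:
    """Fraction of rows whose prediction matches the label."""
    hits = sum(predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(
    predict: Callable[[Row], bool],
    rows: List[Row],
    labels: List[bool],
    n_features: int,
    seed: int = 0,
) -> List[float]:
    """Score each feature by the accuracy lost when its column is shuffled."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    baseline = accuracy(predict, rows, labels)
    scores = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild the rows with only column j replaced by its shuffled values.
        shuffled = [list(r[:j]) + [v] + list(r[j + 1:]) for r, v in zip(rows, column)]
        scores.append(baseline - accuracy(predict, shuffled, labels))
    return scores
```

A feature that the model never reads always scores exactly zero under this scheme, so ranking features by score and discarding those near zero is one simple way to turn importance scores into a feature selection rule.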

experiment, feature selection, unique value, (14 more...)

arXiv.org Machine Learning

2105.05328

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)

Learning stochastic decision trees

Blanc, Guy, Lange, Jane, Tan, Li-Yang

arXiv.org Machine Learning | May-8-2021

We give a quasipolynomial-time algorithm for learning stochastic decision trees that is optimally resilient to adversarial noise. Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2\eta + \varepsilon$ of the Bayes optimal. An additive $2\eta$ is the information-theoretic minimum. Previously no non-trivial algorithm with a guarantee of $O(\eta) + \varepsilon$ was known, even for weaker noise models. Our algorithm is furthermore proper, returning a hypothesis that is itself a decision tree; previously no such algorithm was known even in the noiseless setting.
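Written out as a display, the guarantee stated in the abstract above reads as follows. The symbols $h$ (the returned hypothesis), $\mathrm{error}(h)$ (its error), and $\mathrm{opt}$ (the Bayes optimal error) are notation assumed here for illustration; the abstract does not name them, and it does not define $n$, which is presumably the number of input variables.

$$
\mathrm{error}(h) \;\le\; \mathrm{opt} + 2\eta + \varepsilon,
\qquad
\text{running time } n^{O(\log(s/\varepsilon)/\varepsilon^{2})}.
$$

Here $\eta$ is the corruption rate of the sample and $s$ is the size of the target stochastic decision tree, as in the abstract.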

algorithm, decision tree, stochastic decision tree, (16 more...)

arXiv.org Machine Learning

2105.03594

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

An Extensive Analytical Approach on Human Resources using Random Forest Algorithm

Papineni, Swarajya Lakshmi V., Reddy, A. Mallikarjuna, Yarlagadda, Sudeepti, Yarlagadda, Snigdha, Akkinen, Haritha

arXiv.org Artificial Intelligence | May-7-2021

A recent job survey shows that most software employees plan to change roles because of the high pay in newer fields such as data science, business analysis, and artificial intelligence. The survey also indicates that work-life imbalance, low pay, uneven shifts, and many other factors lead employees to consider changing jobs. In this paper, to help companies organise their human resources efficiently, we design a model based on a random forest algorithm that takes different employee parameters into account. The model helps the HR department retain employees by identifying gaps, so that the organisation can run smoothly with a good employee retention ratio. This combination of HR and data science can improve the productivity, collaboration, and well-being of the organisation's employees. It also helps to develop strategies that affect employee performance with respect to external and social factors.

algorithm, class label, entropy, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.14445/22315381/IJETT-V69I5P217

2105.07855

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)
Asia > India > Telangana > Hyderabad (0.04)
(3 more...)

Genre: Research Report (0.65)

Industry:

Information Technology (0.88)
Education > Educational Setting (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.74)

Accelerating Entrepreneurial Decision-Making Through Hybrid Intelligence

Dellermann, Dominik

arXiv.org Artificial Intelligence | May-7-2021

AI - Artificial Intelligence; AGI - Artificial General Intelligence; ANN - Artificial Neural Network; ANOVA - Analysis of Variance; ANT - Actor Network Theory; API - Application Programming Interface; APX - Amsterdam Power Exchange; AVE - Average Variance Extracted; BU - Business Unit; CART - Classification and Regression Tree; CBMV - Crowd-based Business Model Validation; CR - Composite Reliability; CT - Computed Tomography; CVC - Corporate Venture Capital; DR - Design Requirement; DP - Design Principle; DSR - Design Science Research; DSS - Decision Support System; EEX - European Energy Exchange; FsQCA - Fuzzy-Set Qualitative Comparative Analysis; GUI - Graphical User Interface; HI-DSS - Hybrid Intelligence Decision Support System; HIT - Human Intelligence Task; IoT - Internet of Things; IS - Information System; IT - Information Technology; MCC - Matthews Correlation Coefficient; ML - Machine Learning; OCT - Opportunity Creation Theory; OGEMA 2.0 - Open Gateway Energy Management 2.0; OS - Operating System; R&D - Research & Development; RE - Renewable Energies; RQ - Research Question; SVM - Support Vector Machine; SSD - Solid-State Drive; SDK - Software Development Kit; TCP/IP - Transmission Control Protocol/Internet Protocol; TCT - Transaction Cost Theory; UI - User Interface; VaR - Value at Risk; VC - Venture Capital; VPP - Virtual Power Plant

collaboratively co-creating innovative value proposition, intelligence decisional guidance design paradigm, opportunity creation require multi-directional interaction, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.17170/kobra-202004301196

2105.03365

Country:

Europe > Netherlands > North Holland > Amsterdam (0.24)
North America > United States > California (0.13)
North America > United States > New York (0.04)
(13 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Information Technology > Software (1.00)
Energy > Renewable (1.00)
(6 more...)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(10 more...)

Universal Consistency of Decision Trees in High Dimensions

Klusowski, Jason M.

arXiv.org Machine Learning | May-7-2021

This paper shows that decision trees constructed with Classification and Regression Trees (CART) methodology are universally consistent in an additive model context, even when the number of predictor variables scales exponentially with the sample size, under certain $1$-norm sparsity constraints. The consistency is universal in the sense that there are no a priori assumptions on the distribution of the predictor variables. Amazingly, this adaptivity to (approximate or exact) sparsity is achieved with a single tree, as opposed to what might be expected for an ensemble. Finally, we show that these qualitative properties of individual trees are inherited by Breiman's random forests. Another surprise is that consistency holds even when the "mtry" tuning parameter vanishes as a fraction of the number of predictor variables, thus speeding up computation of the forest. A key step in the analysis is the establishment of an oracle inequality, which precisely characterizes the goodness-of-fit and complexity tradeoff for a misspecified model.
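For readers unfamiliar with the CART methodology named in the abstract above, the sketch below shows its basic building block in the regression setting: for one predictor, pick the threshold that minimizes the total squared error of the two resulting groups. It is a single-feature, single-split illustration only, with hypothetical function names; it says nothing about the consistency analysis itself. A full tree repeats this search over all predictors and recurses on each side.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs: List[float], ys: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, cost) of the least-squares split, or None if no split exists."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical x values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


print(best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 5, 5, 5]))
```

In a random forest, the same search is restricted at each node to a random subset of predictors, whose size is the "mtry" parameter mentioned in the abstract.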

decision tree, klusowski universal consistency, theorem 4, (14 more...)

arXiv.org Machine Learning

2104.13881

Country: North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Learning Linear Temporal Properties from Noisy Data: A MaxSAT Approach

Gaglione, Jean-Raphaël, Neider, Daniel, Roy, Rajarshi, Topcu, Ufuk, Xu, Zhe

arXiv.org Artificial Intelligence | Apr-30-2021

We address the problem of inferring descriptions of system behavior using Linear Temporal Logic (LTL) from a finite set of positive and negative examples. Most of the existing approaches for solving such a task rely on predefined templates for guiding the structure of the inferred formula. The approaches that can infer arbitrary LTL formulas, on the other hand, are not robust to noise in the data. To alleviate such limitations, we devise two algorithms for inferring concise LTL formulas even in the presence of noise. Our first algorithm infers minimal LTL formulas by reducing the inference problem to a problem in maximum satisfiability and then using off-the-shelf MaxSAT solvers to find a solution. To the best of our knowledge, we are the first to incorporate the usage of MaxSAT solvers for inferring formulas in LTL. Our second learning algorithm relies on the first algorithm to derive a decision tree over LTL formulas based on a decision tree learning algorithm. We have implemented both our algorithms and verified that our algorithms are efficient in extracting concise LTL descriptions even in the presence of noise.

algorithm, decision tree, formula, (16 more...)

arXiv.org Artificial Intelligence

2104.15083

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Arizona (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
Europe > France (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Land Cover Classification

#artificialintelligence | Apr-24-2021, 20:10:27 GMT

Earth Engine, also referred to as Google Earth Engine, provides a cloud-computing platform for Remote Sensing tasks such as satellite image processing. We can write Earth Engine code in JavaScript or Python. There are many kinds of Remote Sensing analyses available to run. In this article, we will discuss specifically Machine Learning for land cover classification based on satellite images. Before we get into the details, I want to cover some Remote Sensing basics, because I assume some readers come from Data Science, Machine Learning, or Statistics backgrounds.

earth engine, satellite image, vegetation, (9 more...)

#artificialintelligence

Country: Asia > Indonesia > Borneo > Kalimantan (0.06)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.51)

Feature Inference Attack on Model Predictions in Vertical Federated Learning

Luo, Xinjian, Wu, Yuncheng, Xiao, Xiaokui, Ooi, Beng Chin

arXiv.org Artificial Intelligence | Apr-22-2021

Federated learning (FL) is an emerging paradigm for facilitating multiple organizations' data collaboration without revealing their private data to each other. Recently, vertical FL, where the participating organizations hold the same set of samples but with disjoint features and only one organization owns the labels, has received increased attention. This paper presents several feature inference attack methods to investigate the potential privacy leakages in the model prediction stage of vertical FL. The attack methods consider the most stringent setting, in which the adversary controls only the trained vertical FL model and the model predictions and relies on no background information. We first propose two specific attacks on the logistic regression (LR) and decision tree (DT) models, according to individual prediction output. We further design a general attack method based on multiple prediction outputs accumulated by the adversary to handle complex models, such as neural networks (NN) and random forest (RF) models. Experimental evaluations demonstrate the effectiveness of the proposed attacks and highlight the need for designing private mechanisms to protect the prediction outputs in vertical FL.

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICDE51399.2021.00023

2010.10152

Country:

North America > United States > California (0.14)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees

Roth, Aaron M., Liang, Jing, Manocha, Dinesh

arXiv.org Artificial Intelligence | Apr-21-2021

We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments with moving obstacles or targets. Our approach uses a deep reinforcement learning-based expert policy that is trained using a sim2real paradigm. In order to increase the reliability and handle the failure cases of the expert policy, we combine it with a policy extraction technique to transform the resulting policy into a decision tree format. The resulting decision tree has properties which we use to analyze and modify the policy and improve performance on navigation metrics including smoothness, frequency of oscillation, frequency of immobilization, and obstruction of target. We are able to modify the policy to address these imperfections without retraining, combining the learning power of deep learning with the control of domain-specific algorithms. We highlight the benefits of our algorithm in simulated environments and in navigating a Clearpath Jackal robot among moving pedestrians.

decision tree, node, robot, (13 more...)

arXiv.org Artificial Intelligence

2104.10818

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
