AITopics | Decision Tree Learning

Collaborating Authors

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.

News Overviews Instructional Materials AI-Alerts Classics

Expert-augmented machine learning

#artificialintelligenceFeb-21-2020, 15:46:22 GMT

Machine learning is increasingly used across fields to derive insights from data, which further our understanding of the world and help us anticipate the future. The performance of predictive modeling is dependent on the amount and quality of available data. In practice, we rely on human experts to perform certain tasks and on machine learning for others. However, the optimal learning strategy may involve combining the complementary strengths of humans and machines. We present expert-augmented machine learning, an automated way to automatically extract problem-specific human expert knowledge and integrate it with machine learning to build robust, dependable, and data-efficient predictive models. Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert.

clinician, dataset, knowledge, (16 more...)

#artificialintelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.46)

Add feedback

Learning Optimal Classification Trees: Strong Max-Flow Formulations

Aghaei, Sina, Gomez, Andres, Vayanos, Phebe

arXiv.org Machine LearningFeb-21-2020

We consider the problem of learning optimal binary classification trees. Literature on the topic has burgeoned in recent years, motivated both by the empirical suboptimality of heuristic approaches and the tremendous improvements in mixed-integer programming (MIP) technology. Yet, existing approaches from the literature do not leverage the power of MIP to its full extent. Indeed, they rely on weak formulations, resulting in slow convergence and large optimality gaps. To fill this gap in the literature, we propose a flow-based MIP formulation for optimal binary classification trees that has a stronger linear programming relaxation. Our formulation presents an attractive decomposable structure. We exploit this structure and max-flow/min-cut duality to derive a Benders' decomposition method, which scales to larger instances. We conduct extensive computational experiments on standard benchmark datasets on which we show that our proposed approaches are 50 times faster than state-of-the art MIP-based techniques and improve out of sample performance up to 13.8%.

constraint, formulation, latexit sha1, (16 more...)

arXiv.org Machine Learning

2002.09142

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Learning Global Transparent Models from Local Contrastive Explanations

Pedapati, Tejaswini, Balakrishnan, Avinash, Shanmugam, Karthikeyan, Dhurandhar, Amit

arXiv.org Artificial IntelligenceFeb-19-2020

There is a rich and growing literature on producing local point wise contrastive/counterfactual explanations for complex models. These methods highlight what is important to justify the classification and/or produce a contrast point that alters the final classification. Other works try to build globally interpretable models like decision trees and rule lists directly by efficient model search using the data or by transferring information from a complex model using distillation-like methods. Although these interpretable global models can be useful, they may not be consistent with local explanations from a specific complex model of choice. In this work, we explore the question: Can we produce a transparent global model that is consistent with/derivable from local explanations? Based on a key insight we provide a novel method where every local contrastive/counterfactual explanation can be turned into a Boolean feature. These Boolean features are sparse conjunctions of binarized features. The dataset thus constructed is consistent with local explanations by design and one can train an interpretable model like a decision tree on it. We note that this approach strictly loses information due to reliance only on sparse local explanations, nonetheless, we demonstrate empirically that in many cases it can still be competitive with respect to the complex model's performance and also other methods that learn directly from the original dataset. Our approach also provides an avenue to benchmark local explanation methods in a quantitative manner.

compactness 0, concavity 0, explanation, (16 more...)

arXiv.org Artificial Intelligence

2002.08247

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)

Add feedback

Federated Extra-Trees with Privacy Preserving

Liu, Yang, Chen, Mingxin, Zhang, Wenxi, Zhang, Junbo, Zheng, Yu

arXiv.org Machine LearningFeb-17-2020

It is commonly observed that the data are scattered everywhere and difficult to be centralized. The data privacy and security also become a sensitive topic. The laws and regulations such as the European Union's General Data Protection Regulation (GDPR) are designed to protect the public's data privacy. However, machine learning requires a large amount of data for better performance, and the current circumstances put deploying real-life AI applications in an extremely difficult situation. To tackle these challenges, in this paper we propose a novel privacy-preserving federated machine learning model, named Federated Extra-Trees, which applies local differential privacy in the federated trees model. A secure multi-institutional machine learning system was developed to provide superior performance by processing the modeling jointly on different clients without exchanging any raw data. We have validated the accuracy of our work by conducting extensive experiments on public datasets and the efficiency and robustness were also verified by simulating the real-world scenarios. Overall, we presented an extensible, scalable and practical solution to handle the data island problem.

differential privacy, federated extra-tree, privacy, (16 more...)

arXiv.org Machine Learning

2002.07323

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Virginia (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.66)

Add feedback

Empirical Study on Airline Delay Analysis and Prediction

Patgiri, Ripon, Hussain, Sajid, Nongmeikapam, Aditya

arXiv.org Machine LearningFeb-17-2020

The Big Data analytics are a logical analysis of very large scale datasets. The data analysis enhances an organization and improve the decision making process. In this article, we present Airline Delay Analysis and Prediction to analyze airline datasets with the combination of weather dataset. In this research work, we consider various attributes to analyze flight delay, for example, day-wise, airline-wise, cloud cover, temperature, etc. Moreover, we present rigorous experiments on various machine learning model to predict correctly the delay of a flight, namely, logistic regression with L2 regularization, Gaussian Naive Bayes, K-Nearest Neighbors, Decision Tree classifier and Random forest model. The accuracy of the Random Forest model is 82% with a delay threshold of 15 minutes of flight delay. The analysis is carried out using dataset from 1987 to 2008, the training is conducted with dataset from 2000 to 2007 and validated prediction result using 2008 data. Moreover, we have got recall 99% in the Random Forest model.

dataset, delay analysis, prediction, (12 more...)

arXiv.org Machine Learning

2002.10254

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > India (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.50)
Research Report > Experimental Study (0.36)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Implementation of Decision Trees In Python

#artificialintelligenceFeb-16-2020, 04:47:11 GMT

In the previous article, we studied Multiple Linear Regression. One thing that I believe is that if we can correlate anything with us or our lives, there are greater chances of understanding the concept. So I will try to explain everything by relating it to humans. A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements.

dataset, decision tree, play tennis, (11 more...)

#artificialintelligence

Country:

Asia > India (0.04)
Asia > Bangladesh (0.04)

Industry:

Leisure & Entertainment (0.52)
Media > Music (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Machine Learning for Finance in Python

#artificialintelligenceFeb-15-2020, 13:16:25 GMT

Time series data is all around us; some examples are the weather, human behavioral patterns as consumers and members of society, and financial data. In this course, you'll learn how to calculate technical indicators from historical stock data, and how to create features and targets out of the historical stock data. You'll understand how to prepare our features for linear models, xgboost models, and neural network models. We will then use linear models, decision trees, random forests, and neural networks to predict the future price of stocks in the US markets. You will also learn how to evaluate the performance of the various models we train in order to optimize them, so our predictions have enough accuracy to make a stock trading strategy profitable.

historical stock data, machine learning, python, (1 more...)

#artificialintelligence

Country: North America > United States (0.32)

Genre: Instructional Material > Course Syllabus & Notes (0.97)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

Add feedback

(RF) 2 -- Random Forest Random Field

Payet, Nadia, Todorovic, Sinisa

Neural Information Processing SystemsFeb-15-2020, 02:57:31 GMT

We combine random forest (RF) and conditional random field (CRF) into a new computational framework, called random forest random field (RF) 2. Inference of (RF) 2 uses the Swendsen-Wang cut algorithm, characterized by Metropolis-Hastings jumps. A jump from one state to another depends on the ratio of the proposal distributions, and on the ratio of the posterior distributions of the two states. Prior work typically resorts to a parametric estimation of these four distributions, and then computes their ratio. Our key idea is to instead directly estimate these ratios using RF. RF collects in leaf nodes of each decision tree the class histograms of training examples.

class histogram, image region, random forest random field, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.88)

Add feedback

Multivariate Dyadic Regression Trees for Sparse Learning Problems

Liu, Han, Chen, Xi

Neural Information Processing SystemsFeb-15-2020, 02:14:22 GMT

We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smoothness of the true regression functions, and achieve the nearly optimal rates of convergence (in a minimax sense) for the class of $(\alpha, C)$-smooth functions. Empirically, MDRTs can simultaneously conduct function estimation and variable selection in high dimensions. To make MDRTs applicable for large-scale learning problems, we propose a greedy heuristics.

mdrt, multivariate dyadic regression tree, sparse learning problem

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Add feedback

Graph-Valued Regression

Liu, Han, Chen, Xi, Wasserman, Larry, Lafferty, John D.

Neural Information Processing SystemsFeb-15-2020, 02:13:38 GMT

In many applications, it is of interest to model $Y$ given another random vector $X$ as input. We refer to the problem of estimating the graph $G(x)$ of $Y$ conditioned on $X x$ as graph-valued regression''. In this paper, we propose a semiparametric method for estimating $G(x)$ that builds a tree on the $X$ space just as in CART (classification and regression trees), but at each leaf of the tree estimates a graph. We call the method Graph-optimized CART'', or Go-CART. We study the theoretical properties of Go-CART using dyadic partitioning trees, establishing oracle inequalities on risk minimization and tree partition consistency.

go-cart, graph-valued regression, random vector, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.65)

Add feedback