AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
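The quoted definition can be illustrated with a minimal sketch. The tree encoding and the attribute names (`raining`, `has_umbrella`) below are assumptions made up for this example; they do not come from the book.

```python
# A tree is either a bool leaf or a tuple (attribute, subtree_if_false, subtree_if_true).
# This encoding is an illustrative assumption, not a standard representation.

def decide(tree, example):
    """Follow the branching tests from the root until a yes/no leaf is reached."""
    while not isinstance(tree, bool):
        attribute, if_false, if_true = tree
        tree = if_true if example[attribute] else if_false
    return tree

# Hypothetical tree for the Boolean function (raining AND NOT has_umbrella).
STAY_HOME = ("raining", False, ("has_umbrella", True, False))

if __name__ == "__main__":
    print(decide(STAY_HOME, {"raining": True, "has_umbrella": False}))
    print(decide(STAY_HOME, {"raining": False, "has_umbrella": False}))
```

Each root-to-leaf path corresponds to one conjunction of attribute tests, which is why such a tree represents a Boolean function of its inputs.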

Top 10 Machine Learning Algorithms for Beginners Machine Learning Tutorial [Data Science]

#artificialintelligence | Nov-27-2019, 02:36:45 GMT

This Machine Learning Algorithms Tutorial video by Learnaholic India introduces what machine learning is, the main kinds of machine learning problems, and the key algorithms for them, with simple examples. The algorithms discussed in detail are Linear Regression, Logistic Regression, Decision Tree, Random Forest, and KNN. The video covers: 1) Types of Machine Learning Algorithms (00:25) 2) Supervised Learning Algorithms (00:30) 3) Unsupervised Learning Algorithms (1:59) 4) Reinforcement Learning Algorithms (3:38) 5) Top 10 Machine Learning Algorithms for Beginners (4:33). Towards the end, it shows how to prepare a dataset for model creation and validation and how to create a model using any machine learning algorithm.

algorithm, learning algorithm, machine learning algorithm, (6 more...)

Country: Asia > India (0.27)

Genre: Instructional Material > Course Syllabus & Notes (0.60)

Industry: Education (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.60)

Adaptive Estimation of Multivariate Piecewise Polynomials and Bounded Variation Functions by Optimal Decision Trees

Chatterjee, Sabyasachi, Goswami, Subhajit

arXiv.org Machine Learning | Nov-26-2019

Proposed by Donoho (1997), Dyadic CART is a nonparametric regression method which computes a globally optimal dyadic decision tree and fits piecewise constant functions. In this article we define and study Dyadic CART and a closely related estimator, namely Optimal Regression Tree (ORT), in the context of estimating piecewise smooth functions in general dimensions. More precisely, these optimal decision tree estimators fit piecewise polynomials of any given degree. Like Dyadic CART in two dimensions, we reason that these estimators can also be computed in polynomial time in the sample size via dynamic programming. We prove oracle inequalities for the finite sample risk of Dyadic CART and ORT which imply tight risk bounds for several function classes of interest. Firstly, they imply that the finite sample risk of ORT of order $r \geq 0$ is always bounded by $C k \frac{\log N}{N}$ ($N$ is the sample size) whenever the regression function is piecewise polynomial of degree $r$ on some reasonably regular axis aligned rectangular partition of the domain with at most $k$ rectangles. Beyond the univariate case, such guarantees are scarcely available in the literature for computationally efficient estimators. Secondly, our oracle inequalities uncover optimality and adaptivity of the Dyadic CART estimator for function spaces with bounded variation. We consider two function spaces of recent interest where multivariate total variation denoising and univariate trend filtering are the state of the art methods. We show that Dyadic CART enjoys certain advantages over these estimators while still maintaining all their known guarantees.
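As a rough illustration of the idea in the abstract, the following is a minimal one-dimensional sketch of a dyadic, piecewise constant (degree 0) fit. It is not the authors' estimator, which covers general dimensions and polynomial degrees. The function name `dyadic_cart` and the per-piece penalty `lam` are assumptions of this example, and the input length is assumed to be a power of two.

```python
def dyadic_cart(y, lam):
    """Return (penalized cost, fitted values) of the best dyadic piecewise constant fit.

    Cost of a partition = sum of squared errors + lam * (number of pieces).
    Assumes len(y) is a power of two; lam is a hypothetical tuning parameter.
    """
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("len(y) must be a power of two")

    def solve(lo, hi):
        # Option 1: fit a single constant (the mean) on y[lo:hi].
        segment = y[lo:hi]
        mean = sum(segment) / len(segment)
        leaf_cost = sum((v - mean) ** 2 for v in segment) + lam
        if hi - lo == 1:
            return leaf_cost, [mean]
        # Option 2: split at the dyadic midpoint and solve each half optimally.
        mid = (lo + hi) // 2
        left_cost, left_fit = solve(lo, mid)
        right_cost, right_fit = solve(mid, hi)
        if left_cost + right_cost < leaf_cost:
            return left_cost + right_cost, left_fit + right_fit
        return leaf_cost, [mean] * (hi - lo)

    return solve(0, n)

if __name__ == "__main__":
    signal = [0, 0, 0, 0, 5, 5, 5, 5]
    print(dyadic_cart(signal, lam=1.0))
    print(dyadic_cart(signal, lam=1000.0))
```

In one dimension the dyadic intervals form a binary tree, so each subproblem is solved once by this bottom-up recursion. In higher dimensions the same rectangle can be reached by different split orders, so a memoized table over rectangles would be needed, which this sketch does not attempt.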

artificial intelligence, estimator, machine learning, (18 more...)

1911.11562

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Single Sample Feature Importance: An Interpretable Algorithm for Low-Level Feature Analysis

Gatto, Joseph, Lanka, Ravi, Iwashita, Yumi, Stoica, Adrian

arXiv.org Machine Learning | Nov-26-2019

Have you ever wondered how your feature space is impacting the prediction of a specific sample in your dataset? In this paper, we introduce Single Sample Feature Importance (SSFI), which is an interpretable feature importance algorithm that allows for the identification of the most important features that contribute to the prediction of a single sample. When a dataset can be learned by a Random Forest classifier or regressor, SSFI shows how the Random Forest's prediction path can be utilized for low-level feature importance calculation. SSFI results in a relative ranking of features, highlighting those with the greatest impact on a data point's prediction. We demonstrate these results both numerically and visually on four different datasets.
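To make the notion of a per-sample prediction path concrete, here is a small sketch that counts how often each feature is tested while one sample is routed through a hand-built forest. This is only a simple path-counting illustration and should not be read as the SSFI scoring rule, which the abstract does not spell out. The node encoding and the feature names are assumptions of the example.

```python
from collections import Counter

# Node encoding (an assumption of this sketch):
#   ("leaf", prediction) or ("split", feature, threshold, left_child, right_child)

def path_features(node, x):
    """Return the features tested while routing sample x from the root to a leaf."""
    used = []
    while node[0] == "split":
        _, feature, threshold, left, right = node
        used.append(feature)
        node = left if x[feature] <= threshold else right
    return used

def path_feature_counts(forest, x):
    """Count, over all trees, how often each feature appears on the path of sample x."""
    counts = Counter()
    for tree in forest:
        counts.update(path_features(tree, x))
    return counts

# Hypothetical two-tree forest over the made-up features "age" and "income".
FOREST = [
    ("split", "age", 30, ("leaf", 0),
     ("split", "income", 50, ("leaf", 0), ("leaf", 1))),
    ("split", "income", 40, ("leaf", 0), ("leaf", 1)),
]

if __name__ == "__main__":
    print(path_feature_counts(FOREST, {"age": 45, "income": 60}).most_common())
```

Two samples can yield different counts from the same forest, which is the sense in which such a ranking is specific to a single data point rather than global.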

dataset, prediction, ssfi, (14 more...)

1911.11901

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom (0.04)
Europe > Ireland (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Neural Random Forest Imitation

Reinders, Christoph, Rosenhahn, Bodo

arXiv.org Machine Learning | Nov-25-2019

Existing methods produce very inefficient architectures and do not scale. In this paper, we introduce a new method for generating data from a random forest and learning a neural network that imitates it. Without any additional training data, this transformation creates very efficient neural networks that learn the decision boundaries of a random forest. The generated model is fully differentiable and can be combined with the feature extraction in a single pipeline enabling further end-to-end processing. Experiments on several real-world benchmark datasets demonstrate outstanding performance in terms of scalability, accuracy, and learning with very few training examples. Compared to state-of-the-art mappings, we significantly reduce the network size while achieving the same or even improved accuracy due to better generalization.

decision tree, neural network, random forest, (14 more...)

1911.10829

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Lower Saxony > Hanover (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Tuning Random Forest on Time Series Data | STATWORX

#artificialintelligence | Nov-23-2019, 12:31:40 GMT

I am a data scientist at STATWORX, and I enjoy making data make sense.

hyperparameter, time series data, time sery, (13 more...)

Country:

North America > United States > New York (0.05)
Europe > Switzerland > Zürich > Zürich (0.05)
Europe > Austria > Vienna (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.44)

Imputing missing values with unsupervised random trees

Cortes, David

arXiv.org Machine Learning | Nov-21-2019

When designing statistical models from tabular data for supervised learning tasks such as regression or classification, oftentimes it happens that some of the observations available for fitting such models are missing values in one or more variables, usually due to reasons such as poor data collection practices, loss of information, participants dropping out of a survey, or similar. Many methods such as [2] or [4] overcome this issue by using heuristics to handle missing information - decision tree methods in particular, due to their splitting nature that takes one variable at a time, are particularly well suited for implicit handling of missing data without a-priori imputation ([16]), but other methods such as generalized linear models or support vector machines cannot handle missing values in the same way, and when using them on a dataset with missing entries, these entries have to either be dropped or imputed. Typical strategies for imputing the missing entries include: replacing them with the column mean or median, determining the most similar observations (nearest neighbors) according to the non-missing variables and taking a simple or weighted average of the missing variable(s) from them ([11]), producing a latent representation of the data by some low-rank matrix factorization that minimizes errors on the non-missing entries and from which the missing entries are then reconstructed ([10]), and iterative imputation that starts with some basic imputation for all values and then cycles through each variable by constructing a model to predict the missing values from the non-missing observations, replacing the earlier imputation with the model prediction and repeating until convergence ([3], [18]).
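Two of the typical strategies listed above, column-mean replacement and nearest-neighbor averaging, can be sketched as follows. This is a simplified illustration and not the paper's tree-based method. The function names, the use of `None` for missing entries, and the distance definition (root mean squared difference over columns observed in both rows) are assumptions of the example.

```python
import math

def impute_column_mean(rows):
    """Replace None entries by the mean of the observed values in the same column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def impute_knn(rows, k=2):
    """Replace None entries by the simple average over the k nearest donor rows.

    A donor is any other row that observes the missing column; at least one
    donor is assumed to exist for each missing entry.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common columns: treat as maximally distant
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            result[i][j] = sum(v for _, v in donors) / len(donors)
    return result

if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [100.0, 80.0]]
    print(impute_column_mean(data)[1])
    print(impute_knn(data, k=2)[1])
```

On this toy data the column mean is pulled towards the outlying fourth row, while the nearest-neighbor average uses only the two rows closest in the observed column, which illustrates why the choice of strategy matters.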

faircutforest, imputation, iterative, (16 more...)

1911.06646

Country: North America > United States > California (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Response Transformation and Profit Decomposition for Revenue Uplift Modeling

Gubela, Robin M., Lessmann, Stefan, Jaroszewicz, Szymon

arXiv.org Machine Learning | Nov-20-2019

Uplift models support decision-making in marketing campaign planning. Estimating the causal effect of a marketing treatment, an uplift model facilitates targeting communication to responsive customers and efficient allocation of marketing budgets. Research into uplift models focuses on conversion models to maximize incremental sales. The paper introduces uplift modeling strategies for maximizing incremental revenues. If customers differ in their spending behavior, revenue maximization is a more plausible business objective compared to maximizing conversions. The proposed methodology entails a transformation of the prediction target, customer-level revenues, that facilitates implementing a causal uplift model using standard machine learning algorithms. The distribution of campaign revenues is typically zero-inflated because of many non-buyers. Remedies to this modeling challenge are incorporated in the proposed revenue uplift strategies in the form of two-stage models. Empirical experiments using real-world e-commerce data confirm the merits of the proposed revenue uplift strategy over relevant alternatives including uplift models for conversion and recently developed causal machine learning algorithms. To quantify the degree to which improved targeting decisions raise return on marketing, the paper develops a decomposition of campaign profit. Applying the decomposition to a digital coupon targeting campaign, the paper provides evidence that revenue uplift modeling, as well as causal machine learning, can improve campaign profit substantially.
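The abstract does not state the transformation itself. As one well-known example of the general idea, the sketch below uses the standard transformed outcome Y* = Y (W - p) / (p (1 - p)), where Y is revenue, W is the treatment indicator, and p is the treatment probability under random assignment. It may differ from the transformations the paper proposes, and the function name and toy data are assumptions of this example.

```python
def transformed_outcome(revenue, treated, p_treat):
    """Return Y* = Y * (W - p) / (p * (1 - p)).

    Under random assignment with treatment probability p, the expectation of Y*
    equals the average treatment effect on revenue, so a standard regressor
    trained on Y* targets the uplift directly.
    """
    w = 1.0 if treated else 0.0
    return revenue * (w - p_treat) / (p_treat * (1.0 - p_treat))

# Hypothetical campaign data: (revenue, treated) pairs with p_treat = 0.5.
CUSTOMERS = [
    (20.0, True), (0.0, True), (40.0, True), (0.0, True),
    (10.0, False), (0.0, False), (0.0, False), (10.0, False),
]

if __name__ == "__main__":
    targets = [transformed_outcome(y, w, 0.5) for y, w in CUSTOMERS]
    print(sum(targets) / len(targets))
```

Many zero revenues appear even in this toy data, which mirrors the zero-inflation the abstract mentions and motivates the two-stage models it describes.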

algorithm, customer, uplift model, (13 more...)

doi: 10.1016/j.ejor.2019.11.030

1911.08729

Country:

North America > United States > New York (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > District of Columbia > Washington (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Marketing (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)
Information Technology > Services (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)

LionForests: Local Interpretation of Random Forests through Path Selection

Mollas, Ioannis, Tsoumakas, Grigorios, Bassiliades, Nick

arXiv.org Artificial Intelligence | Nov-20-2019

Towards a future where machine learning systems will integrate into every aspect of people's lives, researching methods to interpret such systems is necessary, instead of focusing exclusively on enhancing their performance. Enriching the trust between these systems and people will accelerate this integration process. Many medical and retail banking/finance applications use state-of-the-art machine learning techniques to predict certain aspects of new instances. Tree ensembles, like random forests, are widely acceptable solutions on these tasks, while at the same time they are avoided due to their black-box uninterpretable nature, creating an unreasonable paradox. In this paper, we provide a sequence of actions for shedding light on the predictions of the misjudged family of tree ensemble algorithms. Using classic unsupervised learning techniques and an enhanced similarity metric, to wander among transparent trees inside a forest following breadcrumbs, the interpretable essence of tree ensembles arises. An explanation provided by these systems using our approach, which we call "LionForests", can be a simple, comprehensive rule.

association rule, explanation, prediction, (14 more...)

1911.0878

Country: