AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
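To make the quoted idea concrete, here is a minimal sketch of a decision tree as a Boolean function. The class names, the `decide` helper, and the example attributes (`raining`, `has_umbrella`) are hypothetical illustrations and do not come from the textbook.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    decision: bool  # the yes/no output stored at a leaf


@dataclass
class Node:
    attribute: str      # name of the Boolean property tested at this node
    if_true: "Tree"     # subtree followed when the property holds
    if_false: "Tree"    # subtree followed otherwise


Tree = Union[Leaf, Node]


def decide(tree: Tree, example: Dict[str, bool]) -> bool:
    """Follow one branching test per level until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.if_true if example[tree.attribute] else tree.if_false
    return tree.decision


# Hypothetical tree encoding the Boolean function: raining AND NOT has_umbrella.
stay_home = Node(
    "raining",
    if_true=Node("has_umbrella", if_true=Leaf(False), if_false=Leaf(True)),
    if_false=Leaf(False),
)
```

Each root-to-leaf path corresponds to a conjunction of tests, so the whole tree is a disjunction of the paths that end in a "yes" leaf.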

Watermarking Decision Tree Ensembles

Calzavara, Stefano, Cazzaro, Lorenzo, Gera, Donald, Orlando, Salvatore

arXiv.org Artificial Intelligence, Oct-6-2024

Protecting the intellectual property of machine learning models is a hot topic and many watermarking schemes for deep neural networks have been proposed in the literature. Unfortunately, prior work largely neglected the investigation of watermarking techniques for other types of models, including decision tree ensembles, which are a state-of-the-art model for classification tasks on non-perceptual data. In this paper, we present the first watermarking scheme designed for decision tree ensembles, focusing in particular on random forest models. We discuss watermark creation and verification, presenting a thorough security analysis with respect to possible attacks. We finally perform an experimental evaluation of the proposed scheme, showing excellent results in terms of accuracy and security against the most relevant threats.

attacker, ensemble, signature, (16 more...)

2410.0457

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Italy > Veneto > Venice (0.05)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(11 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Application of AI in Credit Risk Scoring for Small Business Loans: A case study on how AI-based random forest model improves a Delphi model outcome in the case of Azerbaijani SMEs

Karimova, Nigar

arXiv.org Artificial Intelligence, Oct-5-2024

The research investigates how the application of a machine-learning random forest model improves the accuracy and precision of a Delphi model. The context of the research is Azerbaijani SMEs, and the data for the study was obtained from a financial institution that had gathered it from the enterprises (as there is no public data on local SMEs, it was not practical to verify the data independently). The research compared the two models on accuracy, precision, recall and F-1 scores, with the algorithms run in Python. The findings showed that accuracy, precision, recall and F-1 all improve considerably (from 0.69 to 0.83, from 0.65 to 0.81, from 0.56 to 0.77 and from 0.58 to 0.79, respectively). The implication is that by applying AI models in credit risk modeling, financial institutions can improve the accuracy of identifying potential defaulters, which would reduce their credit risk. In addition, unfair rejections of credit access for SMEs would also go down, contributing significantly to economic growth. Finally, ethical issues such as the transparency of algorithms and biases in historical data should be taken into account when making decisions based on AI algorithms, in order to reduce mechanical dependence on algorithms that cannot be justified in practice.
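For readers unfamiliar with the four scores compared above, the following is a minimal sketch of how they are computed for a binary classifier. The function name and the label convention (1 = defaulter, 0 = non-defaulter) are assumptions for illustration; the study's actual code is not shown in the abstract.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Precision penalizes false alarms (creditworthy firms flagged as defaulters), while recall penalizes missed defaulters; F-1 is their harmonic mean.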

artificial intelligence, machine learning, random forest model, (16 more...)

2410.0533

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Energy > Oil & Gas (1.00)
Banking & Finance > Credit (1.00)
Banking & Finance > Risk Management (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Bootstrap Sampling Rate Greater than 1.0 May Improve Random Forest Performance

Kaźmierczak, Stanisław, Mańdziuk, Jacek

arXiv.org Machine Learning, Oct-5-2024

Random forests utilize bootstrap sampling to create an individual training set for each component tree. This involves sampling with replacement, with the number of instances equal to the size of the original training set (N). Research literature indicates that drawing fewer than N observations can also yield satisfactory results. The ratio of the number of observations in each bootstrap sample to the total number of training instances is called the bootstrap rate (BR). Sampling more than N observations (BR > 1) has been explored in the literature only to a limited extent and has generally proven ineffective. In this paper, we re-examine this approach using 36 diverse datasets and consider BR values ranging from 1.2 to 5.0. Contrary to previous findings, we show that such parameterization can result in statistically significant improvements in classification accuracy compared to standard settings (BR ≤ 1). Furthermore, we investigate what the optimal BR depends on and conclude that it is more a property of the dataset than a dependence on the random forest hyperparameters. Finally, we develop a binary classifier to predict whether the optimal BR is ≤ 1 or > 1 for a given dataset, achieving between 81.88% and 88.81% accuracy, depending on the experiment configuration. The random forest (RF) algorithm, introduced by Breiman (2001), is an ensemble of decision trees (DTs) that collectively make decisions using either majority or soft voting. RF reduces variance, sometimes at the cost of slightly increasing bias, by introducing two sources of randomness.
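The bootstrap rate defined above can be illustrated with a short sketch. The helper names `bootstrap_indices` and `unique_fraction` are hypothetical, and rounding BR × N to the nearest integer is an assumption; the paper's own implementation is not shown in the abstract.

```python
import random
from typing import List


def bootstrap_indices(n: int, bootstrap_rate: float, rng: random.Random) -> List[int]:
    """Draw round(bootstrap_rate * n) training-set indices with replacement."""
    sample_size = round(bootstrap_rate * n)
    return [rng.randrange(n) for _ in range(sample_size)]


def unique_fraction(indices: List[int], n: int) -> float:
    """Share of the original n instances that appear at least once in the sample."""
    return len(set(indices)) / n


rng = random.Random(0)
n = 1000
standard = bootstrap_indices(n, 1.0, rng)     # BR = 1: the classic bootstrap
oversampled = bootstrap_indices(n, 2.0, rng)  # BR = 2: twice as many draws
```

With BR = 1, roughly 1 - 1/e (about 63%) of the original instances are expected to appear in each tree's sample; raising BR above 1 increases that coverage while still sampling with replacement.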

configuration, dataset, rf configuration, (16 more...)

2410.04297

Country: Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Therapeutic Area (0.98)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Comparative study of regression vs pairwise models for surrogate-based heuristic optimisation

Naharro, Pablo S., Toharia, Pablo, LaTorre, Antonio, Peña, José-María

arXiv.org Artificial Intelligence, Oct-4-2024

Heuristic optimisation algorithms explore the search space by sampling solutions, evaluating their fitness, and biasing the search in the direction of promising solutions. However, in many cases, this fitness function involves executing expensive computational calculations, drastically reducing the reasonable number of evaluations. In this context, surrogate models have emerged as an excellent alternative to alleviate these computational problems. This paper addresses the formulation of surrogate problems as both regression models that approximate fitness (surface surrogate models) and a novel way to connect classification models (pairwise surrogate models). The pairwise approach can be directly exploited by some algorithms, such as Differential Evolution, in which the fitness value is not actually needed to drive the search, and it is sufficient to know whether a solution is better than another one or not. Based on these modelling approaches, we have conducted a multidimensional analysis of surrogate models under different configurations: different machine learning algorithms (regularised regression, neural networks, decision trees, boosting methods, and random forests), different surrogate strategies (encouraging diversity or relaxing prediction thresholds), and compare them for both surface and pairwise surrogate models. The experimental part of the article includes the benchmark problems already proposed for the SOCO2011 competition in continuous optimisation and a simulation problem included in the recent GECCO2021 Industrial Challenge. This paper shows that the performance of the overall search, when using online machine learning-based surrogate models, depends not only on the accuracy of the predictive model but also on both the kind of bias towards positive or negative cases and how the optimisation uses those predictions to decide whether to execute the actual fitness function.
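As a rough illustration of the pairwise formulation described above, the sketch below turns a set of already-evaluated solutions into a binary classification dataset. The function name, the concatenated feature encoding, and the minimisation convention are assumptions made for this example; the abstract does not specify how the paper encodes pairs.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def make_pairwise_dataset(
    solutions: Sequence[Sequence[float]],
    fitness: Sequence[float],
) -> Tuple[List[List[float]], List[int]]:
    """Build (features, labels) where each row describes an ordered pair (a, b).

    The label is 1 when solution a has strictly lower (better) fitness than b,
    assuming a minimisation problem.
    """
    features: List[List[float]] = []
    labels: List[int] = []
    for i, j in permutations(range(len(solutions)), 2):
        features.append(list(solutions[i]) + list(solutions[j]))  # concatenate a and b
        labels.append(1 if fitness[i] < fitness[j] else 0)
    return features, labels
```

A classifier trained on such pairs only needs to answer "is the trial better than its parent?", which is the comparison Differential Evolution uses in its selection step, so no fitness value has to be predicted.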

algorithm, diver, surrogate model, (14 more...)

doi: 10.1016/j.swevo.2022.101176

2410.03409

Country:

Europe > Spain > Galicia > Madrid (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)

Industry:

Education (0.67)
Health & Medicine (0.46)
Construction & Engineering (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
(3 more...)

Label Distribution Learning Forests

Wei Shen, Kai Zhao, Yilu Guo, Alan L. Yuille

Neural Information Processing Systems, Oct-3-2024, 14:47:27 GMT

Label distribution learning (LDL) is a general learning framework, which assigns to an instance a distribution over a set of labels rather than a single label or multiple labels. Current LDL methods have either restricted assumptions on the expression form of the label distribution or limitations in representation learning, e.g., to learn deep features in an end-to-end manner. This paper presents label distribution learning forests (LDLFs) - a novel label distribution learning algorithm based on differentiable decision trees, which have several advantages: 1) Decision trees have the potential to model any general form of label distributions by a mixture of leaf node predictions.
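The phrase "mixture of leaf node predictions" can be illustrated with a small soft decision tree. This is only a sketch under stated assumptions: sigmoid routing at each split node, a complete binary tree stored level by level, and the names `leaf_probabilities` and `predict_distribution` are all hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Split = Tuple[Sequence[float], float]  # (weight vector, bias) of one split node


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def leaf_probabilities(x: Sequence[float], splits: Sequence[Split]) -> List[float]:
    """Probability of reaching each leaf of a complete binary tree.

    `splits` lists the split nodes level by level, so its length must be 2**depth - 1.
    """
    frontier = [1.0]  # all probability mass starts at the root
    idx = 0
    while idx < len(splits):
        next_frontier: List[float] = []
        for mass in frontier:
            weights, bias = splits[idx]
            go_left = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            next_frontier += [mass * go_left, mass * (1.0 - go_left)]
            idx += 1
        frontier = next_frontier
    return frontier


def predict_distribution(
    x: Sequence[float],
    splits: Sequence[Split],
    leaf_distributions: Sequence[Sequence[float]],
) -> List[float]:
    """Mix the per-leaf label distributions, weighted by the routing probabilities."""
    reach = leaf_probabilities(x, splits)
    n_labels = len(leaf_distributions[0])
    return [
        sum(p * dist[c] for p, dist in zip(reach, leaf_distributions))
        for c in range(n_labels)
    ]
```

Because every routing decision is a smooth function of the input, the output distribution is differentiable with respect to the split parameters, which is what allows such trees to be trained end to end.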

decision tree, label distribution, learning, (13 more...)

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

LightGBM: A Highly Efficient Gradient Boosting Decision Tree

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu

Neural Information Processing Systems, Oct-3-2024, 12:59:16 GMT

Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, their efficiency and scalability are still unsatisfactory when the feature dimension is high and the data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain a quite accurate estimate of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., features that rarely take nonzero values simultaneously) to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve a quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much).
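The GOSS idea can be sketched in a few lines. The version below follows the procedure described in the LightGBM paper (keep the top fraction `a` of instances by absolute gradient, randomly sample a fraction `b` of the data from the remainder, and re-weight the sampled instances by (1 - a) / b); the function name, the use of `round`, and the dictionary return type are assumptions for this example, and this is not LightGBM's actual code.

```python
import random
from typing import Dict, Sequence


def goss_sample(
    gradients: Sequence[float], a: float, b: float, rng: random.Random
) -> Dict[int, float]:
    """Return {instance index: weight} for the instances kept by GOSS."""
    n = len(gradients)
    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_n = round(a * n)
    rand_n = round(b * n)

    top = order[:top_n]    # large-gradient instances are always kept
    rest = order[top_n:]   # small-gradient instances are subsampled
    sampled = rng.sample(rest, min(rand_n, len(rest)))

    weights = {i: 1.0 for i in top}
    # Amplify the sampled small-gradient instances so the kept set still
    # approximates the original data distribution.
    weights.update({i: (1.0 - a) / b for i in sampled})
    return weights
```

With a = 0.2 and b = 0.1, only 30% of the instances are scanned when estimating the information gain, yet the weights of the kept instances still sum to the original data size.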

algorithm, dataset, gradient, (15 more...)

Country:

Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Variable Importance Using Decision Trees

Jalil Kazemitabar, Arash Amini, Adam Bloniarz, Ameet S. Talwalkar

Neural Information Processing Systems, Oct-3-2024, 10:45:36 GMT

dstump, probability, uncorrelated design, (12 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Cost efficient gradient boosting

Sven Peter, Ferran Diego, Fred A. Hamprecht, Boaz Nadler

Neural Information Processing Systems, Oct-3-2024, 10:00:38 GMT

Many applications require learning classifiers or regressors that are both accurate and cheap to evaluate. Prediction cost can be drastically reduced if the learned predictor is constructed such that on the majority of the inputs, it uses cheap features and fast evaluations. The main challenge is to do so with little loss in accuracy. In this work we propose a budget-aware strategy based on deep boosted regression trees. In contrast to previous approaches to learning with cost penalties, our method can grow very deep trees that on average are nonetheless cheap to compute. We evaluate our method on a number of datasets and find that it outperforms the current state of the art by a large margin. Our algorithm is easy to implement and its learning time is comparable to that of the original gradient boosting.

algorithm, cost penalty, gradient, (15 more...)

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)

Ranking Perspective for Tree-based Methods with Applications to Symbolic Feature Selection

Luo, Hengrui, Li, Meng

arXiv.org Machine Learning, Oct-3-2024

Tree-based methods are powerful nonparametric techniques in statistics and machine learning. However, their effectiveness, particularly in finite-sample settings, is not fully understood. Recent applications have revealed their surprising ability to distinguish transformations (which we call symbolic feature selection) that remain obscure under current theoretical understanding. This work provides a finite-sample analysis of tree-based methods from a ranking perspective. We link oracle partitions in tree methods to response rankings at local splits, offering new insights into their finite-sample behavior in regression and feature selection tasks. Building on this local ranking perspective, we extend our analysis in two ways: (i) We examine the global ranking performance of individual trees and ensembles, including Classification and Regression Trees (CART) and Bayesian Additive Regression Trees (BART), providing finite-sample oracle bounds, ranking consistency, and posterior contraction results. (ii) Inspired by the ranking perspective, we propose concordant divergence statistics $\mathcal{T}_0$ to evaluate symbolic feature mappings and establish their properties. Numerical experiments demonstrate the competitive performance of these statistics in symbolic feature selection tasks compared to existing methods.

partition, ranking perspective, selection, (14 more...)

2410.02623

Country:

North America > United States > Texas > Harris County > Houston (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.04)

Genre: Research Report (1.00)

Industry:

Government > Regional Government (0.45)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Maximum Margin Interval Trees

Alexandre Drouin, Toby Hocking, Francois Laviolette

Neural Information Processing Systems, Oct-2-2024, 22:39:46 GMT

Learning a regression function using censored or interval-valued output data is an important problem in fields such as genomics and medicine. The goal is to learn a real-valued prediction function, and the training output labels indicate an interval of possible values. Whereas most existing algorithms for this task are linear models, in this paper we investigate learning nonlinear tree models. We propose to learn a tree by minimizing a margin-based discriminative objective function, and we provide a dynamic programming algorithm for computing the optimal solution in log-linear time. We show empirically that this algorithm achieves state-of-the-art speed and prediction accuracy in a benchmark of several data sets.
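To show what a margin-based loss on interval-valued labels can look like, here is a minimal sketch of a linear hinge loss that is zero whenever the prediction lies inside the target interval by at least a margin. The function name and this exact form are assumptions for illustration; the paper's objective and its dynamic programming solver are not reproduced here.

```python
import math


def interval_hinge_loss(prediction: float, lower: float, upper: float, margin: float = 0.0) -> float:
    """Linear hinge penalty for predicting outside [lower + margin, upper - margin].

    Censored labels are expressed with infinite limits: use lower = -math.inf for
    left-censored targets and upper = math.inf for right-censored targets.
    """
    below = max(0.0, (lower + margin) - prediction)  # prediction too small
    above = max(0.0, prediction - (upper - margin))  # prediction too large
    return below + above
```

For example, with the interval [2, 8] and a margin of 1, any prediction between 3 and 7 incurs no loss, and the penalty grows linearly outside that range. A right-censored label such as `upper=math.inf` only penalizes predictions that are too small.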

algorithm, hinge loss, time complexity, (13 more...)

Country:

North America > Canada > Quebec > Montreal (0.28)
North America > United States > Iowa (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)