AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
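
As a rough illustration of the quoted definition, a small decision tree can be written directly as nested tests that end in a yes/no answer. The sketch below is only an example: the `Situation` struct, its three properties, and the `should_walk` function are hypothetical and do not come from the quoted text.

```c
#include <stdbool.h>

/* Hypothetical input properties, invented for this illustration. */
typedef struct {
    bool raining;
    bool has_umbrella;
    bool short_trip;
} Situation;

/* A hand-written decision tree: each `if` is a branching test on one
   property, and each `return` is a leaf holding the yes/no decision. */
bool should_walk(const Situation *s)
{
    if (!s->raining) {
        return true;            /* leaf: dry weather, walk */
    }
    if (s->has_umbrella) {
        return s->short_trip;   /* leaf decided by the final test */
    }
    return false;               /* leaf: raining and no umbrella */
}
```

Because every path from the root ends in `true` or `false`, the function as a whole is a Boolean function of its input properties, which is the sense in which the quote says decision trees represent Boolean functions.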

FLInt: Exploiting Floating Point Enabled Integer Arithmetic for Efficient Random Forest Inference

Hakert, Christian, Chen, Kuan-Hsun, Chen, Jian-Jia

arXiv.org Artificial Intelligence, Sep-9-2022

In many machine learning applications, e.g., tree-based ensembles, floating point numbers are extensively utilized due to their expressiveness. Data analysis on embedded devices from dynamic data masses is now becoming feasible, but such systems often lack the hardware capabilities to process floating point numbers, which introduces large overheads. Even if such hardware is present in general computing systems, using integer operations instead of floating point operations promises to reduce operation overheads and improve performance. In this paper, we provide FLInt, a full precision floating point comparison for random forests that uses only integer and logic operations. To ensure that the same functionality is preserved, we formally prove the correctness of this comparison. Since random forests only require comparison of floating point numbers during inference, we implement FLInt in low level realizations and therefore eliminate the need for floating point hardware entirely, while keeping the model accuracy unchanged. The usage of FLInt basically boils down to a one-by-one replacement of conditions: For instance, a comparison statement in C: if(pX[3]<=(float)10.074347) becomes if((*(((int*)(pX))+3))<=((int)(0x41213087))). Experimental evaluation on X86 and ARMv8 desktop and server class systems shows that the execution time can be reduced by up to approximately 30% with our novel approach.
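
The condition replacement mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the helper names (`float_bits`, `bits_to_float`, `split_float`, `split_int`) are hypothetical, `memcpy` is used in place of the pointer cast to avoid aliasing problems, and the sketch assumes a 32-bit IEEE 754 `float`, a positive threshold, and a feature value that is not NaN. Negative thresholds need additional sign handling that is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Copy the raw bits of a float into a signed 32-bit integer.
   Assumes float is a 32-bit IEEE 754 value. */
static int32_t float_bits(float x)
{
    int32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Inverse helper, used only to build the floating point reference. */
static float bits_to_float(int32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Threshold bit pattern quoted in the abstract. */
#define SPLIT_BITS ((int32_t)0x41213087)

/* Reference split test using a floating point comparison. */
bool split_float(float feature)
{
    return feature <= bits_to_float(SPLIT_BITS);
}

/* Same split test using only an integer comparison.
   Assumption: positive threshold, feature is not NaN. */
bool split_int(float feature)
{
    return float_bits(feature) <= SPLIT_BITS;
}
```

Under these assumptions the two functions agree because non-negative floats are ordered the same way as their bit patterns, and any negative float has its sign bit set, so its bit pattern read as a signed integer is negative and compares below the positive threshold.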

artificial intelligence, implementation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2209.04181

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.93)

An Assessment Tool for Academic Research Managers in the Third World

Delbianco, Fernando, Fioriti, Andres, Tohmé, Fernando

arXiv.org Machine Learning, Sep-7-2022

Academic and scientific research activities have a growing economic importance, requiring larger outlays and investments both in advanced and emerging nations. In both cases it is a matter of national prestige but more than that, of strategic relevance, since the presence of large numbers of highly educated citizens contributing to the advance of human knowledge has well-established impacts on technological and industrial capabilities. These, in turn, are highly relevant to ensure the competitiveness and the economic security of nations as well as yielding other benefits to the economy (Dasgupta and David (1994) and Salter and Martin (2001)). The care and promotion of research activities should thus be a focus of public policies aimed at ensuring social and economic development (Stephan (2012) and Etzkowitz (2013)). A particularly pressing issue in this matter is to assess the quality of the production generated by researchers in all fields of knowledge. On one hand, it is of interest to detect areas in which nationals have international impact, as to concentrate resources on them.

artificial intelligence, database, machine learning, (19 more...)

arXiv.org Machine Learning

2209.03199

Country:

South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Netherlands (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Government (0.34)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)

A Survey of Neural Trees

Li, Haoling, Song, Jie, Xue, Mengqi, Zhang, Haofei, Ye, Jingwen, Cheng, Lechao, Song, Mingli

arXiv.org Artificial Intelligence, Sep-7-2022

Neural networks (NNs) and decision trees (DTs) are both popular models of machine learning, yet they come with mutually exclusive advantages and limitations. To bring the best of the two worlds, a variety of approaches have been proposed to integrate NNs and DTs explicitly or implicitly. In this survey, these approaches are organized in a school which we term neural trees (NTs). This survey aims to present a comprehensive review of NTs and attempts to identify how they enhance model interpretability. We first propose a thorough taxonomy of NTs that expresses the gradual integration and co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their interpretability and performance, and suggest possible solutions to the remaining challenges. Finally, this survey concludes with a discussion about other considerations like conditional computation and promising directions for this field. A list of papers reviewed in this survey, along with their corresponding codes, is available at: https://github.com/zju-vipa/awesome-neural-trees

class hierarchy, ndt, neural network, (15 more...)

arXiv.org Artificial Intelligence

2209.03415

Country:

Asia > China (0.05)
Asia > Singapore > Central Region > Singapore (0.04)
North America > United States (0.04)
(3 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(6 more...)

Personalized Game Difficulty Prediction Using Factorization Machines

Kristensen, Jeppe Theiss, Guckelsberger, Christian, Burelli, Paolo, Hämäläinen, Perttu

arXiv.org Artificial Intelligence, Sep-6-2022

The accurate and personalized estimation of task difficulty provides many opportunities for optimizing user experience. However, user diversity makes such difficulty estimation hard, in that empirical measurements from some user sample do not necessarily generalize to others. In this paper, we contribute a new approach for personalized difficulty estimation of game levels, borrowing methods from content recommendation. Using factorization machines (FM) on a large dataset from a commercial puzzle game, we are able to predict difficulty as the number of attempts a player requires to pass future game levels, based on observed attempt counts from earlier levels and levels played by others. In addition to performance and scalability, FMs offer the benefit that the learned latent variable model can be used to study the characteristics of both players and game levels that contribute to difficulty. We compare the approach to a simple non-personalized baseline and a personalized prediction using Random Forests. Our results suggest that FMs are a promising tool enabling game designers to both optimize player experience and learn more about their players and the game.

artificial intelligence, machine learning, prediction, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3526113.3545624

2209.13495

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Finland (0.04)
Oceania > Australia > Western Australia > Perth (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
(3 more...)

Evolutionary bagging for ensemble learning

Ngo, Giang, Beard, Rodney, Chandra, Rohitash

arXiv.org Artificial Intelligence, Sep-5-2022

Ensemble learning has gained success in machine learning with major advantages over other learning methods. Bagging is a prominent ensemble learning method that creates subgroups of data, known as bags, that are trained by individual machine learning methods such as decision trees. Random forest is a prominent example of bagging with additional features in the learning process. Evolutionary algorithms have been prominent for optimisation problems and have also been used for machine learning. Evolutionary algorithms are gradient-free methods that work with a population of candidate solutions that maintain diversity for creating new solutions. In conventional bagged ensemble learning, the bags are created once and the content, in terms of the training examples, is fixed over the learning process. In our paper, we propose evolutionary bagged ensemble learning, where we utilise evolutionary algorithms to evolve the content of the bags in order to iteratively enhance the ensemble by providing diversity in the bags. The results show that our evolutionary ensemble bagging method outperforms conventional ensemble methods (bagging and random forests) for several benchmark datasets under certain constraints. We find that evolutionary bagging can inherently sustain a diverse set of bags without reduction in performance accuracy.
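
For readers unfamiliar with the conventional bagging that the abstract uses as its starting point, a bag is typically built by drawing training example indices at random with replacement. The sketch below shows only that conventional step, not the evolutionary method proposed in the paper. The function names `lcg_next` and `draw_bag`, the tiny linear congruential generator, and all parameters are hypothetical choices made to keep the example self-contained and deterministic.

```c
#include <stddef.h>
#include <stdint.h>

/* Small linear congruential generator so the sketch has no external
   dependencies; a real implementation would likely use a better RNG. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill `bag` with `bag_size` indices drawn (approximately uniformly,
   with replacement) from 0..n_samples-1. `n_samples` must be nonzero.
   Each bag would then be used to train one base learner. */
void draw_bag(size_t *bag, size_t bag_size, size_t n_samples,
              uint32_t *rng_state)
{
    for (size_t i = 0; i < bag_size; ++i) {
        /* Drop the low bits, which are the weakest in an LCG. */
        bag[i] = (size_t)(lcg_next(rng_state) >> 8) % n_samples;
    }
}
```

Calling `draw_bag` once per base learner with a shared `rng_state` yields bags whose contents stay fixed afterwards, which is the conventional setting the abstract contrasts with evolving the bag contents.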

algorithm, ensemble, evobagging, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.neucom.2022.08.055

2208.024

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Rectifying Mono-Label Boolean Classifiers

Coste-Marquis, Sylvie, Marquis, Pierre

arXiv.org Artificial Intelligence, Sep-5-2022

We elaborate on the notion of rectification of a Boolean classifier $\Sigma$. Given $\Sigma$ and some background knowledge $T$, postulates characterizing the way $\Sigma$ must be changed into a new classifier $\Sigma \star T$ that complies with $T$ have already been presented. We focus here on the specific case of mono-label Boolean classifiers, i.e., there is a single target concept and any instance is classified either as positive (an element of the concept), or as negative (an element of the complementary concept). In this specific case, our main contribution is twofold: (1) we show that there is a unique rectification operator $\star$ satisfying the postulates, and (2) when $\Sigma$ and $T$ are Boolean circuits, we show how a classification circuit equivalent to $\Sigma \star T$ can be computed in time linear in the size of $\Sigma$ and $T$; when $\Sigma$ and $T$ are decision trees, a decision tree equivalent to $\Sigma \star T$ can be computed in time polynomial in the size of $\Sigma$ and $T$.

decision tree, operator, rectification operator, (15 more...)

arXiv.org Artificial Intelligence

2206.08758

Country: Europe > France (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Federated XGBoost on Sample-Wise Non-IID Data

Jones, Katelinh, Ong, Yuya Jeremy, Zhou, Yi, Baracaldo, Nathalie

arXiv.org Artificial Intelligence, Sep-3-2022

Federated Learning (FL) is a paradigm for jointly training machine learning algorithms in a decentralized manner, which allows parties to communicate with an aggregator to create and train a model without exposing the underlying raw data distribution of the local parties involved in the training process. Most research in FL has focused on neural network-based approaches; however, tree-based methods such as XGBoost have been underexplored in Federated Learning due to the challenges in overcoming the iterative and additive characteristics of the algorithm. Decision tree-based models, in particular XGBoost, can handle non-IID data, which is significant for algorithms used in Federated Learning frameworks since the underlying characteristics of the data are decentralized and risk being non-IID by nature. In this paper, we investigate how Federated XGBoost is impacted by non-IID distributions by performing experiments on various sample size-based data skew scenarios and examining how these models perform under them. We conduct a set of extensive experiments across multiple datasets and data skew partitions. Our experimental results demonstrate that despite the various partition ratios, the performance of the models stayed consistent and was close to or equal to that of models trained in a centralized manner.

algorithm, dataset, federated learning, (14 more...)

arXiv.org Artificial Intelligence

2209.0134

Country:

North America > United States > California > Santa Clara County > San Jose (0.14)
North America > United States > Pennsylvania > Centre County > State College (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Efficient Learning of Interpretable Classification Rules

Ghosh, Bishwamittra (National University of Singapore) | Malioutov, Dmitry | Meel, Kuldeep S. (National University of Singapore)

Journal of Artificial Intelligence Research, Aug-30-2022

Machine learning has become omnipresent with applications in various safety-critical domains such as medicine, law, and transportation. In these domains, high-stake decisions provided by machine learning require researchers to design interpretable models, where the prediction is understandable to a human. In interpretable machine learning, rule-based classifiers are particularly effective in representing the decision boundary through a set of rules comprising input features. Examples of such classifiers include decision trees, decision lists, and decision sets. The interpretability of rule-based classifiers is in general related to the size of the rules, where smaller rules are considered more interpretable. To learn such a classifier, the brute-force direct approach is to consider an optimization problem that tries to learn the smallest classification rule that has close to maximum accuracy. This optimization problem is computationally intractable due to its combinatorial nature and thus does not scale to large datasets. To this end, in this paper we study the triangular relationship among the accuracy, interpretability, and scalability of learning rule-based classifiers. The contribution of this paper is an interpretable learning framework, IMLI, that is based on maximum satisfiability (MaxSAT) for synthesizing classification rules expressible in propositional logic. IMLI considers a joint objective function to optimize the accuracy and the interpretability of classification rules and learns an optimal rule by solving an appropriately designed MaxSAT query. Despite the progress of MaxSAT solving in the last decade, the straightforward MaxSAT-based solution cannot scale to practical classification datasets containing thousands to millions of samples. Therefore, we incorporate an efficient incremental learning technique inside the MaxSAT formulation by integrating mini-batch learning and iterative rule-learning. The resulting framework learns a classifier by iteratively covering the training data, wherein in each iteration, it solves a sequence of smaller MaxSAT queries corresponding to each mini-batch. In our experiments, IMLI achieves the best balance among prediction accuracy, interpretability, and scalability. For instance, IMLI attains a competitive prediction accuracy and interpretability w.r.t. existing interpretable classifiers and demonstrates impressive scalability on large datasets where both interpretable and non-interpretable classifiers fail. As an application, we deploy IMLI in learning popular interpretable classifiers such as decision lists and decision sets. The source code is available at https://github.com/meelgroup/mlic.

classifier, dataset, learning, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13482

AI Access Foundation

13482

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Wisconsin (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.67)
Health & Medicine > Diagnostic Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Algebraically Explainable Controllers: Decision Trees and Support Vector Machines Join Forces

Jüngermann, Florian, Křetínský, Jan, Weininger, Maximilian

arXiv.org Artificial Intelligence, Aug-29-2022

Recently, decision trees (DT) have been used as an explainable representation of controllers (a.k.a. strategies, policies, schedulers). Although they are often very efficient and produce small and understandable controllers for discrete systems, complex continuous dynamics still pose a challenge. In particular, when the relationships between variables take more complex forms, such as polynomials, they cannot be obtained using the available DT learning procedures. In contrast, support vector machines provide a more powerful representation, capable of discovering many such relationships, but not in an explainable form. Therefore, we suggest combining the two frameworks in order to obtain an understandable representation over richer, domain-relevant algebraic predicates. We demonstrate and evaluate the proposed method experimentally on established benchmarks.

controller, decision tree, predicate, (16 more...)

arXiv.org Artificial Intelligence

2208.12804

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(11 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Machine Learning Model Interpretability and Explainability

#artificialintelligence, Aug-28-2022, 16:58:58 GMT

ML/AI models are getting more complex and challenging to interpret and explain. A simple, easy-to-explain regression or decision tree model can no longer fully satisfy technical and business needs. More and more people use ensemble methods and deep neural networks to get better predictions and accuracy. However, those more complex models are hard to explain, debug, and understand. Thus, many people call these models black-box models.

explanation, interpretml, learning model interpretability and explainability, (11 more...)

#artificialintelligence

Industry: Banking & Finance (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
