AITopics | Decision Tree Learning

Decision Tree Learning

Learning to Classify with Branching Tests: "A decision tree takes as input an object or situation described by a set of properties, and outputs a yes/no decision. Decision trees therefore represent Boolean functions. Functions with a larger range of outputs can also be represented...."
– Artificial Intelligence: A Modern Approach. By Stuart Russell & Peter Norvig. 2002. Section 18.3; page 531.
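To make the quoted idea concrete, here is a minimal sketch of a decision tree as a Boolean function. The tree, its attribute names (`patrons`, `hungry`), and the `decide` helper are hypothetical illustrations, not taken from the book.

```python
# Hypothetical tree: an internal node is (attribute, {value: subtree});
# a leaf is a plain bool giving the yes/no decision.
TREE = ("patrons", {
    "none": False,
    "some": True,
    "full": ("hungry", {True: True, False: False}),
})


def decide(tree, example):
    """Test one property per internal node until a yes/no leaf is reached."""
    while not isinstance(tree, bool):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree
```

Calling `decide(TREE, {"patrons": "full", "hungry": True})` follows the `full` branch and then the `hungry` test. Only the properties on the path actually taken need to be present in the example.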

RFCDE: Random Forests for Conditional Density Estimation

Pospisil, Taylor, Lee, Ann B.

arXiv.org Machine Learning, May-2-2018

Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or classification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde. Both R and Python versions, which call a common C++ library, are available.

artificial intelligence, density estimation, machine learning, (15 more...)

arXiv.org Machine Learning

1804.05753

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.96)

Decision Tree Design for Classification in Crowdsourcing Systems

Geng, Baocheng, Li, Qunwei, Varshney, Pramod K.

arXiv.org Machine Learning, May-1-2018

In recent work on classification in crowdsourcing systems, complex questions are often replaced by a set of simpler binary questions (microtasks) to enhance classification performance [1]-[4]. This is especially helpful in situations where crowd workers lack expertise for responding to complex questions directly. Each worker is given the entire set of questions in a batch mode and the workers provide their responses in the form of a vector. These binary questions can be posted as "microtasks" on crowdsourcing platforms like Amazon Mechanical Turk [5]. To improve classification performance in crowdsourcing systems, most of the works in the literature focus on enhancing the quality of individual tests, by designing fusion rules to combine decisions from heterogeneous workers [1]-[4], [6], [7], and by investigating the assignment of different tests to different workers depending upon their skill level [8], [9].

artificial intelligence, machine learning, social media, (18 more...)

arXiv.org Machine Learning

1805.00559

Country: North America (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Gradient Boosting vs Random Forest – Abolfazl Ravanshad – Medium

#artificialintelligence, Apr-28-2018, 09:30:57 GMT

In this post, I am going to compare two popular ensemble methods, Random Forests (RF) and Gradient Boosting Machine (GBM). Both GBM and RF are ensemble learning methods that predict (regression or classification) by combining the outputs from individual trees (we assume tree-based GBM, or GBT). They have all the strengths and weaknesses of the ensemble methods mentioned in my previous post. So, here we compare them only with respect to each other. GBM and RF differ in how the trees are built (the order) and in how the results are combined.
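The difference in build order and combination can be sketched in a few lines of plain Python. This is a simplified illustration, not the post's code: it assumes one-dimensional inputs and one-split regression trees (stumps), and the bagging variant omits the random feature selection a full random forest would use. All function names are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs by minimizing squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # every x is identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def bagged_forest(xs, ys, n_trees=25, seed=0):
    """RF-style: stumps are built independently on bootstrap samples, then averaged."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(s(x) for s in stumps) / len(stumps)


def gradient_boosting(xs, ys, n_trees=25, learning_rate=0.5):
    """GBM-style: stumps are built one after another on the residuals, then summed."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_trees):
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        residuals = [r - learning_rate * stump(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)
```

In the bagged version the trees could be trained in any order (or in parallel) because none depends on another. In the boosted version each tree depends on the errors left by all earlier trees, so the order matters.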

application, artificial intelligence, machine learning, (7 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)

Handling Missing Values using Decision Trees with Branch-Exclusive Splits

Beaulac, Cédric, Rosenthal, Jeffrey S.

arXiv.org Machine Learning, Apr-26-2018

In this article we propose a new decision tree construction algorithm. The proposed approach allows the algorithm to interact with some predictors that are only defined in subspaces of the feature space. One way to utilize this new tool is to create or use one of the predictors to keep track of missing values. This predictor can later be used to define the subspace where predictors with missing values are available for the data partitioning process. By doing so, this new classification tree can handle missing values for both modelling and prediction. The algorithm is tested against simulated and real data. The result is a classification procedure that efficiently handles missing values and produces results that are more accurate and more interpretable than most common procedures.
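The idea of using one predictor to track missing values, and testing the affected predictor only where it is defined, can be illustrated with a small sketch. This is not the authors' algorithm: the tree below is written by hand rather than learned, and the feature names (`income`, `age`), thresholds, and labels are hypothetical.

```python
def add_missing_indicator(rows, feature):
    """Return copies of rows with a boolean column flagging where `feature` is missing."""
    flag = feature + "_missing"
    return [{**row, flag: row.get(feature) is None} for row in rows]


def predict(row):
    """Hand-written tree: `income` is tested only in the branch where it is present."""
    if row["income_missing"]:
        # Subspace where income is undefined: split on another predictor instead.
        return "approve" if row["age"] >= 30 else "reject"
    # Subspace where income is defined: it is safe to split on it here.
    return "approve" if row["income"] >= 50000 else "reject"
```

Because the indicator is an ordinary predictor, the same tree can be applied at prediction time to rows with missing values, without imputing them first.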

artificial intelligence, machine learning, predictor, (17 more...)

arXiv.org Machine Learning

1804.10168

Country:

North America > United States > California (0.28)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (0.83)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Decision Tree Classification models to predict employee turnover

#artificialintelligence, Apr-21-2018, 22:36:33 GMT

In this project I have attempted to create supervised learning models to assist in classifying certain employee data. I pre-processed the data by removing one outlier and producing new features in Excel as the data set was small at 1056 rows. Some categorical features were also converted to numeric values in Excel. For example, Gender was originally "M" or "F", which was converted to 0 and 1 respectively. I also removed employee number as it provides no value as a feature and could compromise privacy.
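The author did this pre-processing in Excel. As a rough Python equivalent, the two steps described (encoding Gender and dropping the employee number) might look like the sketch below. The column name `employee_number` and the `preprocess` helper are assumptions made for illustration.

```python
# Mapping described in the text: "M" becomes 0 and "F" becomes 1.
GENDER_CODES = {"M": 0, "F": 1}


def preprocess(record):
    """Drop the identifier column and replace the Gender label with its numeric code."""
    cleaned = {k: v for k, v in record.items() if k != "employee_number"}
    cleaned["Gender"] = GENDER_CODES[cleaned["Gender"]]
    return cleaned
```

A record with an unexpected Gender value raises a `KeyError` here, which surfaces dirty data instead of silently encoding it.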

artificial intelligence, classifier, machine learning, (13 more...)

#artificialintelligence

Industry: Education (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.51)

MetaBags: Bagged Meta-Decision Trees for Regression

Khiari, Jihed, Moreira-Matias, Luis, Shaker, Ammar, Zenko, Bernard, Dzeroski, Saso

arXiv.org Machine Learning, Apr-17-2018

Ensembles are popular methods for solving practical supervised learning problems. They reduce the risk of having underperforming models in production-grade software. Although critical, methods for learning heterogeneous regression ensembles have not been proposed at large scale, whereas in classical ML literature, stacking, cascading and voting are mostly restricted to classification problems. Regression poses distinct learning challenges that may result in poor performance, even when using well established homogeneous ensemble schemas such as bagging or boosting. In this paper, we introduce MetaBags, a novel, practically useful stacking framework for regression. MetaBags is a meta-learning algorithm that learns a set of meta-decision trees designed to select one base model (i.e. expert) for each query, and focuses on inductive bias reduction. A set of meta-decision trees are learned using different types of meta-features, specially created for this purpose - to then be bagged at meta-level. This procedure is designed to learn a model with a fair bias-variance trade-off, and its improvement over base model performance is correlated with the prediction diversity of different experts on specific input space subregions. The proposed method and meta-features are designed in such a way that they enable good predictive performance even in subregions of space which are not adequately represented in the available training data. An exhaustive empirical testing of the method was performed, evaluating both generalization error and scalability of the approach on synthetic, open and real-world application datasets. The obtained results show that our method significantly outperforms existing state-of-the-art approaches.

artificial intelligence, machine learning, metabag, (16 more...)

arXiv.org Machine Learning

1804.06207

Country:

North America > United States (0.14)
Europe (0.14)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Transportation (0.67)
Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

LEARNING PATH: R: Machine Learning Algorithms with R

@machinelearnbot, Apr-16-2018, 14:05:17 GMT

Are you interested in exploring advanced algorithm concepts such as random forest, vector machine, K-nearest, and more through real-world examples? Packt's Video Learning Paths are a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it. Machine learning and data science are some of the top buzzwords in the technical world today. Machine learning, the application and science of algorithms that make sense of data, is the most exciting field of all the computer sciences! It explores the study and construction of algorithms that can learn from and make predictions on data.

algorithm, learning path, machine learning algorithm, (3 more...)

@machinelearnbot

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.42)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Data Mining with Rattle Udemy

@machinelearnbot, Apr-12-2018, 02:08:13 GMT

Data Mining with Rattle is a unique course that instructs with respect to both the concepts of data mining, as well as to the "hands-on" use of a popular, contemporary data mining software tool, "Data Miner," also known as the 'Rattle' package in R software. Rattle is a popular GUI-based software tool which 'fits on top of' R software. The course focuses on life-cycle issues, processes, and tasks related to supporting a 'cradle-to-grave' data mining project. These include: data exploration and visualization; testing data for random variable family characteristics and distributional assumptions; transforming data by scale or by data type; performing cluster analyses; creating, analyzing and interpreting association rules; and creating and evaluating predictive models that may utilize: regression; generalized linear modeling (GLMs); decision trees; recursive partitioning; random forests; boosting; and/or support vector machine (SVM) paradigms. It is both a conceptual and a practical course as it teaches and instructs about data mining, and provides ample demonstrations of conducting data mining tasks using the Rattle R package. The course is ideal for undergraduate students seeking to master additional 'in-demand' analytical job skills to offer a prospective employer.

data mining, rattle udemy, software tool, (2 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Materials > Metals & Mining (0.61)
Education > Educational Setting > Higher Education (0.41)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.61)

Asynchronous Parallel Sampling Gradient Boosting Decision Tree

Daning, Cheng, Fen, Xia, Shigang, Li, Yunquan, Zhang

arXiv.org Machine Learning, Apr-12-2018

With the development of big data technology, the Gradient Boosting Decision Tree (GBDT) has become one of the most important machine learning algorithms because of its accurate output. However, the training process of GBDT needs a lot of computational resources and time. To accelerate GBDT training, this paper proposes the asynchronous parallel sampling gradient boosting decision tree (asynch-SGBDT). By introducing sampling, we recast the numerical optimization in traditional GBDT training as a stochastic optimization process and use asynchronous parallel stochastic gradient descent to accelerate training. We also provide a theoretical analysis of asynch-SGBDT. Experimental results show that asynch-SGBDT accelerates the GBDT training process. Our asynchronous parallel strategy achieves an almost linear speedup, especially for high-dimensional sparse datasets.

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Machine Learning

1804.04659

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

Create Your Own Sophisticated Model with Neural Networks

@machinelearnbot, Apr-11-2018, 06:02:53 GMT

Scikit-learn has evolved into a robust library for Machine Learning applications in Python, with support for a wide range of Supervised and Unsupervised Learning Algorithms. With this course you will learn the Decision Tree algorithms and Ensemble Models to build Random Forest and Regression Analysis models. You will focus on Decision Trees and Ensemble Algorithms. Moving forward, you will learn to use scikit-learn for text classification and multiclass classification. You will explore various algorithms for classification.

algorithm, neural network, own sophisticated model, (2 more...)

@machinelearnbot

Country: North America > United States > Massachusetts (0.08)

Genre: Instructional Material > Course Syllabus & Notes (0.74)

Industry:

Banking & Finance > Trading (0.40)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.86)
