AITopics

1909.05032

Country: Europe (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.35)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)

Gupta, Prashant, Jindal, Aashi, Jayadeva, Sengupta, Debarka

Guided Random Forest and its application to data approximation

arXiv.org Machine Learning Sep-2-2019

We present a new way of constructing an ensemble classifier, named the Guided Random Forest (GRAF) in the sequel. GRAF extends the idea of building oblique decision trees with localized partitioning to obtain a global partitioning. We show that global partitioning bridges the gap between decision trees and boosting algorithms. We empirically demonstrate that global partitioning reduces the generalization error bound. Results on 115 benchmark datasets show that GRAF yields comparable or better results on a majority of datasets. We also present a new way of approximating the datasets in the framework of random forests.

artificial intelligence, machine learning, partition, (16 more...)

1909.00659

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

#artificialintelligence Sep-1-2019, 08:41:24 GMT

Explaining Predictions: Random Forest Post-hoc Analysis (randomForestExplainer package)

We can further evaluate the variable interactions by plotting the probability of a prediction against the variables making up the interaction. The interaction of these two variables is the most frequent interaction, as seen in plot_min_depth_interactions. We plot the forest prediction against the interacting variables with plot_predict_interaction. However, an error occurs when the input supplied is a model created with parsnip; no error occurs when the model is created directly with the randomForest package.

artificial intelligence, decision tree learning, machine learning, (9 more...)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)

Dockhorn, Alexander, Lucas, Simon M., Volz, Vanessa, Bravi, Ivan, Gaina, Raluca D., Perez-Liebana, Diego

Learning Local Forward Models on Unforgiving Games

arXiv.org Artificial Intelligence Sep-1-2019

This paper examines learning approaches for forward models based on local cell transition functions. We provide a formal definition of local forward models for which we propose two basic learning approaches. Our analysis is based on the game Sokoban, where a wrong action can lead to an unsolvable game state. Therefore, an accurate prediction of an action's resulting state is necessary to avoid this scenario. In contrast to learning the complete state transition function, local forward models allow extracting multiple training examples from a single state transition. In this way, the Hash Set model, as well as the Decision Tree model, quickly learn to predict upcoming state transitions of both the training and the test set. Applying the model using a statistical forward planner showed that the best models can be used to a satisfying degree even in cases in which the test levels have not yet been seen. Our evaluation includes an analysis of various local neighbourhood patterns and sizes to test the learners' capabilities in case too few or too many attributes are extracted, of which the latter has been shown to degrade the performance of the model learner.
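The idea of extracting one training example per cell from a single state transition can be sketched as follows. This is only an illustration, not the authors' implementation: the cross-shaped radius-2 neighbourhood, the character encoding of the level, the rule that out-of-bounds cells count as walls, the fallback of leaving unseen patterns unchanged, and every name in the code (`OFFSETS`, `local_pattern`, `extract_examples`, `HashForwardModel`) are assumptions made for the example. The lookup table is loosely in the spirit of the Hash Set model named in the abstract.

```python
from typing import Dict, List, Tuple

Grid = List[str]

# Cross-shaped neighbourhood of radius 2 (an illustrative choice, not from the paper).
OFFSETS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1),
           (-2, 0), (2, 0), (0, -2), (0, 2)]


def local_pattern(grid: Grid, row: int, col: int) -> str:
    """Read the neighbourhood around (row, col); cells outside the grid count as walls."""
    cells = []
    for dr, dc in OFFSETS:
        r, c = row + dr, col + dc
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[r])
        cells.append(grid[r][c] if inside else "#")
    return "".join(cells)


def extract_examples(state: Grid, action: str, next_state: Grid):
    """One (pattern, action) -> next cell value example per cell of the grid."""
    return [((local_pattern(state, r, c), action), next_state[r][c])
            for r in range(len(state)) for c in range(len(state[r]))]


class HashForwardModel:
    """Hypothetical lookup-table learner over local patterns."""

    def __init__(self) -> None:
        self.table: Dict[Tuple[str, str], str] = {}

    def fit(self, transitions) -> None:
        for state, action, next_state in transitions:
            for key, value in extract_examples(state, action, next_state):
                self.table[key] = value

    def predict(self, state: Grid, action: str) -> Grid:
        # Unseen patterns fall back to "cell stays unchanged" (an assumption).
        return ["".join(self.table.get((local_pattern(state, r, c), action), state[r][c])
                        for c in range(len(state[r])))
                for r in range(len(state))]


# Toy level: '@' player, '$' box, '.' goal, '*' box on goal, '#' wall.
train_state = ["#####", "#@$.#", "#####"]
train_next = ["#####", "# @*#", "#####"]
examples = extract_examples(train_state, "right", train_next)

model = HashForwardModel()
model.fit([(train_state, "right", train_next)])

# A level not seen in training whose local patterns match the training ones.
shifted_state = ["######", "##@$.#", "######"]
shifted_prediction = model.predict(shifted_state, "right")
```

In this toy setup a single transition on a 3x5 level yields 15 training examples, and the learned table also predicts the push correctly on the shifted level because every relevant local pattern was already observed. Widening the neighbourhood makes such matches rarer, which is one way to picture why extracting too many attributes can hurt.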

artificial intelligence, decision tree learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1909.00442

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Wong, Hallee E., Heggeseth, Brianna C., Miller, Steven J.

Categorical Co-Frequency Analysis: Clustering Diagnosis Codes to Predict Hospital Readmissions

arXiv.org Machine Learning Aug-31-2019

Accurately predicting patients' risk of 30-day hospital readmission would enable hospitals to efficiently allocate resource-intensive interventions. We develop a new method, Categorical Co-Frequency Analysis (CoFA), for clustering diagnosis codes from the International Classification of Diseases (ICD) according to the similarity in relationships between covariates and readmission risk. CoFA measures the similarity between diagnoses by the frequency with which two diagnoses are split in the same direction versus split apart in random forests to predict readmission risk. Applying CoFA to de-identified data from Berkshire Medical Center, we identified three groups of diagnoses that vary in readmission risk. To evaluate CoFA, we compared readmission risk models using ICD majors and CoFA groups to a baseline model without diagnosis variables. We found substituting ICD majors for the CoFA-identified clusters simplified the model without compromising the accuracy of predictions. Fitting separate models for each ICD major and CoFA group did not improve predictions, suggesting that readmission risk may be more homogeneous than heterogeneous across diagnosis groups.
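The co-frequency idea, counting how often two diagnoses go the same way at a split versus being split apart, can be sketched roughly as below. The abstract does not give the exact formula, so the ratio used here (same-direction count divided by total co-occurrences) is an assumption, as are the function name `co_frequency`, the representation of a split as a pair of code sets, and the placeholder codes "A", "B" and "C".

```python
from collections import Counter
from itertools import combinations


def co_frequency(splits):
    """Share of splits in which each pair of codes is sent in the same direction.

    `splits` is a list of (left_codes, right_codes) pairs, one per tree node
    that splits on the categorical diagnosis variable (hypothetical format).
    """
    same = Counter()
    apart = Counter()
    for left, right in splits:
        for a, b in combinations(sorted(left | right), 2):
            if (a in left) == (b in left):
                same[(a, b)] += 1
            else:
                apart[(a, b)] += 1
    pairs = set(same) | set(apart)
    return {pair: same[pair] / (same[pair] + apart[pair]) for pair in pairs}


# Hand-made splits standing in for nodes collected from a fitted forest.
toy_splits = [
    ({"A", "B"}, {"C"}),
    ({"A"}, {"B", "C"}),
    ({"A", "B"}, {"C"}),
    ({"C"}, {"A", "B"}),
]
similarity = co_frequency(toy_splits)
```

Here "A" and "B" travel together in three of four splits (similarity 0.75), while "A" and "C" are always separated (0.0). A matrix of such values could then be handed to any standard clustering routine to form diagnosis groups.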

artificial intelligence, diagnosis, machine learning, (18 more...)

1909.00306

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Health Care Providers & Services > Reimbursement (1.00)
Government > Regional Government > North America Government > United States Government (0.95)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

#artificialintelligence Aug-30-2019, 07:12:25 GMT

Churn prediction

Customer churn, also known as customer attrition, occurs when customers stop doing business with a company. Companies are interested in identifying segments of these customers because acquiring a new customer usually costs more than retaining an existing one. For example, if Netflix knew a segment of customers who were at risk of churning, they could proactively engage them with special offers instead of simply losing them. In this post, we will create a simple customer churn prediction model using the Telco Customer Churn dataset. We chose a decision tree to model churned customers, pandas for data crunching and matplotlib for visualizations.
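To give a feel for what a decision tree does with churn data, the sketch below picks the single best split by weighted Gini impurity. It is not the post's code: it uses plain Python instead of pandas so that it stays self-contained, the six rows are made up for illustration and are not taken from the Telco Customer Churn dataset, and the names `gini`, `best_split`, `tenure_months` and `monthly_charges` are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = churned, 0 = stayed)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels, feature_names):
    """Return (weighted_gini, feature, threshold) for the best single split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best


# Made-up customers: (tenure in months, monthly charges).
features = ["tenure_months", "monthly_charges"]
rows = [(1, 70), (2, 30), (3, 90), (24, 80), (36, 40), (48, 95)]
churned = [1, 1, 1, 0, 0, 0]

root_split = best_split(rows, churned, features)
```

On this toy data the root split lands on `tenure_months` at 13.5 with zero impurity, because all short-tenure customers churned. A full tree repeats the same search recursively on each side of the split.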

artificial intelligence, customer, machine learning, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.55)

#artificialintelligence Aug-28-2019, 07:05:49 GMT

VariantSpark, A Random Forest Machine Learning Implementation for Ultra High Dimensional Data

The demands on machine learning methods to cater for ultra high dimensional datasets, datasets with millions of features, have been increasing in domains like life sciences and the Internet of Things (IoT). While Random Forests are suitable for "wide" datasets, current implementations such as Google's PLANET lack the ability to scale to such dimensions. Recent improvements by Yggdrasil begin to address these limitations but do not extend to Random Forest. This paper introduces CursedForest, a novel Random Forest implementation on top of Apache Spark and part of the VariantSpark platform, which parallelises processing of all nodes over the entire forest. CursedForest is 9 and up to 89 times faster than Google's PLANET and Yggdrasil, respectively, and is the first method capable of scaling to millions of features.

random forest machine learning implementation, ultra high dimensional data, variantspark, (4 more...)

Industry:

Materials > Paper & Forest Products > Forest Products (0.40)
Machinery > Agricultural & Farm Machinery (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Qafari, Mahnaz Sadat, van der Aalst, Wil

Fairness-Aware Process Mining

arXiv.org Artificial Intelligence Aug-28-2019

Process mining is a multi-purpose tool enabling organizations to improve their processes. One of the primary purposes of process mining is finding the root causes of performance or compliance problems in processes. The usual way of doing so is by gathering data from the process event log and other sources and then applying some data mining and machine learning techniques. However, the results of applying such techniques are not always acceptable. In many situations, this approach is prone to making obvious or unfair diagnoses, and applying them may result in conclusions that are unsurprising or even discriminatory (e.g., blaming overloaded employees for delays). In this paper, we present a solution to this problem by creating a fair classifier for such situations. The undesired effects are removed at the expense of a reduction in the accuracy of the resulting classifier. We have implemented this method as a plug-in in ProM. Using the implemented plug-in on two real event logs, we decreased the discrimination caused by the classifier, while losing a small fraction of its accuracy.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1908.11451

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.36)

#artificialintelligence Aug-26-2019, 01:35:51 GMT

Comparing Decision Tree Algorithms: Random Forest vs. XGBoost

This tutorial walks you through a comparison of XGBoost and Random Forest, two popular decision tree algorithms, and helps you identify the best use cases for ensemble techniques like bagging and boosting. Understanding the benefits of bagging and boosting, and knowing when to use which technique, will lead to less variance, lower bias, and more stability in your machine learning models.

decision tree algorithm, decision tree learning, machine learning, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Sarker, Iqbal H., Salah, Khaled

AppsPred: Predicting Context-Aware Smartphone Apps using Random Forest Learning

arXiv.org Machine Learning Aug-26-2019

Due to the popularity of context-awareness in the Internet of Things (IoT) and the recent advanced features in the most popular IoT device, i.e., the smartphone, modeling and predicting personalized usage behavior based on relevant contexts can be highly useful in assisting users to carry out daily routines and activities. Usage patterns of different categories of smartphone apps, such as social networking, communication, entertainment, or daily life services related apps, usually vary greatly between individuals. People use these apps differently in different contexts, such as temporal context, spatial context, individual mood and preference, work status, Internet connectivity like Wi-Fi status, or device related status like phone profile, battery level, etc. Thus, we consider individuals' apps usage as a multi-class context-aware problem for personalized modeling and prediction. Random Forest learning is one of the most popular machine learning techniques to build a multi-class prediction model. Therefore, in this paper, we present an effective context-aware smartphone apps prediction model, and name it "AppsPred", using a random forest machine learning technique that takes into account the optimal number of trees based on such multi-dimensional contexts to build the resultant forest. The effectiveness of this model is examined by conducting experiments on smartphone apps usage datasets collected from individual users. The experimental results show that our AppsPred significantly outperforms other popular machine learning classification approaches like ZeroR, Naive Bayes, Decision Tree, Support Vector Machines, and Logistic Regression when predicting smartphone apps in various context-aware test cases.

artificial intelligence, decision tree, machine learning, (18 more...)

1909.12949

Country:

Europe (1.00)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Telecommunications (1.00)
Information Technology > Software (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)