Association Learning


101 ML Algorithms

#artificialintelligence

In this post, you'll find 101 machine learning algorithms, including useful cheat sheets to help you decide when to use each one (where available). The algorithms are sorted into nine groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensionality Reduction, Ensemble, Neural Networks, Regression, and Regularization. At Data Science Dojo, our mission is to make data science (machine learning, in this case) accessible to everyone. Whether you join our data science bootcamp, read our blog, or watch our tutorials, we want everyone to have the opportunity to learn data science. With that in mind, each accordion dropdown is embeddable if you want to take it with you.


On the Trade-off Between Consistency and Coverage in Multi-label Rule Learning Heuristics

Rapp, Michael, Mencía, Eneldo Loza, Fürnkranz, Johannes

arXiv.org Machine Learning

Recently, several authors have advocated the use of rule learning algorithms to model multi-label data, as rules are interpretable and can be comprehended, analyzed, or qualitatively evaluated by domain experts. Many rule learning algorithms employ a heuristic-guided search for rules that model regularities contained in the training data, and it is commonly accepted that the choice of the heuristic has a significant impact on the predictive performance of the learner. Whereas the properties of rule learning heuristics have been studied in the realm of single-label classification, there is no such work taking into account the particularities of multi-label classification. This is surprising, as the quality of multi-label predictions is usually assessed in terms of a variety of different, potentially competing, performance measures that cannot all be optimized by a single learner at the same time. In this work, we show empirically that it is crucial to trade off the consistency and coverage of rules differently, depending on which multi-label measure should be optimized by a model. Based on these findings, we emphasize the need for configurable learners that can flexibly use different heuristics. As our experiments reveal, the choice of the heuristic is not straightforward, because a search for rules that optimize a measure locally does not usually result in a model that maximizes that measure globally.
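The consistency/coverage trade-off discussed in the abstract can be made concrete with two standard rule statistics. This is a minimal sketch with illustrative numbers (the function names and counts are not from the paper): consistency is the fraction of covered examples a rule gets right, while coverage is the fraction of all examples it covers.

```python
# Consistency vs. coverage of a single rule, with illustrative counts.
# tp/fp: true/false positives covered by the rule; total: dataset size.

def consistency(tp, fp):
    """Fraction of covered examples that are correct (i.e., precision)."""
    return tp / (tp + fp)

def coverage(tp, fp, total):
    """Fraction of all examples the rule covers."""
    return (tp + fp) / total

# A narrow but pure rule vs. a broad but noisier one: a heuristic must
# decide which of these to prefer, and the right answer depends on the
# multi-label measure being optimized.
print(consistency(9, 1), coverage(9, 1, 100))     # pure, low coverage
print(consistency(40, 20), coverage(40, 20, 100)) # noisier, high coverage
```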


Understanding Association Rule Learning & Its Role In Data Mining

#artificialintelligence

Data mining enables users to analyse, classify, and discover correlations among data. One of the crucial tasks of this process is association rule learning. Another important part of data mining is anomaly detection, the search for items or events that do not conform to a familiar pattern. Items that deviate from these familiar patterns are termed anomalies, and they often represent critical and actionable information in various application fields. Association rule learning is best understood with the supermarket example.
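The supermarket example can be sketched in a few lines: given market-basket transactions, association rule learning scores candidate rules such as "bread implies milk" by their support and confidence. The transactions and thresholds below are hypothetical, purely for illustration.

```python
# Hypothetical market-basket transactions for the supermarket example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """How often the consequent appears given that the antecedent does."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)

# Rule {bread} -> {milk}: how often does milk accompany bread?
print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # 2/3
```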


Interpretable preference learning: a game theoretic framework for large margin on-line feature and rule learning

Polato, Mirko, Aiolli, Fabio

arXiv.org Artificial Intelligence

A large body of research is currently investigating the connection between machine learning and game theory. In this work, game theory notions are injected into a preference learning framework. Specifically, a preference learning problem is seen as a two-player zero-sum game. An algorithm is proposed to incrementally include new useful features into the hypothesis. This can be particularly important when dealing with a very large number of potential features, as, for instance, in relational learning and rule extraction. A game theoretical analysis is used to demonstrate the convergence of the algorithm. Furthermore, leveraging the natural analogy between features and rules, the resulting models can be easily interpreted by humans. An extensive set of experiments on classification tasks shows the effectiveness of the proposed method in terms of interpretability and feature selection quality, with accuracy at the state of the art.


GuideR: a guided separate-and-conquer rule learning in classification, regression, and survival settings

Sikora, Marek, Wróbel, Łukasz, Gudyś, Adam

arXiv.org Machine Learning

This article presents GuideR, a user-guided rule induction algorithm that overcomes the largest limitation of existing methods: the inability to introduce the user's preferences or domain knowledge into the rule learning process. Automatic selection of attributes and attribute ranges often leads to rules that do not contain interesting information. We propose an induction algorithm that takes the user's requirements into account. Our method uses the sequential covering approach and is suitable for classification, regression, and survival analysis problems. The effectiveness of the algorithm in all these tasks has been verified experimentally, confirming guided rule induction to be a powerful data analysis tool. Introduction: Sequential covering rule induction algorithms can be used for both predictive and descriptive purposes [1, 2, 3, 4]. In spite of the development of increasingly sophisticated versions of these algorithms [5, 6], the main principle remains unchanged and involves two phases: rule growing, in which elementary conditions are iteratively added to a rule, and rule pruning, in which some of these conditions are removed. In comparison to other machine learning methods, rule sets obtained by sequential covering algorithms, also known as the separate-and-conquer strategy (SnC), are characterized by good predictive as well as descriptive capabilities. Considering only the former, superior results can often be obtained using other methods; however, data models obtained this way are much less comprehensible than rule sets.
In the case of rule learning for descriptive purposes, algorithms for association rule induction [12, 13, 14] or subgroup discovery [15, 6] are applied. The former leads to a very large number of rules, which must then be limited by filtering according to rule interestingness measures [16, 17, 18], while rule sets obtained by subgroup discovery are characterized by worse predictive abilities than those generated by the standard sequential covering approach. Therefore, if creating a prediction system with a comprehensible data model is the main objective, sequential covering rule induction algorithms provide the most sensible solution.
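The separate-and-conquer principle described above can be sketched compactly. This is a minimal illustration on boolean features with a greedy, precision-guided growing phase, not the actual GuideR algorithm (which adds user guidance, pruning, and support for regression and survival data); all names and the toy dataset are invented for the example.

```python
# Minimal separate-and-conquer (sequential covering) sketch.
# Examples are dicts {"x": [bool features], "y": 0/1}; a rule is the list
# of feature indices that must be true.

def precision(examples):
    """Fraction of positive examples among those covered."""
    return sum(e["y"] for e in examples) / len(examples)

def grow_rule(examples, n_features):
    """Growing phase: greedily add conditions while precision improves."""
    rule, covered = [], examples
    improved = True
    while improved:
        improved = False
        for f in range(n_features):
            if f in rule:
                continue
            sub = [e for e in covered if e["x"][f]]
            if sub and precision(sub) > precision(covered):
                rule, covered, improved = rule + [f], sub, True
                break
    return rule, covered

def sequential_covering(examples, n_features):
    """Learn rules one at a time, removing the covered examples each round."""
    rules, remaining = [], list(examples)
    while any(e["y"] for e in remaining):
        rule, covered = grow_rule(remaining, n_features)
        if not rule:  # no condition improves precision: stop
            break
        rules.append(rule)
        remaining = [e for e in remaining if e not in covered]
    return rules

data = [
    {"x": [1, 0, 1], "y": 1},
    {"x": [1, 1, 0], "y": 1},
    {"x": [0, 1, 1], "y": 0},
    {"x": [0, 0, 0], "y": 0},
]
print(sequential_covering(data, 3))  # [[0]]: feature 0 separates positives
```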


MachineX: Understanding FP-Tree Construction - DZone AI

#artificialintelligence

In my previous blog, MachineX: Why No One Uses an Apriori Algorithm for Association Rule Learning, we discussed one of the first algorithms in association rule learning, the Apriori algorithm. Despite being simple and clear, it has some weaknesses, as discussed in that blog. A significant improvement over the Apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm finds frequent items, we first have to understand the data structure it uses to do so, the FP-Tree, which will be the focus of this blog. To put it simply, an FP-Tree is a compressed representation of the input data.
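The compression idea can be sketched directly: items within each transaction are reordered by global frequency so that transactions share prefixes, and shared prefixes are merged into one path with a count. This is a simplified illustration (it omits the header table and node links that full FP-Growth uses for mining); the class and transactions are invented for the example.

```python
from collections import defaultdict

class Node:
    """One FP-Tree node: an item, its count, and its children by item."""
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

def build_fp_tree(transactions, min_support=1):
    # Pass 1: count global item frequencies.
    freq = defaultdict(int)
    for t in transactions:
        for item in t:
            freq[item] += 1
    # Pass 2: insert each transaction, frequent items first, merging prefixes.
    root = Node(None)
    for t in transactions:
        items = sorted((i for i in t if freq[i] >= min_support),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root

tree = build_fp_tree([["a", "b"], ["a", "c"], ["a", "b", "d"]])
# All three transactions share the 'a' prefix, stored once with count 3.
print(tree.children["a"].count)  # 3
```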


Can Machine Learning Correct Commonly Accepted Knowledge and Provide Understandable Knowledge in Care Support Domain? Tackling Cognitive Bias and Humanity from Machine Learning Perspective

Takadama, Keiki (The University of Electro-Communications)

AAAI Conferences

This paper focuses on care support knowledge (in particular, sleep-related knowledge) and tackles its cognitive bias and humanity aspects from a machine learning perspective, through a discussion of whether machine learning can correct commonly accepted knowledge and provide understandable knowledge in the care support domain. For this purpose, the paper introduces our data mining method (based on association rule learning), which provides only the necessary number of understandable rules, without probabilities, even if its accuracy becomes slightly worse, and shows its effectiveness in a care plan support system for aged persons as an example of a healthcare system. The experimental results indicate that (1) our method can extract a few simple rules as understandable knowledge, clarifying what kinds of activities (e.g., rehabilitation, bathing) in a care house contribute to a deep sleep, but (2) the Apriori algorithm, one of the major association rule learning methods, can hardly provide such knowledge because it must calculate all combinations of activities executed by aged persons.


More Data Mining with R Udemy

@machinelearnbot

In data mining and association rule learning, lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For example, suppose a population has an average response rate of 5%, but a certain model (or rule) has identified a segment with a response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%).
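The worked example in the paragraph above translates directly into code: lift is the ratio of the segment's response rate to the population's average response rate.

```python
def lift(target_response_rate, average_response_rate):
    """Lift = response rate within the targeted segment / population average."""
    return target_response_rate / average_response_rate

# The text's example: 5% average response, 20% within the identified segment.
print(lift(0.20, 0.05))  # 4.0
```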



GPU-Accelerated Parameter Optimization for Classification Rule Learning

Harris, Greg (University of Southern California) | Panangadan, Anand (California State University, Fullerton) | Prasanna, Viktor K. (University of Southern California)

AAAI Conferences

While some studies comparing rule-based classifiers enumerate a single parameter over several values, most use all default values, presumably due to the high computational cost of jointly tuning multiple parameters. We show that thorough, joint optimization of search parameters on individual datasets gives higher out-of-sample precision than fixed baselines. We test on 1,000 relatively large synthetic datasets with widely varying properties. We optimize heuristic beam search with the m-estimate interestingness measure. We jointly tune m, the beam size, and the maximum rule length. The beam size controls the extent of search, where over-searching can find spurious rules. m controls the bias toward higher-frequency rules, with the optimal value depending on the amount of noise in the dataset. We assert that such hyperparameters affecting the frequency bias and extent of search should be optimized simultaneously, since both directly affect the false-discovery rate. While our method based on grid search and cross-validation is computationally intensive, we show that it can be massively parallelized, with our GPU implementation providing up to 28x speedup over a comparable multi-threaded CPU implementation.
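The m-estimate the abstract tunes can be written out in a few lines. This is a sketch of the standard m-estimate formula with illustrative counts (not the paper's GPU implementation): p and n are the positive and negative examples a candidate rule covers, P and N the totals, and m smooths the rule's precision toward the base rate P/(P+N), biasing search toward higher-frequency rules as m grows.

```python
def m_estimate(p, n, P, N, m):
    """m-estimate: precision smoothed toward the base rate P/(P+N)."""
    return (p + m * P / (P + N)) / (p + n + m)

# A rule covering 8 positives and 2 negatives, on a balanced 50/50 dataset.
print(m_estimate(8, 2, 50, 50, 0))  # 0.8  -- m = 0 is raw precision
print(m_estimate(8, 2, 50, 50, 2))  # 0.75 -- larger m shrinks toward 0.5
```

With m = 0 the heuristic rewards pure but possibly tiny (spurious) rules; a well-chosen m penalizes low-coverage rules, which is why the paper argues m should be tuned jointly with the beam size that controls the extent of search.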