Aleksandrova, Marharyta, Chertov, Oleg

In this paper, we propose an efficient algorithm for mining the novel 'Set of Contrasting Rules' pattern (SCR-pattern), which consists of several association rules. This pattern is of high interest due to the guaranteed quality of the rules forming it and its ability to discover useful knowledge. However, the SCR-pattern has no efficient mining algorithm. We propose the SCR-Apriori algorithm, which produces the same set of SCR-patterns as the state-of-the-art approach but is less computationally expensive. We also show experimentally that, by incorporating knowledge about the pattern structure into the Apriori algorithm, SCR-Apriori can significantly prune the search space of frequent itemsets to be analysed.

INTRODUCTION

Association rule learning is a popular technique in data mining [1]. However, it is known that finding rules of high quality is not always an easy task [2]. This issue is even more significant in domains where the reliability of the obtained knowledge must be high (for example, in medicine). Also, association rule mining techniques usually generate a huge number of rules that have to be analysed by a human in order to choose the meaningful and useful ones [3].
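This excerpt does not spell out SCR-Apriori itself, but the classical Apriori procedure it builds on, including the candidate pruning it refers to, can be sketched as follows. The function name and toy transactions are illustrative, not taken from the paper.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return frequent itemsets as {frozenset: support_count}."""
    n = len(transactions)
    # Count single items first.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c / n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        items = sorted({i for s in frequent for i in s})
        # Apriori pruning: keep a k-candidate only if every (k-1)-subset
        # is itself frequent.
        candidates = [frozenset(c) for c in combinations(items, k)
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))]
        counts = {c: sum(1 for t in transactions if c <= set(t))
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c / n >= min_support}
        result.update(frequent)
        k += 1
    return result
```

For example, with four market baskets and a minimum support of 0.5, the candidate {bread, butter, milk} is never counted if one of its pairs is already infrequent; this is the kind of pruning SCR-Apriori strengthens by also exploiting the SCR-pattern structure.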


The algorithms have been sorted into 9 groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensional Reduction, Ensemble, Neural Networks, Regression, Regularization. In this post, you'll find 101 machine learning algorithms, including useful cheat sheets to help you know when to use each one (if available). At Data Science Dojo, our mission is to make data science (machine learning in this case) available to everyone. Whether you join our data science bootcamp, read our blog, or watch our tutorials, we want everyone to have the opportunity to learn data science. Having said that, each accordion dropdown is embeddable if you want to take it with you.

Data mining enables users to analyse, classify and discover correlations among data. One of the crucial tasks of this process is association rule learning. Another important part of data mining is anomaly detection, the search for items or events that do not conform to a familiar pattern. Such non-conforming items are termed anomalies, and they often represent critical and actionable information in various application fields. This concept can be best understood with the supermarket example.

In my previous blog, MachineX: Why No One Uses an Apriori Algorithm for Association Rule Learning, we discussed one of the first algorithms in association rule learning, the Apriori algorithm. Despite being simple and clear, it has some weaknesses, as discussed in the above-mentioned blog. A significant improvement over the Apriori algorithm is the FP-Growth algorithm. To understand how the FP-Growth algorithm finds frequent items, we first have to understand the data structure it uses to do so, the FP-Tree, which will be the focus of this blog. To put it simply, an FP-Tree is a compressed representation of the input data.
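As a rough illustration of that compression, a minimal FP-Tree construction might look like the sketch below: each transaction is reordered by global item frequency and inserted as a path, so transactions sharing a prefix share nodes. Class and function names are my own, and the full FP-Growth mining step is omitted.

```python
from collections import Counter, defaultdict

class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.count = 1
        self.parent = parent
        self.children = {}   # item -> FPNode

def build_fp_tree(transactions, min_support_count):
    """Build an FP-Tree plus a header table (item -> list of its nodes)."""
    # Pass 1: global item frequencies; drop infrequent items.
    freq = Counter(item for t in transactions for item in t)
    freq = {i: c for i, c in freq.items() if c >= min_support_count}
    root = FPNode(None, None)
    header = defaultdict(list)
    # Pass 2: insert each transaction, most frequent items first,
    # so common prefixes collapse onto shared nodes.
    for t in transactions:
        items = sorted((i for i in t if i in freq),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            if item in node.children:
                node.children[item].count += 1
            else:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
    return root, header
```

With transactions [a,b], [b,c], [a,b,c] and [b], the item b is shared by a single node of count 4, which is exactly the compression the FP-Tree provides over re-scanning the raw data.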

Takadama, Keiki (The University of Electro-Communications)

This paper focuses on care support knowledge (especially sleep-related knowledge) and tackles its cognitive-bias and humanity aspects from a machine learning perspective, through a discussion of whether machine learning can correct commonly accepted knowledge and provide understandable knowledge in the care support domain. For this purpose, the paper starts by introducing our data mining method (based on association rule learning), which can provide only the necessary number of understandable pieces of knowledge without probabilities, even if its accuracy becomes slightly worse, and shows its effectiveness in care plan support systems for aged persons as one type of healthcare system. The experimental results indicate that (1) our method can extract a few simple, understandable pieces of knowledge that clarify what kinds of activities (e.g., rehabilitation, bathing) in a care house contribute to having a deep sleep, but (2) the Apriori algorithm, one of the major association rule learning methods, can hardly provide such knowledge because it needs to calculate all combinations of activities executed by aged persons.



In data mining and association rule learning, lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For example, suppose a population has an average response rate of 5%, but a certain model (or rule) has identified a segment with a response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%).
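The lift computation in the example above can be written directly against a set of transactions. The helper below is a minimal sketch (the function name and sample baskets are illustrative): confidence of the rule plays the role of the segment's response rate, and the consequent's support plays the role of the population-wide average.

```python
def lift(transactions, antecedent, consequent):
    """Lift of the rule antecedent -> consequent: the response rate within
    the targeted segment (rule confidence) divided by the average response
    rate of the whole population (support of the consequent)."""
    n = len(transactions)
    both = sum(1 for t in transactions if antecedent <= t and consequent <= t)
    ant = sum(1 for t in transactions if antecedent <= t)
    cons = sum(1 for t in transactions if consequent <= t)
    confidence = both / ant      # response rate inside the targeted segment
    avg_response = cons / n      # response rate across the population
    return confidence / avg_response
```

Reproducing the worked example: in a population of 20 where 1 in 20 responds overall (5%) but 1 of the 5 segment members responds (20%), the rule's lift is 0.20 / 0.05 = 4.0; a lift of exactly 1.0 means the antecedent and consequent occur independently.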