 Association Learning


Identifying the Leading Factors of Significant Weight Gains Using a New Rule Discovery Method

Samizadeh, Mina, Jones-Smith, Jessica C, Sheridan, Bethany, Beheshti, Rahmatollah

arXiv.org Artificial Intelligence

Overweight and obesity remain a major global public health concern, and identifying the individualized patterns that increase the risk of future weight gains plays a crucial role in preventing obesity and the numerous subsequent diseases associated with it. In this work, we use a rule discovery method to study this problem, presenting an approach that offers genuine interpretability and concurrently optimizes the accuracy (being correct often) and support (applying to many samples) of the identified patterns. Specifically, we extend an established subgroup-discovery method to generate the desired rules of type X -> Y and show how top features can be extracted from the X side, functioning as the best predictors of Y. In our obesity problem, X refers to the features extracted from very large, multi-site EHR data, and Y indicates significant weight gains. Using our method, we also extensively compare the differences and inequities in patterns across 22 strata determined by the individual's gender, age, race, insurance type, neighborhood type, and income level. Through an extensive series of experiments, we show new and complementary findings regarding the predictors of future dangerous weight gains.


UNSUPERVISED LEARNING

#artificialintelligence

Unsupervised learning is the setting in which only input data is available, with no corresponding output variable. Unsupervised learning has a lot of potential, with applications ranging from fraud detection to stock trading. Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior. Association: An association rule learning problem is where you want to discover rules that describe a large portion of your data. Association rule mining is used to identify new and interesting relationships between different objects in a set, or frequent patterns in transactional data or any sort of relational database.


Top 5 data mining technique in Machine Learning (ML)

#artificialintelligence

Data mining is a popular term used by machine learning developers. The technique refers to extracting meaningful information from massive datasets. For aspiring data scientists, it is important to be familiar with data mining techniques. Here are the top data mining techniques used by Data Science and Machine Learning experts. Association rule learning is a standard rule-based ML technique used to discover relationships between variables in datasets.


Association Rule Learning & APriori Algorithm

#artificialintelligence

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. Association rule mining first finds all sets of items (itemsets) that have support greater than the minimum support, and then uses these large itemsets to generate the desired rules that have confidence greater than the minimum confidence. The lift of a rule X -> Y is the ratio of the observed support of X and Y together to the support expected if X and Y were independent. A typical and widely used example of an association rule application is market basket analysis.
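The three measures named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy market-basket transactions and the helper names `support` and `rule_metrics` are hypothetical and do not come from the linked article.

```python
# Toy market-basket data (hypothetical, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(x, y, transactions):
    """Return (support, confidence, lift) for the rule X -> Y."""
    s_xy = support(set(x) | set(y), transactions)  # observed joint support
    s_x = support(x, transactions)
    s_y = support(y, transactions)
    confidence = s_xy / s_x          # estimate of P(Y | X)
    lift = s_xy / (s_x * s_y)        # observed support / support expected under independence
    return s_xy, confidence, lift


sup, conf, lift = rule_metrics({"diapers"}, {"beer"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 suggests that X and Y co-occur more often than independence would predict; a lift near 1 suggests the rule carries little information beyond the base rate of Y.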


Kristen Doute says she's learning about 'unconscious bias' after 'Vanderpump Rules' firing

FOX News

Kristen Doute said she is doing some soul-searching after being fired from "Vanderpump Rules" along with castmate Stassi Schroeder for past racially insensitive actions involving former Black cast member Faith Stowers. The 37-year-old spoke about how she's changing and growing as a person on the "Hollywood Raw" podcast with Dax Holt and Adam Glyn. "It was definitely none of my business to take anything to social media [and] essentially send a mob out to this person. It was really just not my place to go there," she said.


Discovering Hierarchies for Reinforcement Learning Using Data Mining

Mobley, Dave (University of Kentucky) | Goldsmith, Judy (University of Kentucky) | Harrison, Brent (University of Kentucky)

AAAI Conferences

Reinforcement Learning has the limitation that problems become too large very quickly. Dividing the problem into a hierarchy of subtasks allows for a divide-and-conquer strategy, which is why Hierarchical Reinforcement Learning (HRL) algorithms often find solutions faster than more naive approaches. One of the biggest challenges with HRL is the construction of a hierarchy to be used by the algorithm. Hierarchies are often designed by a person using their own knowledge of the problem. We propose a method for automatically discovering task hierarchies based on a data mining technique, Association Rule Learning (ARL). These hierarchies can then be applied to Semi-Markov Decision Process (SMDP) problems using the options technique.


Association Learning

#artificialintelligence

Association learning is a rule-based machine learning and data mining technique that finds important relations between variables or features in a data set. Unlike conventional association algorithms, which measure degrees of similarity, association rule learning identifies hidden correlations in databases by applying some measure of interestingness to generate an association rule for new searches.


SCR-Apriori for Mining `Sets of Contrasting Rules'

Aleksandrova, Marharyta, Chertov, Oleg

arXiv.org Machine Learning

In this paper, we propose an efficient algorithm for mining the novel 'Set of Contrasting Rules' pattern (SCR-pattern), which consists of several association rules. This pattern is of high interest due to the guaranteed quality of the rules forming it and its ability to discover useful knowledge. However, the SCR-pattern has no efficient mining algorithm. We propose the SCR-Apriori algorithm, which results in the same set of SCR-patterns as the state-of-the-art approach, but is less computationally expensive. We also show experimentally that by incorporating knowledge about the pattern structure into the Apriori algorithm, SCR-Apriori can significantly prune the search space of frequent itemsets to be analysed.

INTRODUCTION

Association rules learning is a popular technique in data mining [1]. However, it is known that finding rules of high quality is not always an easy task [2]. This issue is even more significant in domains where the reliability of the obtained knowledge is required to be high (for example, in medicine). Also, association rules mining techniques usually generate a huge number of rules that have to be analysed by a human in order to choose meaningful and useful ones [3].


Fast Dimensional Analysis for Root Cause Investigation in Large-Scale Service Environment

Lin, Fred, Muzumdar, Keyur, Laptev, Nikolay Pavlovich, Curelea, Mihai-Valentin, Lee, Seunghak, Sankar, Sriram

arXiv.org Machine Learning

Root cause analysis in a large-scale production environment is challenging due to the complexity of services running across global data centers. Due to the distributed nature of a large-scale system, the various hardware, software, and tooling logs are often maintained separately, making it difficult to review the logs jointly for detecting issues. Another challenge in reviewing the logs for identifying issues is the scale - there could easily be millions of entities, each with hundreds of features. In this paper we present a fast dimensional analysis framework that automates the root cause analysis on structured logs with improved scalability. We first explore item-sets, i.e. a group of feature values, that could identify groups of samples with sufficient support for the target failures using the Apriori algorithm and a subsequent improvement, FP-Growth. These algorithms were designed for frequent item-set mining and association rule learning over transactional databases. After applying them on structured logs, we select the item-sets that are most unique to the target failures based on lift. With the use of a large-scale real-time database, we propose pre- and post-processing techniques and parallelism to further speed up the analysis. We have successfully rolled out this approach for root cause investigation purposes in a large-scale infrastructure. We also present the setup and results from multiple production use-cases in this paper.
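The two-stage idea in this abstract (mine item-sets with sufficient support, then rank them by lift against the target failure) can be sketched as follows. This is a minimal illustration, not the authors' framework: the log rows, the `feature=value` item encoding, the function names `apriori` and `rank_by_lift`, and the threshold are all assumptions, and it uses a plain level-wise Apriori rather than FP-Growth.

```python
from itertools import combinations

# Hypothetical structured log rows: (set of feature=value items, failed?).
rows = [
    ({"dc=east", "kernel=5.4"}, True),
    ({"dc=east", "kernel=5.4"}, True),
    ({"dc=east", "kernel=4.9"}, False),
    ({"dc=west", "kernel=5.4"}, False),
    ({"dc=west", "kernel=4.9"}, False),
    ({"dc=west", "kernel=4.9"}, False),
]


def apriori(transactions, min_support):
    """Level-wise Apriori: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # size-1 candidates
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the support threshold.
        level = {}
        for cand in current:
            sup = sum(cand <= t for t in transactions) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


def rank_by_lift(rows, min_support):
    """Rank frequent itemsets by the lift of the rule itemset -> failure."""
    transactions = [items for items, _ in rows]
    base_rate = sum(failed for _, failed in rows) / len(rows)
    ranked = []
    for itemset in apriori(transactions, min_support):
        covered = [failed for items, failed in rows if itemset <= items]
        confidence = sum(covered) / len(covered)  # P(failure | itemset)
        ranked.append((confidence / base_rate, itemset))
    # Highest lift first; ties broken by item names for determinism.
    ranked.sort(key=lambda pair: (-pair[0], sorted(pair[1])))
    return ranked


for lift, itemset in rank_by_lift(rows, min_support=0.3):
    print(f"{lift:.2f}  {sorted(itemset)}")
```

In this toy data the combination of `dc=east` and `kernel=5.4` ranks first, because every row it covers is a failure, while each feature value on its own also covers a healthy row.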


RuDaS: Synthetic Datasets for Rule Learning and Evaluation Tools

Cornelio, Cristina, Thost, Veronika

arXiv.org Artificial Intelligence

Logical rules are a popular knowledge representation language in many domains, representing background knowledge and encoding information that can be derived from given facts in a compact form. However, rule formulation is a complex process that requires deep domain expertise, and is further challenged by today's often large, heterogeneous, and incomplete knowledge graphs. Several approaches for learning rules automatically, given a set of input example facts, have been proposed over time, including, more recently, neural systems. Yet, the area is missing adequate datasets and evaluation approaches: existing datasets often resemble toy examples that neither cover the various kinds of dependencies between rules nor allow for testing scalability. We present a tool for generating different kinds of datasets and for evaluating rule learning systems.