Association Learning

101 Machine Learning Algorithms for Data Science with Cheat Sheets


These 101 algorithms are equipped with cheat sheets, tutorials, and explanations. Think of this as the one-stop shop/dictionary/directory for machine learning algorithms. The algorithms have been sorted into 9 groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensional Reduction, Ensemble, Neural Networks, Regression, Regularization. In this post, you'll find 101 machine learning algorithms with useful Python tutorials, R tutorials, and cheat sheets from Microsoft Azure ML, SAS, and Scikit-Learn to help you know when to use each one (if available). At Data Science Dojo, our mission is to make data science (machine learning in this case) available to everyone.


AAAI Conferences

Reinforcement Learning has the limitation that problems become too large very quickly. Dividing the problem into a hierarchy of subtasks allows for a divide-and-conquer strategy, which is why Hierarchical Reinforcement Learning (HRL) algorithms often find solutions more quickly than more naive approaches. One of the biggest challenges with HRL is the construction of a hierarchy to be used by the algorithm. Hierarchies are often designed by a person using their own knowledge of the problem. We propose a method for automatically discovering task hierarchies based on a data mining technique, Association Rule Learning (ARL).


AAAI Conferences

Feedback on player experience and behaviour can be invaluable to game designers, but there is a need for specialised knowledge discovery tools to deal with high-volume playtest data. We describe a study with a commercial third-person shooter, in which integrated player activity and experience data was captured and mined for design-relevant knowledge. We demonstrate that association rule learning and rule templates can be used to extract meaningful rules relating player activity and experience during combat. We found that the number, type and quality of rules vary between experiences and are affected by feature distributions. Further work is required on rule selection and evaluation.

How is Machine Learning helpful?


There are specific use cases, like the spam filter, where traditional programming is hard. The real use of machine learning, however, lies in cognitive problems such as image recognition, speech processing, Natural Language Processing (NLP), and so on. These tasks are extremely data-driven and complex, and solving them using rules would be a nightmare. So, problems of growing complexity and problems driven by data are the key areas where machine learning can thrive. For example, we have NLP models that can write entire movie scripts, image processing models that can colorize old black-and-white images, and so on.

101 Machine Learning Algorithms for Data Science with Cheat Sheets


The algorithms have been sorted into 9 groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensional Reduction, Ensemble, Neural Networks, Regression, Regularization. In this post, you'll find 101 machine learning algorithms, including useful infographics to help you know when to use each one (if available). Each of the accordion drop-downs is embeddable if you want to take them with you. All you have to do is click the little 'embed' button in the lower left-hand corner and copy/paste the iframe. All we ask is that you link back to this post.

Identifying the Leading Factors of Significant Weight Gains Using a New Rule Discovery Method Artificial Intelligence

Overweight and obesity remain a major global public health concern, and identifying the individualized patterns that increase the risk of future weight gains has a crucial role in preventing obesity and numerous subsequent diseases associated with obesity. In this work, we use a rule discovery method to study this problem, by presenting an approach that offers genuine interpretability and concurrently optimizes the accuracy (being correct often) and support (applying to many samples) of the identified patterns. Specifically, we extend an established subgroup-discovery method to generate the desired rules of type X -> Y and show how top features can be extracted from the X side, functioning as the best predictors of Y. In our obesity problem, X refers to the extracted features from very large and multi-site EHR data, and Y indicates significant weight gains. Using our method, we also extensively compare the differences and inequities in patterns across 22 strata determined by the individual's gender, age, race, insurance type, neighborhood type, and income level. Through an extensive series of experiments, we show new and complementary findings regarding the predictors of future dangerous weight gains.
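To make the two quantities in the abstract above concrete, here is a minimal Python sketch of how the support and accuracy of a single rule X -> Y might be computed. It is not the paper's subgroup-discovery method. The function name, the record fields (`sedentary`, `weight_gain`), and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


def rule_support_and_accuracy(
    records: List[Record],
    antecedent: Callable[[Record], bool],
    consequent: Callable[[Record], bool],
) -> Tuple[float, float]:
    """Return (support, accuracy) of the rule `antecedent -> consequent`.

    support  = fraction of all records that the antecedent X applies to
    accuracy = fraction of those covered records for which Y also holds
    """
    covered = [r for r in records if antecedent(r)]
    if not covered:
        # The rule applies to nothing, so both quantities are reported as 0.
        return 0.0, 0.0
    support = len(covered) / len(records)
    accuracy = sum(1 for r in covered if consequent(r)) / len(covered)
    return support, accuracy


# Hypothetical toy records; real EHR features would be far richer.
records = [
    {"age": 34, "sedentary": True, "weight_gain": True},
    {"age": 51, "sedentary": True, "weight_gain": True},
    {"age": 29, "sedentary": False, "weight_gain": False},
    {"age": 45, "sedentary": True, "weight_gain": False},
    {"age": 62, "sedentary": False, "weight_gain": False},
]

support, accuracy = rule_support_and_accuracy(
    records,
    antecedent=lambda r: bool(r["sedentary"]),
    consequent=lambda r: bool(r["weight_gain"]),
)
print(f"support={support:.2f}, accuracy={accuracy:.2f}")
```

A rule with high accuracy but tiny support describes very few people, while a rule with large support but low accuracy is rarely right, which is why the abstract speaks of optimizing both at once.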

Darwin: Adaptive Rule Discovery for Labeling Text Data


There is consensus, especially in our current deep-learning era, that more training data almost always helps improve the performance of our deep learning models. But the process of collecting labeled data remains a costly and cumbersome task. Naturally, researchers started looking into this problem, which has led to the development of various techniques for reducing the labeling cost. Among these is a popular technique called weak supervision, in which a collection of heuristics and rules is used to label the data. Of course, the labels will be noisy, but these weak labels have proven to be valuable as long as the rules have a reasonable error rate.
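The idea of labeling data with a collection of heuristics can be sketched in a few lines of Python. This is only an illustration of weak supervision with a simple majority vote, not the Darwin system itself. The three rules, the label names, and the helper `weak_label` are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Optional

ABSTAIN = None  # a rule returns this when it has no opinion about a text

Rule = Callable[[str], Optional[str]]


def rule_refund(text: str) -> Optional[str]:
    # Hypothetical heuristic: mentions of refunds suggest a complaint.
    return "complaint" if "refund" in text.lower() else ABSTAIN


def rule_broken(text: str) -> Optional[str]:
    # Hypothetical heuristic: mentions of breakage suggest a complaint.
    return "complaint" if "broken" in text.lower() else ABSTAIN


def rule_thanks(text: str) -> Optional[str]:
    # Hypothetical heuristic: thanking suggests praise.
    return "praise" if "thank" in text.lower() else ABSTAIN


def weak_label(text: str, rules: List[Rule]) -> Optional[str]:
    """Combine the votes of all rules by majority; abstain if none fires."""
    votes = [v for v in (rule(text) for rule in rules) if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


rules = [rule_refund, rule_broken, rule_thanks]
texts = [
    "I want a refund, the item arrived broken.",
    "Thank you, great service!",
    "Where is my order?",
]
labels = [weak_label(t, rules) for t in texts]
print(labels)
```

The resulting labels are noisy by construction: each rule can misfire, and texts that no rule covers stay unlabeled. Practical weak-supervision systems usually replace the plain majority vote with a model that estimates how reliable each rule is.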

An Empirical Investigation into Deep and Shallow Rule Learning Artificial Intelligence

Inductive rule learning is arguably among the most traditional paradigms in machine learning. Although we have seen considerable progress over the years in learning rule-based theories, all state-of-the-art learners still learn descriptions that directly relate the input features to the target concept. In the simplest case, concept learning, this is a disjunctive normal form (DNF) description of the positive class. While it is clear that this is sufficient from a logical point of view because every logical expression can be reduced to an equivalent DNF expression, it could nevertheless be the case that more structured representations, which form deep theories by forming intermediate concepts, could be easier to learn, in very much the same way as deep neural networks are able to outperform shallow networks, even though the latter are also universal function approximators. In this paper, we empirically compare deep and shallow rule learning with a uniform general algorithm, which relies on greedy mini-batch based optimization. Our experiments on both artificial and real-world benchmark data indicate that deep rule networks outperform shallow networks.

An Investigation into Mini-Batch Rule Learning Artificial Intelligence

We investigate whether it is possible to learn rule sets efficiently in a network structure with a single hidden layer using iterative refinements over mini-batches of examples. A first rudimentary version shows an acceptable performance on all but one dataset, even though it does not yet reach the performance levels of Ripper.



Unsupervised learning is where only the input data is present and there is no corresponding output variable. Unsupervised learning has a lot of potential, with applications ranging from fraud detection to stock trading. Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior. Association: An association rule learning problem is where you want to discover rules that describe a large portion of your data. Association rule mining is used to identify new and interesting relationships between different objects in a set, or frequent patterns in transactional data or any sort of relational database.
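As a small illustration of the association case described above, the following Python sketch mines rules between pairs of items in a list of transactions, keeping only pairs that are frequent enough (minimum support) and rules that hold often enough (minimum confidence). It is a deliberately simplified sketch restricted to item pairs, not a full Apriori implementation. The function name, thresholds, and shopping-basket data are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence, Tuple

PairRule = Tuple[str, str, float, float]  # (if_item, then_item, support, confidence)


def mine_pair_rules(
    transactions: Sequence[Sequence[str]],
    min_support: float,
    min_confidence: float,
) -> List[PairRule]:
    """Find rules `x -> y` between single items that co-occur frequently."""
    n = len(transactions)
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules: List[PairRule] = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            # confidence: of the baskets containing x, how many also contain y
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical shopping baskets.
transactions = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "milk"],
    ["bread", "butter", "jam"],
]

rules = mine_pair_rules(transactions, min_support=0.6, min_confidence=0.75)
for x, y, support, confidence in rules:
    print(f"{x} -> {y}: support={support:.2f}, confidence={confidence:.2f}")
```

Raising `min_support` keeps only rules that describe a large portion of the data, while raising `min_confidence` keeps only rules that are usually correct when they apply.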