 Association Learning


Causal Rule Learning: Enhancing the Understanding of Heterogeneous Treatment Effect via Weighted Causal Rules

Wu, Ying, Liu, Hanzhong, Ren, Kai, Chang, Xiangyu

arXiv.org Machine Learning

Interpretability is a key concern in estimating heterogeneous treatment effects using machine learning methods, especially for healthcare applications where high-stakes decisions are often made. Inspired by the Predictive, Descriptive, Relevant framework of interpretability, we propose causal rule learning, which finds a refined set of causal rules characterizing potential subgroups to estimate and enhance our understanding of heterogeneous treatment effects. Causal rule learning involves three phases: rule discovery, rule selection, and rule analysis. In the rule discovery phase, we utilize a causal forest to generate a pool of causal rules with corresponding subgroup average treatment effects. The selection phase then employs a D-learning method to select a subset of these rules to deconstruct individual-level treatment effects as a linear combination of the subgroup-level effects. This helps to answer a question ignored by previous literature: what if an individual simultaneously belongs to multiple groups with different average treatment effects? The rule analysis phase outlines a detailed procedure to further analyze each rule in the subset from multiple perspectives, revealing the most promising rules for further validation. The rules themselves, their corresponding subgroup treatment effects, and their weights in the linear combination give us more insight into heterogeneous treatment effects. Simulation and real-world data analysis demonstrate the superior performance of causal rule learning on the interpretable estimation of heterogeneous treatment effects when the ground truth is complex and the sample size is sufficient.
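To make the "linear combination of subgroup-level effects" concrete, here is a minimal sketch. It is not the authors' D-learning procedure: the rules, subgroup effects, and weights are invented for illustration, and the assumption that only the rules an individual satisfies contribute to the sum is ours, not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CausalRule:
    """Hypothetical container for one selected rule (names are illustrative)."""
    name: str
    condition: Callable[[Dict[str, float]], bool]  # does the individual match?
    subgroup_ate: float                            # subgroup average treatment effect
    weight: float                                  # weight in the linear combination


def individual_effect(x: Dict[str, float], rules: List[CausalRule]) -> float:
    """Sum weight * subgroup effect over every rule that x satisfies.

    Assumption: rules that do not cover x contribute nothing.
    """
    return sum(r.weight * r.subgroup_ate for r in rules if r.condition(x))


# Made-up rules and numbers, purely for illustration.
rules = [
    CausalRule("age > 60", lambda x: x["age"] > 60, subgroup_ate=2.0, weight=0.5),
    CausalRule("bmi > 30", lambda x: x["bmi"] > 30, subgroup_ate=-1.0, weight=0.25),
]

patient = {"age": 70, "bmi": 32}  # belongs to both subgroups at once
effect = individual_effect(patient, rules)
print(effect)
```

An individual who falls into both subgroups receives a blend of the two subgroup effects rather than either one alone, which is the situation the abstract's question is about.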


Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning

Han, Chi, He, Qizheng, Yu, Charles, Du, Xinya, Tong, Hanghang, Ji, Heng

arXiv.org Artificial Intelligence

Probabilistic logical rule learning has shown great strength in logical rule mining and knowledge graph completion. It learns logical rules to predict missing edges by reasoning on existing edges in the knowledge graph. However, previous efforts have largely been limited to only modeling chain-like Horn clauses such as $R_1(x,z)\land R_2(z,y)\Rightarrow H(x,y)$. This formulation overlooks additional contextual information from neighboring sub-graphs of entity variables $x$, $y$ and $z$. Intuitively, there is a large gap here, as local sub-graphs have been found to provide important information for knowledge graph completion. Inspired by these observations, we propose Logical Entity RePresentation (LERP) to encode contextual information of entities in the knowledge graph. A LERP is designed as a vector of probabilistic logical functions on the entity's neighboring sub-graph. It is an interpretable representation while allowing for differentiable optimization. We can then incorporate LERP into probabilistic logical rule learning to learn more expressive rules. Empirical results demonstrate that with LERP, our model outperforms other rule learning methods in knowledge graph completion and is comparable or even superior to state-of-the-art black-box methods. Moreover, we find that our model can discover a more expressive family of logical rules. LERP can also be further combined with embedding learning methods like TransE to make it more interpretable.
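As a rough illustration of the chain-like Horn clause $R_1(x,z)\land R_2(z,y)\Rightarrow H(x,y)$ mentioned above, the following sketch applies one such rule to a toy set of triples. It is deterministic and does not model LERP or any probabilistic scoring; the function name, relation names, and entities are all hypothetical.

```python
def apply_chain_rule(triples, r1, r2, head):
    """Infer head(x, y) whenever r1(x, z) and r2(z, y) both hold.

    triples is a set of (subject, relation, object) tuples.
    Only facts not already present in triples are returned.
    """
    # Index the objects reachable through r2, keyed by subject.
    r2_objects = {}
    for s, r, o in triples:
        if r == r2:
            r2_objects.setdefault(s, set()).add(o)

    inferred = set()
    for x, r, z in triples:
        if r == r1:
            for y in r2_objects.get(z, ()):
                fact = (x, head, y)
                if fact not in triples:
                    inferred.add(fact)
    return inferred


# Toy knowledge graph (illustrative only).
kg = {
    ("alice", "born_in", "paris"),
    ("paris", "city_of", "france"),
    ("bob", "born_in", "berlin"),
}

new_facts = apply_chain_rule(kg, "born_in", "city_of", "nationality")
print(new_facts)
```

The rule only looks at the two edges on the path from x to y. Any other edges around "alice" or "paris" are ignored, which is the missing context the abstract says LERP is meant to capture.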


Neuro-symbolic Rule Learning in Real-world Classification Tasks

Baugh, Kexin Gu, Cingillioglu, Nuri, Russo, Alessandra

arXiv.org Artificial Intelligence

Neuro-symbolic rule learning has attracted considerable attention as it offers better interpretability than pure neural models and scales better than symbolic rule learning. A recent approach named pix2rule proposes a neural Disjunctive Normal Form (neural DNF) module to learn symbolic rules with feed-forward layers. Although it has proved effective in synthetic binary classification, pix2rule has not been applied to more challenging tasks such as multi-label and multi-class classification over real-world data. In this paper, we address this limitation by extending the neural DNF module to (i) support rule learning in real-world multi-class and multi-label classification tasks, (ii) enforce the symbolic property of mutual exclusivity (i.e. predicting exactly one class) in multi-class classification, and (iii) explore its scalability over large inputs and outputs. We train a vanilla neural DNF model similar to pix2rule's neural DNF module for multi-label classification, and we propose a novel extended model called neural DNF-EO (Exactly One) which enforces mutual exclusivity in multi-class classification. We evaluate the classification performance, scalability and interpretability of our neural DNF-based models, and compare them against pure neural models and a state-of-the-art symbolic rule learner named FastLAS. We demonstrate that our neural DNF-based models perform similarly to neural networks, but provide better interpretability by enabling the extraction of logical rules. Our models also scale well when the rule search space grows in size, in contrast to FastLAS, which fails to learn in multi-class classification tasks with 200 classes and in all multi-label settings.
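For readers unfamiliar with the symbolic side, the sketch below shows what a set of extracted DNF rules and the "exactly one" property could look like once the rules are in purely logical form. It does not implement the neural DNF module or neural DNF-EO; the rule set, feature names, and helper functions are hypothetical.

```python
def eval_conjunction(literals, x):
    """A conjunction is a list of (feature, required_value) literals."""
    return all(x[feature] == value for feature, value in literals)


def eval_dnf(dnf, x):
    """A DNF formula is a disjunction (list) of conjunctions."""
    return any(eval_conjunction(conj, x) for conj in dnf)


# One DNF formula per class (made-up rules for illustration).
class_rules = {
    "cat": [[("has_whiskers", True), ("barks", False)]],
    "dog": [[("barks", True)]],
}


def predict(x):
    """Return every class whose DNF formula fires on x."""
    return [label for label, dnf in class_rules.items() if eval_dnf(dnf, x)]


def exactly_one(x):
    """Mutual exclusivity check: exactly one class fires on x."""
    return len(predict(x)) == 1


sample = {"has_whiskers": True, "barks": False}
print(predict(sample), exactly_one(sample))
```

In a multi-label setting several classes may fire at once, so only the multi-class case needs the `exactly_one` check to hold for every input.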


Efficient learning of large sets of locally optimal classification rules

Huynh, Van Quoc Phuong, Fürnkranz, Johannes, Beck, Florian

arXiv.org Artificial Intelligence

Conventional rule learning algorithms aim at finding a set of simple rules, where each rule covers as many examples as possible. In this paper, we argue that the rules found in this way may not be the optimal explanations for each of the examples they cover. Instead, we propose an efficient algorithm that aims at finding the best rule covering each training example in a greedy optimization consisting of one specialization and one generalization loop. These locally optimal rules are collected and then filtered for a final rule set, which is much larger than the sets learned by conventional rule learning algorithms. A new example is classified by selecting the best among the rules that cover this example. In our experiments on small to very large datasets, the approach's average classification accuracy is higher than that of state-of-the-art rule learning algorithms. Moreover, the algorithm is highly efficient and can inherently be processed in parallel without affecting the learned rule set, and hence the classification accuracy. We thus believe that it closes an important gap for large-scale classification rule induction.
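The classification step described above (pick the best rule among those covering the example) can be sketched as follows. This covers only prediction, not the specialization and generalization loops used for learning. The `quality` score, the rule contents, and the fallback to a default label are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: attribute tests, a predicted label, a quality score."""
    conditions: Dict[str, Any]
    label: str
    quality: float  # assumed: higher means better


def covers(rule: Rule, example: Dict[str, Any]) -> bool:
    """A rule covers an example if every attribute test matches."""
    return all(example.get(attr) == value for attr, value in rule.conditions.items())


def classify(example: Dict[str, Any], rules: List[Rule],
             default: Optional[str] = None) -> Optional[str]:
    """Return the label of the highest-quality covering rule, else default."""
    covering = [r for r in rules if covers(r, example)]
    if not covering:
        return default
    return max(covering, key=lambda r: r.quality).label


# Made-up rule set for illustration.
rules = [
    Rule({"outlook": "sunny"}, "play", quality=0.7),
    Rule({"outlook": "sunny", "windy": True}, "stay", quality=0.9),
    Rule({"outlook": "rainy"}, "stay", quality=0.8),
]

print(classify({"outlook": "sunny", "windy": True}, rules))
```

Because each example is matched against rules independently, both this lookup and a per-example rule search can be parallelized over examples, which is consistent with the parallelism claim in the abstract.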


Machine Learning with Probabilistic Law Discovery: A Concise Introduction

Demin, Alexander, Ponomaryov, Denis

arXiv.org Artificial Intelligence

Probabilistic Law Discovery (PLD) is a logic-based Machine Learning method, which implements a variant of probabilistic rule learning. In several aspects, PLD is close to Decision Tree/Random Forest methods, but it differs significantly in how relevant rules are defined. The learning procedure of PLD solves the optimization problem related to the search for rules (called probabilistic laws), which have a minimal length and relatively high probability. At inference, ensembles of these rules are used for prediction. Probabilistic laws are human-readable, and PLD-based models are transparent and inherently interpretable. Applications of PLD include classification, clustering, and regression tasks, as well as time series analysis, anomaly detection, and adaptive (robotic) control. In this paper, we outline the main principles of PLD, highlight its benefits and limitations and provide some application guidelines.


association-rule-unsupervised-machine.html

#artificialintelligence

Artificial intelligence and machine learning are touching our everyday lives in more and more ways. There's an endless supply of industries and applications that machine learning can make more efficient and intelligent. This course introduces you to one of the prominent modelling families of Unsupervised Machine Learning called Association Rule Learning. Association rule mining helps find interesting connections and linkages among items in large datasets. Association rule learning is employed in Market Basket analysis, Web usage mining, Continuous production, Customer analytics, Catalogue design, Shop layout, Recommender systems, etc. Association rules are critical in data mining for analyzing and forecasting consumer behaviour.
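As a small worked example of the idea, the sketch below computes the three standard measures used to rank an association rule (support, confidence, and lift) on a toy market-basket dataset. The transactions and function names are invented for illustration; a real analysis would typically use a mining algorithm such as Apriori to enumerate candidate rules first.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent), from transaction counts."""
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    return both / ante if ante else 0.0


def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline support of the consequent.

    Values above 1 suggest a positive association, below 1 a negative one.
    """
    base = support(consequent, transactions)
    return confidence(antecedent, consequent, transactions) / base if base else 0.0


# Toy shopping baskets (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
rule_lift = lift({"bread"}, {"milk"}, baskets)
print(rule_support, rule_confidence, rule_lift)
```

Here the rule "bread implies milk" has support 2/4 and confidence 2/3, but its lift is below 1 because milk already appears in 3 of the 4 baskets, so buying bread does not make milk more likely in this toy data.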


101 Machine Learning Algorithms for Data Science with Cheat Sheets

#artificialintelligence

These 101 algorithms are equipped with cheat sheets, tutorials, and explanations. Think of this as the one-stop shop/dictionary/directory for machine learning algorithms. The algorithms have been sorted into 9 groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensional Reduction, Ensemble, Neural Networks, Regression, Regularization. In this post, you'll find 101 machine learning algorithms with useful Python tutorials, R tutorials, and cheat sheets from Microsoft Azure ML, SAS, and Scikit-Learn to help you know when to use each one (if available). At Data Science Dojo, our mission is to make data science (machine learning in this case) available to everyone.


Mobley

AAAI Conferences

Reinforcement Learning has the limitation that problems become too large very quickly. Dividing the problem into a hierarchy of subtasks allows for a divide-and-conquer strategy, which is why Hierarchical Reinforcement Learning (HRL) algorithms often find solutions faster than more naive approaches. One of the biggest challenges with HRL is the construction of a hierarchy to be used by the algorithm. Hierarchies are often designed by a person using their own knowledge of the problem. We propose a method for automatically discovering task hierarchies based on a data mining technique, Association Rule Learning (ARL).


Gow

AAAI Conferences

Feedback on player experience and behaviour can be invaluable to game designers, but there is a need for specialised knowledge discovery tools to deal with high-volume playtest data. We describe a study with a commercial third-person shooter, in which integrated player activity and experience data was captured and mined for design-relevant knowledge. We demonstrate that association rule learning and rule templates can be used to extract meaningful rules relating player activity and experience during combat. We found that the number, type and quality of rules vary between experiences, and are affected by feature distributions. Further work is required on rule selection and evaluation.


101 Machine Learning Algorithms for Data Science with Cheat Sheets

#artificialintelligence

The algorithms have been sorted into 9 groups: Anomaly Detection, Association Rule Learning, Classification, Clustering, Dimensional Reduction, Ensemble, Neural Networks, Regression, Regularization. In this post, you'll find 101 machine learning algorithms, including useful infographics to help you know when to use each one (if available). Each of the accordion drop-downs is embeddable if you want to take them with you. All you have to do is click the little 'embed' button in the lower left-hand corner and copy/paste the iframe. All we ask is that you link back to this post.