Rule-Based Reasoning


Multi-value Rule Sets for Interpretable Classification with Feature-Efficient Representations

Neural Information Processing Systems

We present the Multi-value Rule Set (MRS) for interpretable classification with feature-efficient representations. Compared to rule sets built from single-value rules, MRS adopts a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than classical single-value rules in capturing and describing patterns in data. Our formulation also pursues a higher efficiency of feature utilization, which reduces possible cost in data collection and storage. We propose a Bayesian framework for formulating an MRS model and develop an efficient inference method for learning a maximum a posteriori (MAP) model, incorporating theoretically grounded bounds to iteratively reduce the search space and improve the search efficiency.
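To make the contrast between multi-value and single-value rules concrete, here is a minimal sketch. The representation (a rule as a mapping from feature name to a set of accepted values), the function names, and the example features are assumptions chosen for illustration; this is not the paper's Bayesian learning procedure, only the rule form the abstract describes.

```python
from typing import Dict, List, Set

# Hypothetical representation: a rule is a conjunction of conditions, and each
# condition maps a feature to the SET of values it accepts (the multi-value part).
Rule = Dict[str, Set[str]]


def rule_covers(rule: Rule, x: Dict[str, str]) -> bool:
    """Return True if every condition in the rule accepts the instance's value."""
    return all(x.get(feature) in allowed for feature, allowed in rule.items())


def predict(rule_set: List[Rule], x: Dict[str, str]) -> int:
    """Predict positive (1) if any rule in the set covers the instance."""
    return int(any(rule_covers(rule, x) for rule in rule_set))


# One multi-value rule: state in {CA, TX, MI} AND marital == married.
multi_value = [{"state": {"CA", "TX", "MI"}, "marital": {"married"}}]

# The same decision boundary written with single-value rules needs three rules.
single_value = [
    {"state": {s}, "marital": {"married"}} for s in ("CA", "TX", "MI")
]
```

Both rule sets classify every instance identically, but the multi-value form states the pattern once, which is the conciseness the abstract refers to.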


Boolean Decision Rules via Column Generation

Neural Information Processing Systems

This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set and the best possible rule set on the training data. To handle large datasets, we propose an approximate CG algorithm using randomization.
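As a rough illustration of the accuracy versus simplicity trade-off, the sketch below evaluates a DNF rule set on binary features and scores it with a penalized cost. The cost function, the `lam` weight, and the complexity measure (one unit per clause plus one per literal) are assumptions made for this example; the paper itself formulates an integer program solved with column generation, which is not reproduced here.

```python
import numpy as np


def dnf_predict(X, clauses):
    """Evaluate an OR-of-ANDs rule set.

    X: (n, d) array of 0/1 features.
    clauses: list of clauses, each a list of column indices that must all be 1.
    """
    X = np.asarray(X, dtype=bool)
    out = np.zeros(X.shape[0], dtype=bool)
    for clause in clauses:
        # A clause fires on a row when all of its features are 1.
        out |= X[:, clause].all(axis=1)
    return out.astype(int)


def penalized_cost(X, y, clauses, lam=0.1):
    """Illustrative objective: training errors plus lam times rule complexity."""
    errors = int(np.sum(dnf_predict(X, clauses) != np.asarray(y)))
    complexity = sum(1 + len(clause) for clause in clauses)
    return errors + lam * complexity


X = [[1, 1, 0],
     [1, 0, 1],
     [0, 0, 1],
     [0, 1, 0]]
y = [1, 1, 0, 0]
clauses = [[0, 1], [2]]  # (x0 AND x1) OR (x2)
```

Comparing `penalized_cost` across candidate clause sets shows how a slightly less accurate but shorter rule set can win once complexity is priced in.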


Lifted Inference Rules With Constraints

Neural Information Processing Systems

Lifted inference rules exploit symmetries for fast reasoning in statistical relational models. Computational complexity of these rules is highly dependent on the choice of the constraint language they operate on and therefore coming up with the right kind of representation is critical to the success of lifted inference. In this paper, we propose a new constraint language, called setineq, which allows subset, equality and inequality constraints, to represent substitutions over the variables in the theory. Our constraint formulation is strictly more expressive than existing representations, yet easy to operate on. We reformulate the three main lifting rules: decomposer, generalized binomial and the recently proposed single occurrence for MAP inference, to work with our constraint representation. Experiments on benchmark MLNs for exact and sampling based inference demonstrate the effectiveness of our approach over several other existing techniques.


ARMS: Automated rules management system for fraud detection

arXiv.org Artificial Intelligence

Fraud detection is essential in financial services, with the potential of greatly reducing criminal activities and saving considerable resources for businesses and customers. We address online fraud detection, which consists of classifying incoming transactions as either legitimate or fraudulent in real time. Modern fraud detection systems consist of a machine learning model and rules defined by human experts. Often, the rules' performance degrades over time due to concept drift, especially of adversarial nature. Furthermore, they can be costly to maintain, either because they are computationally expensive or because they send transactions for manual review. We propose ARMS, an automated rules management system that evaluates the contribution of individual rules and optimizes the set of active rules using heuristic search and a user-defined loss function. It complies with critical domain-specific requirements, such as handling different actions (e.g., accept, alert, and decline), priorities, blacklists, and large datasets (i.e., hundreds of rules and millions of transactions). We use ARMS to optimize the rule-based systems of two real-world clients. Results show that it can maintain the original systems' performance (e.g., recall, or false-positive rate) using only a fraction of the original rules (~ 50% in one case, and ~ 20% in the other).
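The idea of pruning a prioritized rule set against a user-defined loss can be sketched as follows. Everything here is an assumption for illustration: the `Rule` structure, the "highest priority wins" resolution, the cost values, and the simple greedy removal loop. The abstract does not specify which heuristic search ARMS uses, so this should be read as one plausible toy variant rather than the system's algorithm.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class Rule(NamedTuple):
    name: str
    priority: int                     # assumed: the highest-priority fired rule decides
    action: str                       # "accept", "alert", or "decline"
    trigger: Callable[[Dict], bool]


def decide(rules: List[Rule], tx: Dict, default: str = "accept") -> str:
    """Return the action of the highest-priority rule that fires on tx."""
    fired = [r for r in rules if r.trigger(tx)]
    return max(fired, key=lambda r: r.priority).action if fired else default


def loss(rules: List[Rule], data: List[Tuple[Dict, bool]],
         fn_cost: float = 10.0, fp_cost: float = 1.0) -> float:
    """Hypothetical user-defined loss: missed fraud costs more than a false alarm."""
    total = 0.0
    for tx, is_fraud in data:
        flagged = decide(rules, tx) != "accept"
        if is_fraud and not flagged:
            total += fn_cost
        elif flagged and not is_fraud:
            total += fp_cost
    return total


def greedy_prune(rules: List[Rule], data: List[Tuple[Dict, bool]]) -> List[Rule]:
    """Repeatedly drop the first rule whose removal does not increase the loss."""
    active = list(rules)
    improved = True
    while improved:
        improved = False
        base = loss(active, data)
        for rule in active:
            candidate = [r for r in active if r is not rule]
            if loss(candidate, data) <= base:
                active, improved = candidate, True
                break
    return active


rules = [
    Rule("high_amount", 2, "decline", lambda tx: tx["amount"] > 1000),
    Rule("night", 1, "alert", lambda tx: tx["hour"] < 6),
    Rule("very_high_amount", 2, "decline", lambda tx: tx["amount"] > 5000),
]
data = [
    ({"amount": 6000, "hour": 12}, True),
    ({"amount": 1500, "hour": 12}, True),
    ({"amount": 50, "hour": 2}, False),
    ({"amount": 20, "hour": 14}, False),
]
kept = greedy_prune(rules, data)
```

On this toy data the `night` rule only produces a false alarm and `very_high_amount` is redundant with `high_amount`, so both are deactivated without raising the loss.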


Why Neuro-Symbolic Artificial Intelligence Is The A.I. Of The Future | Digital Trends

#artificialintelligence

On the tray is an assortment of shapes: Some cubes, others spheres. The shapes are made from a variety of different materials and represent an assortment of sizes. In total there are, perhaps, eight objects. My question: "Looking at the objects, are there an equal number of large things and metal spheres?" The fact that it sounds as if it is is proof positive of just how simple it actually is.


Transformers as Soft Reasoners over Language

arXiv.org Artificial Intelligence

AI has long pursued the goal of having systems reason over *explicitly provided* knowledge, but building suitable representations has proved challenging. Here we explore whether transformers can similarly learn to reason (or emulate reasoning), but using rules expressed in language, thus bypassing a formal representation. We provide the first demonstration that this is possible, and characterize the extent of this capability. To do this, we use a collection of synthetic datasets that test increasing levels of reasoning complexity (number of rules, presence of negation, and depth of chaining). We find transformers appear to learn rule-based reasoning with high (99%) accuracy on these datasets, and in a way that generalizes to test data requiring substantially deeper chaining than in the training data (95%+ scores). We also demonstrate that the models transfer well to two hand-authored rulebases, and to rulebases paraphrased into more natural language. These findings are significant as they suggest a new role for transformers, namely as a limited "soft theorem prover" operating over explicit theories in language. This in turn suggests new possibilities for explainability, correctability, and counterfactual reasoning in question-answering. All datasets and a live demo are available at http://rule-reasoning.apps.allenai.org/
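For readers unfamiliar with "depth of chaining", the following sketch shows a purely symbolic forward chainer that records how many rule applications each conclusion needs. It illustrates the kind of reasoning the synthetic datasets probe, not the transformer model or the authors' data generator; the fact and rule encoding is an assumption, and negation is left out for brevity.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical encoding: a rule is (premises, conclusion) over plain string facts.
Rule = Tuple[Tuple[str, ...], str]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Dict[str, int]:
    """Derive all reachable facts and the minimal chaining depth of each.

    Given facts have depth 0; a conclusion first derivable in round k has depth k.
    """
    depth: Dict[str, int] = {fact: 0 for fact in facts}
    level = 0
    while True:
        level += 1
        # Only facts known before this round may be used as premises.
        new: Set[str] = {
            conclusion
            for premises, conclusion in rules
            if conclusion not in depth and all(p in depth for p in premises)
        }
        if not new:
            return depth
        for conclusion in new:
            depth[conclusion] = level


facts = {"kind", "big"}
rules = [
    (("kind",), "nice"),
    (("nice", "big"), "friendly"),
    (("friendly",), "popular"),
    (("green",), "rare"),  # never fires: "green" is not derivable
]
depths = forward_chain(facts, rules)
```

A question about `popular` here requires three chained rule applications, which corresponds to the deeper end of the complexity axis the abstract describes.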


Dump the Spreadsheet. Leveraging AI for Automated Transaction Matching

#artificialintelligence

As digital transformation sweeps across the enterprise landscape, F&A processes continue to evolve. The decades-old manual process of entering data into a spreadsheet for reconciliation purposes has given way to digital reconciliation, with the advent of automation technology to make it faster and more efficient. However, even automated processes have evolved in the last few years with the advances made in machine learning and AI. What do these advances mean for F&A teams today? To illustrate the profound implications of AI and machine learning for F&A, consider the evolution of the transaction matching process in reconciliation. From the earliest days, F&A departments have largely relied on manual processes to reconcile accounts.


What do AML-BSA-CTF Regulators think of Machine Learning?

#artificialintelligence

Prior to 2018, regulators resisted recommending the use of Machine Learning (ML) based Artificial Intelligence (AI) for AML compliance. There was a mindset shift in mid 2018 indicating that proceeding with caution in implementing AI approaches for AML is appropriate. Regulators realize the adoption of recent innovation, such as the use of AI-ML and robotic process automation (RPA) techniques, enables AML compliance improvements not otherwise attainable. A risk-based approach to compliance, underpinned by AI/Machine Learning, creates opportunities for governance and process refinement as well as identifying potential untapped revenues. Reliance on box-ticking approaches familiar to users of legacy rules-based compliance systems is no longer sufficient.


Data Vision: Learning to See Through Algorithmic Abstraction

arXiv.org Machine Learning

Learning to see through data is central to contemporary forms of algorithmic knowledge production. While often represented as a mechanical application of rules, making algorithms work with data requires a great deal of situated work. This paper examines how the often-divergent demands of mechanization and discretion manifest in data analytic learning environments. Drawing on research in CSCW and the social sciences, and ethnographic fieldwork in two data learning environments, we show how an algorithm's application is seen sometimes as a mechanical sequence of rules and at other times as an array of situated decisions. Casting data analytics as a rule-based (rather than rule-bound) practice, we show that effective data vision requires would-be analysts to straddle the competing demands of formal abstraction and empirical contingency. We conclude by discussing how the notion of data vision can help better leverage the role of human work in data analytic learning, research, and practice.


Knowledge Graph Embedding for Link Prediction: A Comparative Analysis

arXiv.org Machine Learning

Knowledge Graphs (KGs) have found many applications in industry and academic settings, which in turn, have motivated considerable research efforts towards large-scale information extraction from a variety of sources. Despite such efforts, it is well known that even state-of-the-art KGs suffer from incompleteness. Link Prediction (LP), the task of predicting missing facts among entities already in a KG, is a promising and widely studied task aimed at addressing KG incompleteness. Among the recent LP techniques, those based on KG embeddings have achieved very promising performances in some benchmarks. Despite the fast-growing literature on the subject, insufficient attention has been paid to the effect of the various design choices in those methods. Moreover, the standard practice in this area is to report accuracy by aggregating over a large number of test facts in which some entities are over-represented; this allows LP methods to exhibit good performance by just attending to structural properties that include such entities, while ignoring the remaining majority of the KG. This analysis provides a comprehensive comparison of embedding-based LP methods, extending the dimensions of analysis beyond what is commonly available in the literature. We experimentally compare effectiveness and efficiency of 16 state-of-the-art methods, consider a rule-based baseline, and report detailed analysis over the most popular benchmarks in the literature.
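The aggregate reporting the abstract criticizes is typically built from per-fact ranks. The sketch below scores candidate tail entities with a TransE-style translation distance and aggregates ranks into mean reciprocal rank and Hits@k. TransE is used only as a familiar example of an embedding-based scorer; the tiny hand-written embeddings and all function names are assumptions for illustration, not the evaluated models or the paper's evaluation code.

```python
import numpy as np


def transe_scores(h, r, entity_emb):
    """Score every entity as a candidate tail for (h, r, ?); higher is better.

    TransE-style: a plausible tail t satisfies h + r ~ t, so the score is the
    negative Euclidean distance between h + r and each entity embedding.
    """
    return -np.linalg.norm(h + r - entity_emb, axis=1)


def rank_of(scores, true_idx):
    """1-based rank of the true entity; tied entities are not counted against it."""
    return int(np.sum(scores > scores[true_idx])) + 1


def mrr_and_hits(ranks, k=10):
    """Aggregate per-fact ranks into mean reciprocal rank and Hits@k."""
    ranks = np.asarray(ranks, dtype=float)
    return float(np.mean(1.0 / ranks)), float(np.mean(ranks <= k))


# Toy embeddings (placeholders, not trained values).
entity_emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
relation = np.array([1.0, 0.0])

scores = transe_scores(entity_emb[0], relation, entity_emb)
ranks = [rank_of(scores, 1), rank_of(scores, 2)]
mrr, hits_at_1 = mrr_and_hits(ranks, k=1)
```

Because `mrr_and_hits` averages over test facts, a handful of frequently occurring entities can dominate the result, which is the over-representation effect the abstract points out.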