 Rule-Based Reasoning


The Growing Influence of AI in Smart Manufacturing

#artificialintelligence

The influence of Artificial Intelligence (AI) in smart manufacturing is growing rapidly. Artificial Intelligence, according to the ARC Advisory Group, applies to any device that perceives its environment and takes actions that maximize its chance of success toward some goal. This includes a vast range of technologies, such as traditional logic and rules-based systems, that enable computers to solve problems in ways that at least superficially resemble thinking. According to a recent Accenture research report on AI, corporate profits will increase by an average of 38% by 2035, in large part thanks to a more advanced deployment of AI in financial, IT and manufacturing applications. But at this early stage of AI implementation, it is still not clear how it will be deployed across many possible use cases.


Local Rule-Based Explanations of Black Box Decision Systems

arXiv.org Artificial Intelligence

Recent years have witnessed the rise of accurate but obscure decision systems which hide the logic of their internal decision processes from users. The lack of explanations for the decisions of black box systems is a key ethical issue, and a limitation to the adoption of machine learning components in socially sensitive and safety-critical contexts. In this paper we focus on the problem of black box outcome explanation, i.e., explaining the reasons for the decision taken on a specific instance. We propose LORE, an agnostic method able to provide interpretable and faithful explanations. LORE first learns a local interpretable predictor on a synthetic neighborhood generated by a genetic algorithm. Then it derives from the logic of the local interpretable predictor a meaningful explanation consisting of: a decision rule, which explains the reasons for the decision; and a set of counterfactual rules, suggesting the changes in the instance's features that lead to a different outcome. Wide experiments show that LORE outperforms existing methods and baselines both in the quality of explanations and in the accuracy in mimicking the black box.


When Data Science Alone Won't Cut it - Dataconomy

#artificialintelligence

I recently read an article (paywall) in the WSJ about Paul Allen's Vulcan initiative to curb illegal fishing. It's insightful and sheds light on Big Data techniques to address societal problems. After thinking about the story, it struck me that it could be used as a pedagogical tool to synthesize data science with domain knowledge. To me, this stands as the biggest limitation of what I refer to as 'data science thinking' - letting technical skills drive the analysis, only later incorporating domain understanding. This post reads somewhat like a case note from business school, and the idea is to get data scientists, product managers and engineers talking earlier on in the process.


Boolean Decision Rules via Column Generation

arXiv.org Artificial Intelligence

This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set and the best possible rule set on the training data. To handle large datasets, we propose an approximate CG algorithm using randomization. Compared to three recently proposed alternatives, the CG algorithm dominates the accuracy-simplicity trade-off in 7 out of 15 datasets. When maximized for accuracy, CG is competitive with rule learners designed for this purpose, sometimes finding significantly simpler solutions that are no less accurate.
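To make the DNF form concrete, here is a minimal sketch of how a rule set of this kind classifies one record. It is not the paper's column generation algorithm: the feature names, the example rules, and the `complexity` measure (clauses plus literals) are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# A literal is (feature_name, required_value).
# A clause is an AND of literals; a DNF rule set is an OR of clauses.
Literal = Tuple[str, bool]
Clause = List[Literal]


def clause_holds(clause: Clause, record: Dict[str, bool]) -> bool:
    """True when every literal in the conjunction matches the record."""
    return all(record[feature] == value for feature, value in clause)


def predict_dnf(rule_set: List[Clause], record: Dict[str, bool]) -> bool:
    """OR-of-ANDs: predict positive when at least one clause holds."""
    return any(clause_holds(clause, record) for clause in rule_set)


def complexity(rule_set: List[Clause]) -> int:
    """Assumed simplicity measure: number of clauses plus total literals."""
    return len(rule_set) + sum(len(clause) for clause in rule_set)


# Hypothetical rule set: (high_income AND NOT has_debt) OR owns_home
rules: List[Clause] = [
    [("high_income", True), ("has_debt", False)],
    [("owns_home", True)],
]

applicant = {"high_income": True, "has_debt": True, "owns_home": False}
print(predict_dnf(rules, applicant))  # False: neither clause holds
print(complexity(rules))              # 5: two clauses plus three literals
```

An accuracy-simplicity trade-off then amounts to preferring rule sets with a lower `complexity` among those with similar accuracy; how that search is carried out is what the paper addresses.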


Machine Learning Use Cases in Financial Crimes

#artificialintelligence

Unlike rules-based systems, which are fairly easy for fraudsters to test and circumvent, machine learning adapts to changing behaviors in a population through automated model building. With every iteration, the algorithms get smarter and more accurately find activities that represent risk to the firm. It's easy to see the value of machine learning for keeping pace with evolving fraud tactics. Learn 10 proven ways machine learning can boost the efficiency and effectiveness of fraud and financial crimes teams - from data collection to detection to investigation and reporting.


r/MachineLearning - [D] How do you study from textbooks?

#artificialintelligence

I am by no means a particularly good example of study habits, but generally I tend to read what I need and go from there... Basically this in practice often means starting somewhere relevant to whatever work/assignment/project I'm trying to do, and then going backwards building a recursive stack of readings that seem important to understanding the previous thing until I reach a point where I am familiar with the material already. Then I work through the stack until I'm back to wherever I started. Essentially this is the backward chaining algorithm. Also, if I need to learn a lot from a book for some reason (e.g., a course) or have no particular goal in mind but find myself with a text that piques my interest, then I tend to skim from cover to cover everything that actually attracts my attention, occasionally flipping back to something that I realize is important for understanding later stuff. If it seems especially critical and I can't understand it, then I'll look through exercises and maybe do them if it seems worthwhile.
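The "recursive stack of readings" can be sketched as a goal-driven search, roughly in the spirit of backward chaining. This is only an illustration: the topic names and the `prerequisites` map are made up, and the function name `reading_order` is hypothetical.

```python
from typing import Dict, List, Set


def reading_order(goal: str,
                  prerequisites: Dict[str, List[str]],
                  known: Set[str]) -> List[str]:
    """Start from the goal, recurse into what it depends on, and stop at
    material that is already known. Returns topics in the order to read."""
    order: List[str] = []
    seen: Set[str] = set(known)

    def visit(topic: str) -> None:
        if topic in seen:
            return  # already familiar, or already scheduled
        seen.add(topic)  # mark first so cyclic references cannot loop forever
        for dependency in prerequisites.get(topic, []):
            visit(dependency)
        order.append(topic)  # read a topic only after its prerequisites

    visit(goal)
    return order


# Hypothetical dependency map between topics
prerequisites = {
    "transformers": ["attention", "backprop"],
    "attention": ["linear algebra"],
    "backprop": ["calculus", "linear algebra"],
}

print(reading_order("transformers", prerequisites, known={"calculus"}))
# ['linear algebra', 'attention', 'backprop', 'transformers']
```

The goal comes last in the list, which matches working back through the stack until you are "back to wherever I started".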


Machine Learning without the Hype

#artificialintelligence

What do artificial intelligence, machine learning, and deep learning mean in general? When is a rule-based approach the right solution and when do you need machine learning? What does machine learning mean for time-series data? What is the difference between supervised and unsupervised learning in this area? Thanks to Devoxx for giving us permission to post this talk.


Intrinsic dimension and its application to association rules

arXiv.org Artificial Intelligence

The curse of dimensionality in the realm of association rules is twofold. Firstly, we have the well-known exponential increase in computational complexity with increasing item set size. Secondly, there is a related curse concerned with the distribution of (sparse) data itself in high dimension. The former problem is often coped with by projection, i.e., feature selection, whereas the best known strategy for the latter is avoidance. This work summarizes the first attempt to provide a computationally feasible method for measuring the extent of the dimension curse present in a data set with respect to a particular class of machine learning procedures. This recent development enables various other methods from geometric analysis to be investigated and applied in machine learning procedures in the presence of high dimension.


Machine Learning: Practical Applications for Cybersecurity

#artificialintelligence

If you've walked around any security conferences recently, you'll have heard dozens of vendors talking about artificial intelligence (AI) and machine learning. But what do they actually do? Are they really going to usher in a grand age of cybersecurity? And are security analysts about to be collectively out of a job? Recently, Recorded Future co-hosted a webinar with SANS Institute with the goal of helping security conscious organizations understand how machine learning can help them process an almost infinite number of inputs into a small number of actionable outputs.


Scaling associative classification for very large datasets

arXiv.org Artificial Intelligence

Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of large-domain categorical features are a difficult challenge for any classifier. Most off-the-shelf solutions cannot cope with this problem. In this work we introduce DAC, a Distributed Associative Classifier. DAC exploits ensemble learning to distribute the training of an associative classifier among parallel workers and improve the final quality of the model. Furthermore, it adopts several novel techniques to reach high scalability without sacrificing quality, among which a preventive pruning of classification rules in the extraction phase based on Gini impurity. We ran experiments on Apache Spark, on a real large-scale dataset with more than 4 billion records and 800 million distinct categories. The results showed that DAC improves on a state-of-the-art solution in both prediction quality and execution time. Since the generated model is human-readable, it can not only classify new records, but also allow understanding both the logic behind the prediction and the properties of the model, becoming a useful aid for decision makers.
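As a rough illustration of pruning on Gini impurity, the sketch below scores a candidate rule by the class labels of the records it covers. It is not DAC's actual procedure: the `keep_rule` helper and its `max_impurity` threshold are assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Gini impurity 1 - sum(p_i^2) over the class proportions p_i."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def keep_rule(covered_labels: Sequence[Hashable],
              max_impurity: float = 0.3) -> bool:
    """Hypothetical pruning test: keep a rule only when the records it
    covers are mostly of one class (low impurity)."""
    return gini_impurity(covered_labels) <= max_impurity


print(gini_impurity(["spam", "spam", "ham", "ham"]))  # 0.5, an even split
print(gini_impurity(["spam"] * 4))                    # 0.0, a pure rule
print(keep_rule(["spam", "spam", "ham", "ham"]))      # False, too mixed
```

A rule covering a single class scores 0 and is the most informative; a rule whose covered records split evenly between two classes scores 0.5 and says little about the class.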