Rule-Based Reasoning
On Evaluating the Quality of Rule-Based Classification Systems
Two indicators are classically used to evaluate the quality of rule-based classification systems: predictive accuracy, i.e. the system's ability to successfully reproduce learning data, and coverage, i.e. the proportion of possible cases for which the logical rules constituting the system apply. In this work, we claim that these two indicators may be insufficient, and additional measures of quality may need to be developed. We theoretically show that classification systems presenting "good" predictive accuracy and coverage can, nonetheless, be trivially improved and illustrate this proposition with examples. To conceptualize our main claim, we characterize a property of reducibility. A classification system is said to be reducible if and only if its constituent rules can be replaced by a subset of their elementary conditions while preserving the quality of the system. We derive a time-efficient constructive algorithm to test this property and to improve a system's predictive accuracy and coverage in case of a positive response. Furthermore, we provide a set of sufficient conditions that can be used to detect non-reducibility and thus validate rule-based classification systems. We use the proposed approach to evaluate a previously published work applied to a public dataset pertaining to business bankruptcy prediction, using three popular machine learning approaches (namely Genetic Algorithms, Inductive Learning and Neural Networks).
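The two indicators named in this abstract, and the idea that dropping an elementary condition can preserve or even improve them, can be illustrated with a minimal Python sketch. This is not the constructive algorithm derived in the paper. The feature names (`debt_ratio`, `liquidity`), the thresholds, and the four toy cases are hypothetical. Coverage is measured over the sample rather than over all possible cases, and accuracy is measured only on covered cases; both choices are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

Case = Dict[str, float]
Condition = Callable[[Case], bool]
Rule = Tuple[List[Condition], str]  # (elementary conditions, predicted label)


def predict(rules: List[Rule], case: Case) -> Optional[str]:
    """Return the label of the first rule whose conditions all hold, else None."""
    for conditions, label in rules:
        if all(cond(case) for cond in conditions):
            return label
    return None


def coverage_and_accuracy(
    rules: List[Rule], data: List[Tuple[Case, str]]
) -> Tuple[float, float]:
    """Coverage: share of cases matched by some rule.
    Accuracy: share of covered cases whose predicted label is correct."""
    predictions = [(predict(rules, case), truth) for case, truth in data]
    covered = [(p, t) for p, t in predictions if p is not None]
    coverage = len(covered) / len(data)
    accuracy = sum(p == t for p, t in covered) / len(covered) if covered else 0.0
    return coverage, accuracy


# Hypothetical toy data: (case, true label).
data = [
    ({"debt_ratio": 0.9, "liquidity": 0.2}, "bankrupt"),
    ({"debt_ratio": 0.8, "liquidity": 0.9}, "bankrupt"),
    ({"debt_ratio": 0.3, "liquidity": 0.8}, "healthy"),
    ({"debt_ratio": 0.2, "liquidity": 0.1}, "healthy"),
]

# Original rule: two elementary conditions.
rules = [
    ([lambda c: c["debt_ratio"] > 0.5, lambda c: c["liquidity"] < 0.5], "bankrupt"),
]

# Reduced rule: only a subset of the elementary conditions is kept.
reduced = [
    ([lambda c: c["debt_ratio"] > 0.5], "bankrupt"),
]

print(coverage_and_accuracy(rules, data))
print(coverage_and_accuracy(reduced, data))
```

On this toy sample the reduced rule covers more cases with no loss of accuracy, which is the kind of trivial improvement the abstract describes.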
A rigorous method to compare interpretability of rule-based algorithms
Interpretability is becoming increasingly important in predictive model analysis. Unfortunately, as mentioned by many authors, there is still no consensus on its definition. The aim of this article is to propose a rigorous mathematical definition of the concept of interpretability, allowing fair comparisons between any rule-based algorithms. This definition is built from three notions, each of which is quantitatively measured by a simple formula: predictivity, stability and simplicity. While predictivity has been widely studied to measure the accuracy of predictive algorithms, stability is based on the Dice-Sorensen index to compare two sets of rules generated by an algorithm using two independent samples. Simplicity is based on the sum of the lengths of the rules derived from the generated model. The final objective measure of the interpretability of any rule-based algorithm ends up as a weighted sum of the three aforementioned concepts. This paper concludes with a comparison of the interpretability of four rule-based algorithms.
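The building blocks mentioned in this abstract can be sketched in a few lines of Python. The Dice-Sorensen index is the standard formula 2|A ∩ B| / (|A| + |B|). Everything else is an assumption made for illustration: rules are represented as frozensets of condition strings, the rule conditions are hypothetical, and the weighted sum takes three scores already normalised to [0, 1] with arbitrary weights, since the abstract does not give the paper's normalisation or weights.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Rule = FrozenSet[str]  # a rule is modelled as a set of elementary conditions


def dice_sorensen(rules_a: Set[Rule], rules_b: Set[Rule]) -> float:
    """Dice-Sorensen index between two rule sets: 2|A & B| / (|A| + |B|)."""
    if not rules_a and not rules_b:
        return 1.0  # assumed convention: two empty rule sets count as identical
    return 2 * len(rules_a & rules_b) / (len(rules_a) + len(rules_b))


def total_rule_length(rules: Iterable[Rule]) -> int:
    """Sum of rule lengths, where a rule's length is its number of conditions."""
    return sum(len(rule) for rule in rules)


def weighted_score(
    predictivity: float,
    stability: float,
    simplicity: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of three scores assumed to lie in [0, 1]."""
    w_p, w_st, w_si = weights
    return w_p * predictivity + w_st * stability + w_si * simplicity


# Hypothetical rule sets learned from two independent samples.
rules_run1 = {frozenset({"age > 30", "income <= 50k"}), frozenset({"age <= 30"})}
rules_run2 = {frozenset({"age <= 30"}), frozenset({"income > 50k"})}

print(dice_sorensen(rules_run1, rules_run2))  # one shared rule out of 2 + 2
print(total_rule_length(rules_run1))
print(weighted_score(0.8, 0.5, 0.6, weights=(0.5, 0.25, 0.25)))
```

Representing rules as frozensets makes two rules equal when they contain the same conditions regardless of order, which is what the set intersection in the index relies on.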
Discovering associations in COVID-19 related research papers
Fister, Iztok Jr., Fister, Karin, Fister, Iztok
The COVID-19 pandemic has already proven itself to be a global challenge. It shows how vulnerable humanity can be. It has also mobilized researchers from different sciences and different countries in the search for a way to fight this potentially fatal disease. In line with this, our study analyses the abstracts of papers related to COVID-19 and coronavirus-related research using association rule text mining in order to find the most interesting words, on the one hand, and the relationships between them, on the other. Then a method called information cartography was applied to extract structured knowledge from a huge number of association rules. On the basis of these methods, the purpose of our study was to show how researchers have responded in similar epidemic/pandemic situations throughout history.
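As a rough illustration of association rule mining over text, the sketch below computes the two standard measures, support and confidence, for a word-level rule. The abstract does not say which interestingness measures the study uses, so these two are an assumption, and the four short "abstracts" are invented toy strings rather than data from the study.

```python
from typing import Iterable, List, Set


def support(docs: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of documents that contain every word in the itemset."""
    items = set(itemset)
    return sum(items <= doc for doc in docs) / len(docs)


def confidence(
    docs: List[Set[str]], antecedent: Iterable[str], consequent: Iterable[str]
) -> float:
    """Estimated P(consequent | antecedent): support(A and C) / support(A)."""
    base = support(docs, antecedent)
    joint = support(docs, set(antecedent) | set(consequent))
    return joint / base if base else 0.0


# Invented toy abstracts, tokenised naively on whitespace.
abstracts = [
    "coronavirus vaccine trial results",
    "coronavirus transmission in hospitals",
    "vaccine efficacy against coronavirus variants",
    "influenza pandemic history",
]
docs = [set(text.split()) for text in abstracts]

print(support(docs, {"coronavirus"}))
print(confidence(docs, {"coronavirus"}, {"vaccine"}))
```

A real pipeline would add stop-word removal and a frequent-itemset search to enumerate candidate rules; this sketch only scores a rule that is already given.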
Fraud Detection with Machine Learning Versus the Most Common Threats
Machine Learning and Artificial Intelligence are offering an entirely new level of possibilities to businesses worldwide; one of those possibilities is Fraud Detection. Financial institutions and banks will never be the same with the opportunities technology offers to deal with criminal activities and fight internet fraud. Learn how it works in this post! The things people used to buy at shops years ago are now purchased online, no matter what they are: furniture, food, or clothes. As a result, the global E-Commerce market is growing rapidly and is estimated to reach $4.9 trillion by 2021. This undoubtedly prompts members of the criminal world to find paths to victims' wallets through the Web. Federal, local, and state law enforcement agencies along with private organizations reported 3 million cases of identity theft in 2019. Money was lost in about 25% of these cases.
Top 12 AI Use Cases: Artificial Intelligence in FinTech
From automating the most menial and repetitive tasks to free up time for higher-level objectives, to assisting with customer service management and reducing the risk of fraud, AI is employed from back-office tasks to the frontend with nimbleness and agility. According to the Alan Turing Institute, with $70 billion spent by banks on compliance each year just in the U.S., the amount of money spent on fraud is staggering. And when the number of reported cases of payments-related fraud increased by 66% between 2015 and 2016 in the United Kingdom, it's clear that this problem is much more than a momentary phenomenon. AI is a groundbreaking technology in the battle against financial fraud. ML algorithms are able to analyze millions of data points in a matter of seconds to identify anomalous transactional patterns.
Generation of Consistent Sets of Multi-Label Classification Rules with a Multi-Objective Evolutionary Algorithm
Miranda, Thiago Zafalon, Sardinha, Diorge Brognara, Basgalupp, Márcio Porto, Jin, Yaochu, Cerri, Ricardo
Multi-label classification consists in classifying an instance into two or more classes simultaneously. It is a very challenging task present in many real-world applications, such as the classification of biological data, images, video, audio, and text. Recently, the interest in interpretable classification models has grown, partially as a consequence of regulations such as the General Data Protection Regulation. In this context, we propose a multi-objective evolutionary algorithm that generates multiple rule-based multi-label classification models, allowing users to choose among models that offer different compromises between predictive power and interpretability. An important contribution of this work is that, unlike most algorithms, which usually generate models based on lists (ordered collections) of rules, our algorithm generates models based on sets (unordered collections) of rules, increasing interpretability. Also, by employing a conflict avoidance algorithm during rule creation, every rule within a given model is guaranteed to be consistent with every other rule in the same model. Thus, no conflict resolution strategy is required, which yields simpler models. We conducted experiments on synthetic and real-world datasets and compared our results with state-of-the-art algorithms in terms of predictive performance (F-Score) and interpretability (model size), and demonstrated that our best models had comparable F-Score and smaller model sizes.
Artificial intelligence for fraud detection is bound to save billions
Fraud mitigation is one of the most sought-after artificial intelligence (AI) services because it can provide an immediate return on investment. Already, many companies are seeing substantial returns thanks to AI and machine learning (ML) systems that detect and prevent fraud in real time. According to a new report, Highmark Inc.'s Financial Investigations and Provider Review (FIPR) department generated $260 million in savings that would have otherwise been lost to fraud, waste, and abuse in 2019. In the last five years, the company saved $850 million. "We know the overwhelming majority of providers do the right thing. But we also know year after year millions of health care dollars are lost to fraud, waste and abuse," said Melissa Anderson, executive vice president and chief audit and compliance officer, Highmark Health.
What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
Belth, Caleb, Zheng, Xinyi, Vreeken, Jilles, Koutra, Danai
Knowledge graphs (KGs) store highly heterogeneous information about the world in the structure of a graph, and are useful for tasks such as question answering and reasoning. However, they often contain errors and are missing information. Vibrant research in KG refinement has worked to resolve these issues, tailoring techniques to either detect specific types of errors or complete a KG. In this work, we introduce a unified solution to KG characterization by formulating the problem as unsupervised KG summarization with a set of inductive, soft rules, which describe what is normal in a KG, and thus can be used to identify what is abnormal, whether it be strange or missing. Unlike first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns that describe the expected neighborhood around a (seen or unseen) node, based on its type, and information in the KG. Stepping away from the traditional support/confidence-based rule mining techniques, we propose KGist, Knowledge Graph Inductive SummarizaTion, which learns a summary of inductive rules that best compress the KG according to the Minimum Description Length principle, a formulation that we are the first to use in the context of KG rule mining. We apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as compression, various types of error detection, and identification of incomplete information. We show that KGist outperforms task-specific, supervised and unsupervised baselines in error detection and incompleteness identification (identifying the location of up to 93% of missing entities, over 10% more than baselines), while also being efficient for large knowledge graphs.
MyTradingPet - Trading Robot ZEO
Calculation of Return: Return is calculated based on dynamic position size, where the position size of a new trade is determined in the following way: the stop loss value is always 1% of the account balance. For example, if your account balance is at $1000, you will risk losing exactly $10 on the next trade. Strategies: There are two categories of strategies. One uses a rule-based system, which covers both trend-based and range-based strategies. The other uses AI machine learning, which has three models: AI-I, AI-II, AI-III.
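The fixed-fraction risk rule described above can be written out as a short Python sketch. Only the 1% figure and the $1000 example come from the text. The function names are hypothetical, and the conversion from a risk amount to a number of units via the stop distance is a common convention assumed here, not something the page states.

```python
def risk_amount(balance: float, risk_fraction: float = 0.01) -> float:
    """Money at risk on the next trade: a fixed fraction of the current balance."""
    return balance * risk_fraction


def position_size(
    balance: float, stop_distance: float, risk_fraction: float = 0.01
) -> float:
    """Units to trade so that hitting the stop loses exactly the risk amount.

    stop_distance is the loss per unit if the stop is hit (assumed convention).
    """
    if stop_distance <= 0:
        raise ValueError("stop_distance must be positive")
    return risk_amount(balance, risk_fraction) / stop_distance


balance = 1000.0
first_risk = risk_amount(balance)            # the $10 from the example in the text
balance_after_loss = balance - first_risk    # balance if the first trade is stopped out
second_risk = risk_amount(balance_after_loss)
units = position_size(balance, stop_distance=0.5)

print(f"risk on first trade:  ${first_risk:.2f}")
print(f"risk after one loss:  ${second_risk:.2f}")
print(f"units at a $0.50 stop: {units:.1f}")
```

Because the risk is recomputed from the current balance before every trade, the position shrinks after losses and grows after wins, which is what makes the sizing dynamic.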
The Emerging Landscape of AI Decision-Making
Artificial Intelligence (AI) has gained favor as the current buzzword for things related to technology in the popular press. Every day, we see news articles proclaiming that AI has solved one problem or another in diverse fields. However, a natural question remains unanswered: What exactly is AI? Intelligence is normally associated with a capacity to learn from experience and to make decisions based on the learned knowledge. It may also involve understanding complex ideas. But the most important aspect of intelligence is the capacity to apply the experience gained in one context to solve problems in a completely different context. A child quickly learns not to go near a situation that may be detrimental, for example, an open fire or a body of water.