Rule-Based Reasoning
Jazz beat short-handed Clips 114-96 for 9th straight win
Donovan Mitchell scored 24 points, Rudy Gobert had 23 points and 20 rebounds, and the Utah Jazz rolled past the short-handed Los Angeles Clippers 114-96 on Wednesday night for their ninth consecutive victory. Jordan Clarkson scored 18 points for the NBA-leading Jazz, who improved to 24-5 with their 20th win in 21 games. After three tight quarters, Utah broke it open in the fourth to win this matchup of Western Conference powerhouses -- although it wasn't a proper showdown with the Clippers missing injured superstars Kawhi Leonard and Paul George.
FIXME: Enhance Software Reliability with Hybrid Approaches in Cloud
Hwang, Jinho, Shwartz, Larisa, Wang, Qing, Batta, Raghav, Kumar, Harshit, Nidd, Michael
With the promise of reliability in the cloud, more enterprises are migrating to it. Continuous integration/deployment (CICD) in the cloud connects developers, who need to deliver value faster and more transparently, with site reliability engineers (SREs), who need to manage applications reliably. SREs feed development issues back to developers, and developers commit fixes and trigger CICD to redeploy. The release cycle is more continuous than ever, so the path from code to production is faster and more automated. To provide this higher level of agility, cloud platforms have become more complex, adding flexibility through deeper layers of virtualization. However, reliability does not come for free with all this complexity. Software engineers and SREs need to deal with a wider spectrum of information from the virtualized layers. Therefore, providing correlated information backed by true-positive evidence is critical for identifying the root cause of issues quickly and reducing mean time to recover (MTTR), a performance metric for SREs. Similarity-, knowledge-, or statistics-driven approaches have been effective, but with increasing data volume and variety, any individual approach is limited in its ability to correlate semantic relations across different data sources. In this paper, we introduce FIXME to enhance software reliability with hybrid diagnosis approaches for enterprises. Our evaluation results show that the hybrid diagnosis approach is about 17% better in precision. The results are helpful for both practitioners and researchers developing hybrid diagnosis in the highly dynamic cloud environment.
Why Machine Learning For Machine Learning's Sake Is A Bad Idea?
Today, businesses are increasingly reliant on artificial intelligence and machine learning to solve critical problems. However, dealing with immense data complexity while under pressure to deliver rapid results can be crippling. Most companies find building an ML-savvy framework quite overwhelming. In an engaging session at MLDS 2021, Sayanti Bhattacharya, Senior Manager, and Ashwin Pai, Manager at Ugam, a Merkle Company, addressed how businesses can apply machine learning to drive results. Machine learning has become such a fashion statement that, more often than not, businesses jump the gun and implement ML in a hurry, defying logic.
A Scalable Two Stage Approach to Computing Optimal Decision Sets
Ignatiev, Alexey, Lam, Edward, Stuckey, Peter J., Marques-Silva, Joao
Machine learning (ML) is ubiquitous in modern life. Since it is being deployed in technologies that affect our privacy and safety, it is often crucial to understand the reasoning behind its decisions, warranting the need for explainable AI. Rule-based models, such as decision trees, decision lists, and decision sets, are conventionally deemed to be the most interpretable. Recent work uses propositional satisfiability (SAT) solving (and its optimization variants) to generate minimum-size decision sets. Motivated by limited practical scalability of these earlier methods, this paper proposes a novel approach to learn minimum-size decision sets by enumerating individual rules of the target decision set independently of each other, and then solving a set cover problem to select a subset of rules. The approach makes use of modern maximum satisfiability and integer linear programming technologies. Experiments on a wide range of publicly available datasets demonstrate the advantage of the new approach over the state of the art in SAT-based decision set learning.
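The two-stage idea in the abstract above (enumerate candidate rules, then pick a covering subset) can be illustrated with a toy sketch. This is not the authors' method: the paper relies on maximum satisfiability and integer linear programming, whereas the sketch below swaps in brute-force enumeration that is only feasible for tiny inputs. The rule names, the coverage sets, and the choice to minimize the number of rules are all hypothetical simplifications.

```python
from itertools import combinations


def min_rule_cover(rule_coverage, examples):
    """Return a smallest list of rule names that together cover all examples.

    rule_coverage maps a (hypothetical) rule name to the set of example
    indices that rule classifies correctly. Subsets are tried in order of
    increasing size, so the first hit is a minimum-size cover.
    Returns None if even all rules together leave an example uncovered.
    """
    names = sorted(rule_coverage)
    target = set(examples)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            covered = set().union(*(rule_coverage[r] for r in subset))
            if target <= covered:
                return list(subset)
    return None


# Stage 1 (assumed already done): candidate rules and what they cover.
rule_coverage = {
    "r1": {0, 1},
    "r2": {1, 2},
    "r3": {2, 3},
    "r4": {0, 1, 2},
    "r5": {3},
}

# Stage 2: select the fewest rules that cover examples 0..3.
chosen = min_rule_cover(rule_coverage, range(4))
print(chosen)
```

Brute force grows exponentially with the number of candidate rules, which is why an exact solver such as an ILP or MaxSAT engine is the practical choice at scale.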
Mining Feature Relationships in Data
When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover relationships between features (attributes) of the data. However, association rules are primarily designed for use on binary or categorical data, due to their use of rule-based machine learning. A large proportion of real-world data is continuous in nature, and discretisation of such data leads to inaccurate and less informative association rules. In this paper, we propose an alternative approach called feature relationship mining (FRM), which uses a genetic programming approach to automatically discover symbolic relationships between continuous or categorical features in data. To the best of our knowledge, our proposed approach is the first such symbolic approach with the goal of explicitly discovering relationships between features. Empirical testing on a variety of real-world datasets shows the proposed method is able to find high-quality, simple feature relationships which can be easily interpreted and which provide clear and non-trivial insight into data.
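For context on the association rules mentioned in the abstract above, the following minimal sketch computes the two standard measures, support and confidence, on categorical transaction data. It illustrates classic association rule mining only, not the FRM method proposed in the paper; the function names and the toy transactions are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / s_antecedent


# Toy categorical data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})
c = confidence(transactions, {"bread"}, {"milk"})
print(f"support={s:.2f} confidence={c:.2f}")
```

Because both measures count exact set membership, a continuous feature has to be binned into categories before it can appear in a rule, which is the limitation the abstract points to.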
Diagnosis of Acute Poisoning Using Explainable Artificial Intelligence
Chary, Michael, Boyer, Ed W, Burns, Michele M
Medical toxicology is the clinical specialty that treats the toxic effects of substances, be it an overdose, a medication error, or a scorpion sting. The volume of toxicological knowledge and research has, as with other medical specialties, outstripped the ability of the individual clinician to entirely master and stay current with it. The application of machine learning techniques to medical toxicology is challenging because initial treatment decisions are often based on a few pieces of textual data and rely heavily on prior knowledge. ML techniques often do not represent knowledge in a way that is transparent for the physician, raising barriers to usability. Rule-based systems and decision tree learning are more transparent approaches, but often generalize poorly and require expert curation to implement and maintain. Here, we construct a probabilistic logic network to represent a portion of the knowledge base of a medical toxicologist. Our approach transparently mimics the knowledge representation and clinical decision-making of practicing clinicians. The software, dubbed Tak, performs comparably to humans on straightforward cases and intermediate difficulty cases, but is outperformed by humans on challenging clinical cases. Tak outperforms a decision tree classifier at all levels of difficulty. Probabilistic logic provides one form of explainable artificial intelligence that may be more acceptable for use in healthcare, if it can achieve acceptable levels of performance.
Taxonomic survey of Hindi Language NLP systems
Desai, Nikita P., Dabhi, Vipul K.
The field of natural language processing (NLP) can be formally defined as "A theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications" [69]. The naturally occurring text can be in written or spoken form. A wide array of domains contribute to NLP development, such as linguistics, computer science, and psychology. Linguistics helps to understand the formal structure of language, while computer science helps to find efficient internal representations and data structures. Psychology can be useful for understanding the methods humans use to deal with language. NLP can be considered to have two distinct focuses, namely (1) Natural Language Generation (NLG) and (2) Natural Language Understanding (NLU). NLG deals with planning how to use the representation of language to decide what should be generated at each point in an interaction, while NLU needs to analyze language and decide the best way to represent it meaningfully. In this survey paper, we concentrate on NLU for written text. Hence, NLP may henceforth be read as NLU and vice versa. Motivation for designing Indian NLP systems: Hindi and English are the official languages of the central Government of India (GOI). The Indian community faces a "Digital Divide" due to the dominance of English as the mode of communication in higher education, the judiciary, the corporate sector, and public administration at the central level, whereas state governments work in their respective regional languages [67]. The expansion of the Internet has interconnected the socioeconomic environment of the world and redefined the concept of global culture. As per a report in 2017 by the companies KPMG and Google
Enhancing Sequence-to-Sequence Neural Lemmatization with External Resources
Milintsevich, Kirill, Sirts, Kairit
We propose a novel hybrid approach to lemmatization that enhances the seq2seq neural model with additional lemmas extracted from an external lexicon or a rule-based system. During training, the enhanced lemmatizer learns both to generate lemmas via a sequential decoder and to copy lemma characters from the external candidates supplied at run time. Our lemmatizer, enhanced with candidates extracted from the Apertium morphological analyzer, achieves statistically significant improvements over baseline models that do not use additional lemma information. It reaches an average accuracy of 97.25% on a set of 23 UD languages, which is 0.55% higher than that obtained with the Stanford Stanza model on the same set of languages. We also compare with other methods of integrating external data into lemmatization and show that our enhanced system performs considerably better than a simple lexicon extension method based on the Stanza system, and that it achieves complementary improvements w.r.t. the data augmentation method.
Using Finite-State Machines to Automatically Scan Classical Greek Hexameter
Schumann, Anne-Kathrin, Beierle, Christoph, Blößner, Norbert
Greek literature has, for centuries, served as a paradigm and model for literary writing all over Europe. The oldest surviving texts of Classical Greek literature - texts such as the Iliad, the Odyssey, and the works of Hesiod - are epic poems that all share the same metre: hexameter. They are written in an artificial language that has never been spoken in everyday life and owes its origin and many of its peculiarities to the nature of metrically bound language (Meister (1921)). Comprehensive hexameter annotation is, therefore, crucial for large-scale and data-driven investigations into some of the linguistic features of Ancient Greek epic language. Furthermore, it may provide additional criteria for the evaluation of Homer's repeated verses, the so-called iterata. Within Classical Philology, controversy around the nature of the Homeric repetitions started in 1840, and it remained one of the central research questions in the field for a long period of time (see Strasser (1984), pp.
A Survey on the Explainability of Supervised Machine Learning
Burkart, Nadia (Fraunhofer IOSB) | Huber, Marco F. (Fraunhofer IPA, University of Stuttgart)
Predictions obtained by, e.g., artificial neural networks have high accuracy, but humans often perceive the models as black boxes. Insights into the decision-making are mostly opaque to humans. Understanding the decision-making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision-making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.