Expert Systems
Extracting knowledge from knowledge graphs using Facebook PyTorch BigGraph.
Machine learning lets us train a model that converts data rows into labels such that similar data rows are mapped to similar or identical labels. For example, suppose we are building a SPAM filter for email messages. We have many email messages, some marked as SPAM and some as INBOX. We can build a model that learns to identify SPAM messages. The messages it marks as SPAM will be similar in some way to those already marked as SPAM. The concept of similarity is vitally important for machine learning. In the real world, similarity is specific to the subject matter and depends on our knowledge of it.
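The role of similarity in the SPAM example can be illustrated with a minimal sketch. It assumes a bag-of-words representation, cosine similarity, and a nearest-neighbor decision; none of these choices comes from the article, and the toy messages are invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count word occurrences."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(message, labeled_messages):
    """Return the label of the most similar already-labeled message."""
    query = bag_of_words(message)
    _, best_label = max(
        labeled_messages,
        key=lambda pair: cosine_similarity(query, bag_of_words(pair[0])),
    )
    return best_label


# Hypothetical toy data, invented for illustration only.
TRAINING = [
    ("win a free prize now", "SPAM"),
    ("claim your free money today", "SPAM"),
    ("meeting agenda for monday", "INBOX"),
    ("lunch with the project team", "INBOX"),
]

print(classify("free prize inside", TRAINING))
print(classify("agenda for the team meeting", TRAINING))
```

Word overlap is only one possible notion of similarity. As the paragraph notes, a useful similarity measure usually depends on knowledge of the subject matter.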
Hybrid Predictive Model: When an Interpretable Model Collaborates with a Black-box Model
Interpretable machine learning has become a strong competitor for traditional black-box models. However, some loss of predictive performance in exchange for interpretability is often inevitable, putting practitioners in a dilemma of choosing between high accuracy (black-box models) and interpretability (interpretable models). In this work, we propose a novel framework for building a Hybrid Predictive Model (HPM) that integrates an interpretable model with any black-box model to combine their strengths. The interpretable model substitutes the black-box model on a subset of data where the black-box is overkill or nearly overkill, gaining transparency at little or no cost in predictive accuracy. We design a principled objective function that considers predictive accuracy, model interpretability, and model transparency (defined as the percentage of data processed by the interpretable substitute). Under this framework, we propose two hybrid models, one substituting with association rules and the other with linear models, and we design customized training algorithms for both models. We test the hybrid models on structured data and text data where interpretable models collaborate with various state-of-the-art black-box models. Results show that hybrid models obtain an efficient trade-off between transparency and predictive performance, characterized by our proposed efficient frontiers.
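The routing idea in this abstract, and its definition of transparency, can be sketched as follows. This is only an illustration under stated assumptions: the rule, the stand-in black-box function, and the data are hypothetical, and the paper's objective function and training algorithms are not reproduced here.

```python
def hybrid_predict(x, rules, black_box):
    """Return (prediction, used_interpretable) for one input.

    Each rule is a (condition, label) pair. The first rule whose condition
    holds decides the label; otherwise the black-box model is consulted.
    """
    for condition, label in rules:
        if condition(x):
            return label, True
    return black_box(x), False


def evaluate(data, rules, black_box):
    """Return (accuracy, transparency) over a list of (x, y) pairs.

    Transparency is the fraction of inputs handled by the interpretable rules.
    """
    correct = 0
    covered = 0
    for x, y in data:
        prediction, used_rule = hybrid_predict(x, rules, black_box)
        correct += prediction == y
        covered += used_rule
    n = len(data)
    return correct / n, covered / n


# Hypothetical rule set: one human-readable rule.
RULES = [(lambda x: x["age"] < 25, 0)]


def black_box(x):
    """Stand-in for an opaque model (hypothetical)."""
    return 1 if x["income"] > 50 else 0


# Hypothetical toy data, invented for illustration only.
DATA = [
    ({"age": 20, "income": 30}, 0),
    ({"age": 22, "income": 80}, 0),
    ({"age": 40, "income": 90}, 1),
    ({"age": 50, "income": 20}, 0),
]

accuracy, transparency = evaluate(DATA, RULES, black_box)
print(accuracy, transparency)
```

In this toy setup the rule handles half of the inputs, so transparency is 0.5. Adding rules raises transparency but may lower accuracy wherever a rule disagrees with the true label.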
Assuring the Machine Learning Lifecycle: Desiderata, Methods, and Challenges
Ashmore, Rob, Calinescu, Radu, Paterson, Colin
Machine learning has evolved into an enabling technology for a wide range of highly successful applications. The potential for this success to continue and accelerate has placed machine learning (ML) at the top of research, economic and political agendas. Such unprecedented interest is fuelled by a vision of ML applicability extending to healthcare, transportation, defence and other domains of great societal importance. Achieving this vision requires the use of ML in safety-critical applications that demand levels of assurance beyond those needed for current ML applications. Our paper provides a comprehensive survey of the state-of-the-art in the assurance of ML, i.e. in the generation of evidence that ML is sufficiently safe for its intended use. The survey covers the methods capable of providing such evidence at different stages of the machine learning lifecycle, i.e. of the complex, iterative process that starts with the collection of the data used to train an ML component for a system, and ends with the deployment of that component within the system. The paper begins with a systematic presentation of the ML lifecycle and its stages. We then define assurance desiderata for each stage, review existing methods that contribute to achieving these desiderata, and identify open challenges that require further research.
Survey on Evaluation Methods for Dialogue Systems
Deriu, Jan, Rodrigo, Alvaro, Otegi, Arantxa, Echegoyen, Guillermo, Rosset, Sophie, Agirre, Eneko, Cieliebak, Mark
In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. For this, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods for this class.
AI Enabling Technologies: A Survey
Gadepally, Vijay, Goodwin, Justin, Kepner, Jeremy, Reuther, Albert, Reynolds, Hayley, Samsi, Siddharth, Su, Jonathan, Martinez, David
Artificial Intelligence (AI) has the opportunity to revolutionize the way the United States Department of Defense (DoD) and Intelligence Community (IC) address the challenges of evolving threats, data deluge, and rapid courses of action. Developing an end-to-end artificial intelligence system involves parallel development of different pieces that must work together in order to provide capabilities that can be used by decision makers, warfighters and analysts. These pieces include data collection, data conditioning, algorithms, computing, robust artificial intelligence, and human-machine teaming. While much of the popular press today surrounds advances in algorithms and computing, most modern AI systems leverage advances across numerous different fields. Further, while certain components may not be as visible to end-users as others, our experience has shown that each of these interrelated components plays a major role in the success or failure of an AI system. This article is meant to highlight many of the technologies that are involved in an end-to-end AI system. The goal of this article is to provide readers with an overview of terminology, technical details and recent highlights from academia, industry and government. Where possible, we indicate relevant resources that can be used for further reading and understanding.
Interval Valued Trapezoidal Neutrosophic Set for Prioritization of Non-functional Requirements
This paper discusses the trapezoidal fuzzy number (TrFN); the interval-valued intuitionistic fuzzy number (IVIFN); the neutrosophic set and its operational laws; and the trapezoidal neutrosophic set (TrNS) and its operational laws. Based on the combination of IVIFN and TrNS, an Interval Valued Trapezoidal Neutrosophic Set (IVTrNS) is proposed, followed by its operational laws. The paper also presents the score and accuracy functions for the proposed Interval Valued Trapezoidal Neutrosophic Number (IVTrNN). Then, an interval valued trapezoidal neutrosophic weighted arithmetic averaging (IVTrNWAA) operator is introduced to combine the trapezoidal information, which is neutrosophic and in the unit interval of real numbers. Finally, a method is developed to handle problems in the multi-attribute decision making (MADM) environment using the IVTrNWAA operator, followed by a numerical example of NFR prioritization to illustrate the relevance of the developed method.
A knowledge-based intelligence system for control of dirt recognition process in the smart washing machines
Annabestani, Mohsen, Rowhanimanesh, Alireza, Rezaei, Akram, Avazpour, Ladan, Sheikhhasani, Fatemeh
In this paper, we propose an intelligent approach based on fuzzy logic to model human intelligence in washing clothes. First, an intelligent feedback loop is designed for perception-based sensing of dirt, inspired by human color understanding. Then, for the case where color stains leak out of some colored clothes, human probabilistic decision making is computationally modeled to detect this stain leakage, so that the problem of distinguishing dirt from stain can be considered in the washing process. Finally, we discuss the fuzzy control of washing clothes, and we design and simulate a smart controller based on the fuzzy intelligence feedback loop.
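For readers unfamiliar with fuzzy control, the following minimal sketch shows the general technique on a washing example. It is not the controller from the paper: the triangular membership functions, the three rules, the wash times, and the weighted-average defuzzification are all assumptions chosen for illustration.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical fuzzy sets over a dirt level measured on a 0..100 scale.
DIRT_SETS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}

# Hypothetical rules: each dirt set maps to a wash time in minutes.
WASH_MINUTES = {"low": 20.0, "medium": 40.0, "high": 60.0}


def wash_time(dirt_level):
    """Weighted average of rule outputs, weighted by membership degree."""
    degrees = {
        name: triangular(dirt_level, *shape) for name, shape in DIRT_SETS.items()
    }
    total = sum(degrees.values())
    if total == 0:
        raise ValueError("dirt_level is outside the supported 0..100 range")
    return sum(degrees[name] * WASH_MINUTES[name] for name in degrees) / total


for level in (0, 50, 75, 100):
    print(level, wash_time(level))
```

A dirt level of 75 belongs equally to "medium" and "high", so the output falls halfway between the two rule outputs. In a feedback loop of the kind the abstract describes, the sensed dirt level would be re-measured during washing and the output updated accordingly.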
Text Embeddings for Retrieval From a Large Knowledge Base
Cakaloglu, Tolgahan, Szegedy, Christian, Xu, Xiaowei
Text embeddings that represent natural language documents in a semantic vector space can be used for document retrieval via nearest-neighbor lookup. In order to study the feasibility of neural models specialized for retrieval in a semantically meaningful way, we suggest the use of the Stanford Question Answering Dataset (SQuAD) in an open-domain question answering context, where the first task is to find paragraphs useful for answering a given question. First, we compare various text-embedding methods on retrieval performance and give an extensive empirical comparison of various non-augmented base embeddings with and without IDF weighting. Our main result is that training deep residual neural models specifically for retrieval can yield significant gains when they are used to augment existing embeddings. We also establish that deeper models are superior for this task. The best baseline embedding augmented by our learned neural approach improves the top-1 paragraph recall of the system by 14%.
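The retrieval setup described here (IDF-weighted embeddings, nearest-neighbor lookup, top-1 recall) can be sketched as follows. This is an illustration only: the 2-D word vectors, the smoothed IDF formula, and the toy paragraphs are assumptions, and the paper's residual augmentation models are not reproduced.

```python
import math

# Hypothetical 2-D word vectors, invented for illustration only.
WORD_VECTORS = {
    "paris": (1.0, 0.0), "france": (0.9, 0.1), "capital": (0.7, 0.3),
    "python": (0.0, 1.0), "language": (0.1, 0.9),
    "the": (0.5, 0.5), "is": (0.5, 0.5), "of": (0.5, 0.5), "a": (0.5, 0.5),
}


def idf_weights(documents):
    """Smoothed inverse document frequency: log(N / df) + 1 (an assumed variant)."""
    df = {}
    for doc in documents:
        for word in set(doc.split()):
            df[word] = df.get(word, 0) + 1
    return {word: math.log(len(documents) / count) + 1.0 for word, count in df.items()}


def embed(text, idf):
    """IDF-weighted average of the vectors of all known words in the text."""
    total, weight_sum = [0.0, 0.0], 0.0
    for word in text.split():
        if word in WORD_VECTORS:
            weight = idf.get(word, 1.0)  # words unseen in the corpus get weight 1
            total[0] += weight * WORD_VECTORS[word][0]
            total[1] += weight * WORD_VECTORS[word][1]
            weight_sum += weight
    if weight_sum == 0:
        return (0.0, 0.0)
    return (total[0] / weight_sum, total[1] / weight_sum)


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, paragraphs, idf):
    """Index of the paragraph whose embedding is nearest to the question's."""
    query = embed(question, idf)
    scores = [cosine(query, embed(p, idf)) for p in paragraphs]
    return max(range(len(paragraphs)), key=scores.__getitem__)


def top1_recall(questions, gold, paragraphs, idf):
    """Fraction of questions whose top-ranked paragraph is the gold paragraph."""
    hits = sum(retrieve(q, paragraphs, idf) == g for q, g in zip(questions, gold))
    return hits / len(questions)


PARAGRAPHS = ["paris is the capital of france", "python is a language"]
IDF = idf_weights(PARAGRAPHS)
print(top1_recall(["capital of france", "python language"], [0, 1], PARAGRAPHS, IDF))
```

A brute-force scan over all paragraphs is enough for a sketch. A large knowledge base would typically need an approximate nearest-neighbor index instead.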
KALM: A Rule-based Approach for Knowledge Authoring and Question Answering
Knowledge representation and reasoning (KRR) is one of the key areas in the field of artificial intelligence (AI). It is intended to represent world knowledge in formal languages (e.g., Prolog, SPARQL) and then enable expert systems to perform querying and inference tasks. Currently, constructing large-scale, high-quality knowledge bases (KBs) is hindered by the fact that the construction process requires many qualified knowledge engineers who not only understand the domain-specific knowledge but also have sufficient skills in knowledge representation. Unfortunately, qualified knowledge engineers are in short supply. Therefore, it would be very useful to build a tool that allows the user to construct and query the KB simply via text. Although a number of systems have been developed for knowledge extraction and question answering, they mainly fail in that they do not achieve high enough accuracy, whereas KRR is highly sensitive to erroneous data. In this thesis proposal, I will present Knowledge Authoring Logic Machine (KALM), a rule-based system which allows the user to author knowledge and query the KB in text. The experimental results show that KALM achieved superior accuracy in knowledge authoring and question answering as compared to the state-of-the-art systems.
Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications
Pezeshkpour, Pouya, Tian, Yifan, Singh, Sameer
Representing entities and relations in an embedding space is a well-studied approach for machine learning on relational data. Existing approaches, however, primarily focus on improving accuracy and overlook other aspects such as robustness and interpretability. In this paper, we propose adversarial modifications for link prediction models: identifying the fact to add into or remove from the knowledge graph that changes the prediction for a target fact after the model is retrained. Using these single modifications of the graph, we identify the most influential fact for a predicted link and evaluate the sensitivity of the model to the addition of fake facts. We introduce an efficient approach to estimate the effect of such modifications by approximating the change in the embeddings when the knowledge graph changes. To avoid the combinatorial search over all possible facts, we train a network to decode embeddings to their corresponding graph components, allowing the use of gradient-based optimization to identify the adversarial modification. We use these techniques to evaluate the robustness of link prediction models (by measuring sensitivity to additional facts), study interpretability through the facts most responsible for predictions (by identifying the most influential neighbors), and detect incorrect facts in the knowledge base.