 Expert Systems


Interactively Generating Explanations for Transformer Language Models

arXiv.org Artificial Intelligence

Transformer language models are state-of-the-art in a multitude of NLP tasks. Despite these successes, their opaqueness remains problematic. Recent methods aiming to provide interpretability and explainability to black-box models primarily focus on post-hoc explanations of (sometimes spurious) input-output correlations. Instead, we emphasize using prototype networks directly incorporated into the model architecture and hence explain the reasoning process behind the network's decisions. Moreover, while our architecture performs on par with several language models, it enables one to learn from user interactions. This not only offers a better understanding of language models but uses human capabilities to incorporate knowledge outside of the rigid range of purely data-driven approaches.


Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations

arXiv.org Artificial Intelligence

Learning good representations on multi-relational graphs is essential to knowledge base completion (KBC). In this paper, we propose a new self-supervised training objective for multi-relational graph representation learning, via simply incorporating relation prediction into the commonly used 1vsAll objective. The new training objective contains not only terms for predicting the subject and object of a given triple, but also a term for predicting the relation type. We analyse how this new objective impacts multi-relational learning in KBC: experiments on a variety of datasets and models show that relation prediction can significantly improve entity ranking, the most widely used evaluation task for KBC, yielding a 6.1% increase in MRR and 9.9% increase in Hits@1 on FB15k-237 as well as a 3.1% increase in MRR and 3.4% in Hits@1 on Aristo-v4. Moreover, we observe that the proposed objective is especially effective on highly multi-relational datasets, i.e. datasets with a large number of predicates, and generates better representations when larger embedding sizes are used.
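The objective described above can be illustrated with a minimal sketch. It assumes a DistMult-style scoring function, toy embeddings stored in plain lists, and a weighting factor `lam` on the relation term. All of these names and choices are illustrative assumptions, not details taken from the abstract.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def distmult(e_s, w_r, e_o):
    """Trilinear DistMult score (an illustrative choice of scoring function)."""
    return sum(a * b * c for a, b, c in zip(e_s, w_r, e_o))

def loss_with_relation_prediction(triple, entities, relations, lam=1.0):
    """1vsAll loss for one (subject, relation, object) triple of indices,
    plus a relation-prediction term weighted by lam (hypothetical name)."""
    s, r, o = triple
    # 1vsAll terms: rank the true object and subject against every entity.
    obj_scores = [distmult(entities[s], relations[r], e) for e in entities]
    subj_scores = [distmult(e, relations[r], entities[o]) for e in entities]
    # Additional term: rank the true relation against every relation type.
    rel_scores = [distmult(entities[s], w, entities[o]) for w in relations]
    loss_entities = -log_softmax(obj_scores)[o] - log_softmax(subj_scores)[s]
    loss_relation = -log_softmax(rel_scores)[r]
    return loss_entities + lam * loss_relation

# Toy usage: three entities, two relation types, two-dimensional embeddings.
entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relations = [[1.0, 1.0], [0.5, -0.5]]
base = loss_with_relation_prediction((0, 0, 1), entities, relations, lam=0.0)
full = loss_with_relation_prediction((0, 0, 1), entities, relations, lam=1.0)
```

Setting `lam=0.0` recovers the plain 1vsAll loss, so the difference between `full` and `base` is exactly the contribution of the relation-prediction term.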


AI in Software Automation Process - Time Bulletin

#artificialintelligence

Artificial Intelligence (AI) refers to machines imitating the cognitive functions that humans associate with the human mind, such as learning, problem solving, reasoning, knowledge representation, social intelligence, and general intelligence. It is an emerging field whose key applications include machine learning, expert systems, natural language processing, speech recognition, machine vision, and neural semantic systems. Approaches include statistical methods, computational intelligence, soft computing, and traditional symbolic AI. The word that best describes the use of artificial intelligence is automation, or digitalization. Automating a process involves employing AI platforms that support digitalization and deliver results as good as or better than those a human would have achieved.


Tying quantum computing to AI prompts a smarter power grid

#artificialintelligence

Fumbling to find flashlights during blackouts may soon be a distant memory, as quantum computing and artificial intelligence could learn to decipher an electric grid's problematic quirks and solve system hiccups so fast, humans may not notice. Rather than energy grid faults turning into giant problems--such as voltage variations or widespread blackouts--blazing fast computation blended with artificial intelligence could rapidly diagnose trouble and find solutions in tiny splits of seconds, according to Cornell research forthcoming in Applied Energy (Dec. 1, 2021). "Energy power system failures are an old problem and we are still using classic computational methods to resolve them," said Fengqi You, the Roxanne E. and Michael J. Zak Professor in Energy Systems Engineering in the College of Engineering. "Today's power systems can benefit from AI and the computational power of quantum computing, so power systems can be stable and reliable." You, along with doctoral student Akshay Ajagekar, are co-authors of "Quantum Computing-based Hybrid Deep Learning for Fault Diagnosis in Electrical Power Systems."


A multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method for bearing fault diagnosis under the situation of insufficient labeled samples

arXiv.org Artificial Intelligence

Intelligent data-driven fault diagnosis methods have been widely applied, but most of these methods need a large number of high-quality labeled samples. It costs a lot of labor and time to label data in actual industrial processes, which challenges the application of intelligent fault diagnosis methods. To solve this problem, a multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method is proposed for the bearing fault diagnosis under the insufficient labeled samples situation. This method includes three stages: pre-training, deep clustering and enhanced supervised learning. In the first stage, a skip-connection based convolutional auto-encoder (SCCAE) is proposed and pre-trained to automatically learn low-dimensional representations. In the second stage, a semi-supervised improved deep embedded clustering (SSIDEC) model that integrates the pre-trained auto-encoder with a clustering layer is proposed for deep clustering. Additionally, virtual adversarial training (VAT) is introduced as a regularization term to overcome the overfitting in the model's training. In the third stage, high-quality clustering results obtained in the second stage are assigned to unlabeled samples as pseudo labels. The labeled dataset is augmented by those pseudo-labeled samples and used to train a bearing fault discriminative model. The effectiveness of the method is evaluated on the Case Western Reserve University (CWRU) bearing dataset. The results show that the method can not only satisfy the semi-supervised learning under a small number of labeled samples, but also solve the problem of unsupervised learning, and has achieved better results than traditional diagnosis methods. This method provides a new research idea for fault diagnosis with limited labeled samples by effectively using unsupervised data.
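The third stage described above (turning confident clustering results into pseudo labels and augmenting the labeled set) can be sketched as follows. The abstract does not say how "high-quality" clustering results are selected, so the confidence threshold, the function names, and the assumption that cluster ids already align with fault classes are all hypothetical.

```python
def select_pseudo_labels(soft_assignments, threshold=0.9):
    """Return (sample_index, cluster_id) pairs whose highest cluster
    probability reaches the threshold (an assumed selection rule)."""
    pseudo = []
    for idx, probs in enumerate(soft_assignments):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            pseudo.append((idx, best))
    return pseudo

def augment_labeled_set(labeled, unlabeled, soft_assignments, threshold=0.9):
    """Append confidently clustered unlabeled samples to the labeled set.
    Assumes cluster ids have already been matched to class labels."""
    augmented = list(labeled)
    for idx, cluster in select_pseudo_labels(soft_assignments, threshold):
        augmented.append((unlabeled[idx], cluster))
    return augmented

# Toy usage: three unlabeled samples with soft cluster assignments.
labeled = [("signal_a", 0)]
unlabeled = ["signal_b", "signal_c", "signal_d"]
soft = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
augmented = augment_labeled_set(labeled, unlabeled, soft)
```

In this toy run the ambiguous sample `signal_c` stays unlabeled, while the other two join the training set for the final supervised model.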


Data-driven Residual Generation for Early Fault Detection with Limited Data

arXiv.org Artificial Intelligence

Traditionally, the fault detection and isolation community has used system dynamic equations to generate diagnosers and to analyze the detectability and isolability of dynamic systems. Model-based fault detection and isolation methods use a system model to generate a set of residuals as the basis for fault detection and isolation. However, in many complex systems it is not feasible to develop highly accurate models and to keep them updated during the system lifetime. Recently, data-driven solutions have received immense attention in industrial systems for several practical reasons. First, these methods do not require the initial investment and expertise needed to develop accurate models. Moreover, it is possible to automatically update and retrain the diagnosers as the system or the environment changes over time. Finally, unlike with model-based methods, it is straightforward to combine time series measurements such as pressure and voltage with other sources of information such as system operating hours to achieve higher accuracy. In this paper, we extend traditional model-based fault detection and isolation concepts, such as residuals and detectable and isolable faults, to the data-driven domain. We then propose an algorithm to automatically generate residuals from normal operating data. We present the performance of our proposed approach through a comparative case study.
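To make the idea of a data-driven residual concrete, here is a minimal sketch. It is not the algorithm proposed in the paper: it assumes a single input/output pair of measurements, fits a least-squares line to normal operating data, and flags a fault when the residual exceeds `k` standard deviations of the residuals seen during normal operation. All function names and the thresholding rule are illustrative assumptions.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def build_residual(normal_xs, normal_ys, k=3.0):
    """Learn a residual function and a detection threshold from normal data."""
    slope, intercept = fit_line(normal_xs, normal_ys)

    def residual(x, y):
        # Difference between the measurement and the learned normal behavior.
        return y - (slope * x + intercept)

    normal_residuals = [residual(x, y) for x, y in zip(normal_xs, normal_ys)]
    threshold = k * statistics.pstdev(normal_residuals)
    return residual, threshold

def is_faulty(residual, threshold, x, y):
    """Flag a sample whose residual magnitude exceeds the threshold."""
    return abs(residual(x, y)) > threshold

# Toy usage: normal data roughly follows y = 2x with small noise.
normal_xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
normal_ys = [0.1, 1.9, 4.1, 5.9, 8.1, 9.9]
residual, threshold = build_residual(normal_xs, normal_ys)
```

A reading close to the learned relationship, such as `(2.0, 4.0)`, stays below the threshold, whereas a reading like `(2.0, 6.0)` produces a large residual and is flagged.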


SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases

arXiv.org Artificial Intelligence

Knowledge Base Question Answering (KBQA) tasks that involve complex reasoning are emerging as an important research direction. However, most KBQA systems struggle with generalizability, particularly on two dimensions: (a) across multiple reasoning types, where both datasets and systems have primarily focused on multi-hop reasoning, and (b) across multiple knowledge bases, where KBQA approaches are specifically tuned to a single knowledge base. In this paper, we present SYGMA, a modular approach facilitating generalizability across multiple knowledge bases and multiple reasoning types. Specifically, SYGMA contains three high-level modules: 1) a KB-agnostic question understanding module that is common across KBs, 2) rules to support additional reasoning types, and 3) a KB-specific question mapping and answering module to address the KB-specific aspects of the answer extraction. We demonstrate the effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases, DBpedia and Wikidata. In addition, to demonstrate extensibility to additional reasoning types, we evaluate on multi-hop reasoning datasets and a new Temporal KBQA benchmark dataset on Wikidata, named TempQA-WD, introduced in this paper. We show that our generalizable approach has better or competitive performance on multiple datasets on DBpedia and Wikidata that require both multi-hop and temporal reasoning.


Every time I fire a conversational designer, the performance of the dialog system goes down

arXiv.org Artificial Intelligence

Incorporating explicit domain knowledge into neural-based task-oriented dialogue systems is an effective way to reduce the need for large sets of annotated dialogues. In this paper, we investigate how the use of explicit domain knowledge of conversational designers affects the performance of neural-based dialogue systems. To support this investigation, we propose the Conversational-Logic-Injection-in-Neural-Network system (CLINN), where explicit knowledge is coded in semi-logical rules. Using CLINN, we evaluated semi-logical rules produced by a team of differently skilled conversational designers. We experimented with the Restaurant topic of the MultiWOZ dataset. Results show that external knowledge is extremely important for reducing the need for annotated examples for conversational systems. In fact, rules from conversational designers used in CLINN significantly outperform a state-of-the-art neural-based dialogue system.


RuleBert: Teaching Soft Rules to Pre-trained Language Models

arXiv.org Artificial Intelligence

While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at training. Moreover, we demonstrate that logical notions expressed by the rules are transferred to the fine-tuned model, yielding state-of-the-art results on external datasets.