Expert Systems
ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing
Shen, Hua, Huang, Chieh-Yang, Wu, Tongshuang, Huang, Ting-Hao 'Kenneth'
Despite a surge of XAI methods, users still struggle to obtain the AI explanations they need. Previous research suggests chatbots as dynamic solutions, but the effective design of conversational XAI agents for practical human needs remains under-explored. This paper focuses on Conversational XAI for AI-assisted scientific writing tasks. Drawing from human linguistic theories and formative studies, we identify four design rationales: "multifaceted", "controllability", "mix-initiative", and "context-aware drill-down". We incorporate them into an interactive prototype, ConvXAI, which facilitates heterogeneous AI explanations for scientific writing through dialogue. In two studies with 21 users, ConvXAI outperforms a GUI-based baseline in improving human-perceived understanding and writing improvement. The paper further discusses practical human usage patterns in interacting with ConvXAI for scientific co-writing.
#IJCAI2023 distinguished paper: Interview with Maurice Funk – knowledge bases and querying
Maurice Funk, and co-authors Balder ten Cate, Jean Christoph Jung and Carsten Lutz, won a distinguished paper award at the 32nd International Joint Conference on Artificial Intelligence (IJCAI) for their work SAT-Based PAC Learning of Description Logic Concepts. In this interview, Maurice tells us more about knowledge bases and querying, why this is an interesting area for study, and their methodology and results. Our research is in the area of knowledge representation, or more specifically knowledge bases and querying. A knowledge base contains facts like a traditional database e.g. "Bob is a fish" and "Amelia is a dog", but also background knowledge formulated in some formal language e.g.
Systematic Comparison of Software Agents and Digital Twins: Differences, Similarities, and Synergies in Industrial Production
Reinpold, Lasse Matthias, Wagner, Lukas Peter, Gehlhoff, Felix, Ramonat, Malte, Kilthau, Maximilian, Gill, Milapji Singh, Reif, Jonathan Tobias, Henkel, Vincent, Scholz, Lena, Fay, Alexander
To achieve a highly agile and flexible production, it is envisioned that industrial production systems gradually become more decentralized, interconnected, and intelligent. Within this vision, production assets collaborate with each other, exhibiting a high degree of autonomy. Furthermore, knowledge about individual production assets is readily available throughout their entire life-cycles. To realize this vision, adequate use of information technology is required. Two commonly applied software paradigms in this context are Software Agents (referred to as Agents) and Digital Twins (DTs). This work presents a systematic comparison of Agents and DTs in industrial applications. The goal of the study is to determine the differences, similarities, and potential synergies between the two paradigms. The comparison is based on the purposes for which Agents and DTs are applied, the properties and capabilities exhibited by these software paradigms, and how they can be allocated within the Reference Architecture Model Industry 4.0. The comparison reveals that Agents are commonly employed in the collaborative planning and execution of production processes, while DTs typically play a more passive role in monitoring production resources and processing information. Although these observations imply characteristic sets of capabilities and properties for both Agents and DTs, a clear and definitive distinction between the two paradigms cannot be made. Instead, the analysis indicates that production assets utilizing a combination of Agents and DTs would demonstrate high degrees of intelligence, autonomy, sociability, and fidelity. To achieve this, further standardization is required, particularly in the field of DTs.
Open Knowledge Base Canonicalization with Multi-task Unlearning
Liu, Bingchen, Hou, Shihao, Zeng, Weixin, Zhao, Xiang, Liu, Shijun, Pan, Li
The construction of large open knowledge bases (OKBs) is integral to many applications in the field of mobile computing. Noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for investigation of OKB canonicalization. However, to meet the requirements of privacy protection regulations and to keep the data timely, sensitive or outdated information often needs to be removed from the canonicalized OKB. Machine unlearning in OKB canonicalization is an excellent solution to this problem. Current solutions address OKB canonicalization by devising advanced clustering algorithms and using knowledge graph embedding (KGE) to further facilitate the canonicalization process. Effective schemes are urgently needed to fully synergise machine unlearning with clustering and KGE learning. To this end, we put forward a multi-task unlearning framework, namely MulCanon, to tackle the machine unlearning problem in OKB canonicalization. Specifically, the noise characteristics of the diffusion model are utilized to achieve the effect of machine unlearning for data in the OKB. MulCanon unifies the learning objectives of the diffusion model, KGE, and clustering algorithms, and adopts a two-step multi-task learning paradigm for training. A thorough experimental study on popular OKB canonicalization datasets validates that MulCanon achieves advanced machine unlearning effects.
Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero
Schut, Lisa, Tomasev, Nenad, McGrath, Tom, Hassabis, Demis, Paquet, Ulrich, Kim, Been
Artificial Intelligence (AI) systems have made remarkable progress, attaining super-human performance across various domains. This presents us with an opportunity to further human knowledge and improve human expert performance by leveraging the hidden knowledge encoded within these highly performant AI systems. Yet, this knowledge is often hard to extract, and may be hard to understand or learn from. Here, we show that this is possible by proposing a new method that allows us to extract new chess concepts in AlphaZero, an AI system that mastered the game of chess via self-play without human supervision. Our analysis indicates that AlphaZero may encode knowledge that extends beyond the existing human knowledge, but knowledge that is ultimately not beyond human grasp, and can be successfully learned from. In a human study, we show that these concepts are learnable by top human experts, as four top chess grandmasters show improvements in solving the presented concept prototype positions. This marks an important first milestone in advancing the frontier of human knowledge by leveraging AI; a development that could bear profound implications and help us shape how we interact with AI systems across many AI applications.
Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers
Tang, Chen, Wang, Shun, Goldsack, Tomas, Lin, Chenghua
Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature. As a result, existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts, given the absence of domain-specific background knowledge. This paper aims to enhance the performance of language models in biomedical abstractive summarisation by aggregating knowledge from external papers cited within the source article. We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers. Furthermore, we construct and release a large-scale biomedical summarisation dataset that serves as a foundation for our research. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
Model of models -- Part 1
This paper proposes a new cognitive model, acting as the main component of an AGI agent. The model is introduced in its mature intelligence state, and as an extension of previous models, DENN, and especially AKREM, by including operational models (frames/classes) and will. This model's core assumption is that cognition is about operating on accumulated knowledge, with the guidance of an appropriate will. Also, we assume that the actions, which are part of knowledge, are learned to be aligned with will during the evolution phase that precedes the mature intelligence state. In addition, this model is mainly based on the duality principle in every known intelligent aspect, such as exhibiting both top-down and bottom-up model learning, generalization versus specialization, and more. Furthermore, a holistic approach is advocated for AGI design, and cognition under constraints or efficiency is proposed, in the form of reusability and simplicity. Finally, reaching this mature state is described via a cognitive evolution from infancy to adulthood, utilizing a consolidation principle. The final product of this cognitive model is a dynamic operational memory of models and instances. Lastly, some examples and preliminary ideas for the evolution phase to reach the mature state are presented.
Learning Informative Health Indicators Through Unsupervised Contrastive Learning
Rombach, Katharina, Michau, Gabriel, Bürzle, Wilfried, Koller, Stefan, Fink, Olga
Condition monitoring is essential to operate industrial assets safely and efficiently. To achieve this goal, the development of robust health indicators has recently attracted significant attention. These indicators, which provide quantitative real-time insights into the health status of industrial assets over time, serve as valuable tools for fault detection and prognostics. In this study, we propose a novel and universal approach to learn health indicators based on unsupervised contrastive learning. Operational time acts as a proxy for the asset's degradation state, enabling the learning of a contrastive feature space that facilitates the construction of a health indicator by measuring the distance to the healthy condition. To highlight the universality of the proposed approach, we assess the proposed contrastive learning framework in two distinct tasks - wear assessment and fault detection - across two different case studies: a milling machine case study and a real condition monitoring case study of railway wheels from operating trains. First, we evaluate whether the health indicator is able to learn the real health condition in a milling machine case study where the ground truth wear condition is continuously measured. Second, we apply the proposed method to a real case study of railway wheels where the ground truth health condition is not known. Here, we evaluate the suitability of the learned health indicator for fault detection of railway wheel defects. Our results demonstrate that the proposed approach is able to learn the ground truth health evolution of milling machines, and that the learned health indicator is suited for fault detection of railway wheels operated under various operating conditions, outperforming state-of-the-art methods. Further, we demonstrate that our proposed approach is universally applicable to different systems and different health conditions.
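The idea of a health indicator as a distance to the healthy condition can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the contrastive encoder has already produced feature vectors, and the function names, the use of a centroid as the healthy reference, and the Euclidean metric are all hypothetical choices made for illustration.

```python
import math

def centroid(vectors):
    """Mean of a list of equal-length feature vectors (assumed healthy reference)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def health_indicator(feature, healthy_centroid):
    """Euclidean distance from one feature vector to the healthy reference.

    Larger values mean the sample lies further from the healthy condition.
    """
    return math.dist(feature, healthy_centroid)

# Hypothetical learned features from early operation, assumed healthy.
healthy = [[0.0, 0.1], [0.1, 0.0], [-0.1, -0.1]]
center = centroid(healthy)

# Hypothetical features from progressively later operating times.
later = [[0.2, 0.1], [1.0, 0.9], [2.0, 2.1]]
scores = [health_indicator(f, center) for f in later]
print(scores)
```

In this toy data the indicator grows as the features drift away from the healthy cluster, which is the behaviour one would hope for if operational time is a reasonable proxy for degradation.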
InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations
Feldhus, Nils, Wang, Qianli, Anikina, Tatiana, Chopra, Sahil, Oguz, Cennet, Möller, Sebastian
While recently developed NLP explainability methods let us open the black box in various ways (Madsen et al., 2022), a missing ingredient in this endeavor is an interactive tool offering a conversational interface. Such a dialogue system can help users explore datasets and models with explanations in a contextualized manner, e.g. via clarification or follow-up questions, and through a natural language interface. We adapt the conversational explanation framework TalkToModel (Slack et al., 2022) to the NLP domain, add new NLP-specific operations such as free-text rationalization, and illustrate its generalizability on three NLP tasks (dialogue act classification, question answering, hate speech detection). To recognize user queries for explanations, we evaluate fine-tuned and few-shot prompting models and implement a novel Adapter-based approach. We then conduct two user studies on (1) the perceived correctness and helpfulness of the dialogues, and (2) the simulatability, i.e. how objectively helpful dialogical explanations are for humans in figuring out the model's predicted label when it's not shown. We found rationalization and feature attribution were helpful in explaining the model behavior. Moreover, users could more reliably predict the model outcome based on an explanation dialogue rather than one-off explanations.
Measuring vagueness and subjectivity in texts: from symbolic to neural VAGO
Icard, Benjamin, Claveau, Vincent, Atemezing, Ghislain, Égré, Paul
We present a hybrid approach to the automated measurement of vagueness and subjectivity in texts. We first introduce the expert system VAGO, illustrate it on a small benchmark of fact vs. opinion sentences, and then test it on the larger French press corpus FreSaDa to confirm the higher prevalence of subjective markers in satirical vs. regular texts. We then build a neural clone of VAGO, based on a BERT-like architecture, trained on the symbolic VAGO scores obtained on FreSaDa. Using explainability tools (LIME), we show the value of this neural version for enriching the lexicons of the symbolic version and for producing versions in other languages.