Expert Systems
An Expert System to Diagnose Spinal Disorders
Dashti, Seyed Mohammad Sadegh, Dashti, Seyedeh Fatemeh
Objective: Until now, traditional invasive approaches have been the only means used to diagnose spinal disorders. Traditional manual diagnostics impose a high workload, and diagnostic errors are likely to occur when physicians work for prolonged periods. In this research, we develop an expert system based on a hybrid inference algorithm and comprehensive integrated knowledge to assist experts in the fast and high-quality diagnosis of spinal disorders. Methods: First, for each spinal anomaly, accurate and integrated knowledge was acquired from related experts and resources. Second, based on probability distributions and dependencies between symptoms of each anomaly, a unique numerical value known as the certainty effect value was assigned to each symptom. Third, a new hybrid inference algorithm combining Backward Chaining Inference and the Theory of Uncertainty was designed to obtain excellent performance. Results: The proposed expert system was evaluated in two phases: real-world samples and medical records. In the real-world sample analysis, the system achieved excellent accuracy. Applying the system to samples with anomalies revealed the degree of severity of disorders and the risk of developing abnormalities in unhealthy and healthy patients. In the medical records analysis, our expert system showed promising performance, very close to that of experts. Conclusion: Evaluations suggest that the proposed expert system provides promising performance, helping specialists to validate the accuracy and integrity of their diagnoses. It can also serve as intelligent educational software for medical students to gain familiarity with the spinal disorder diagnosis process and related symptoms.
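The abstract does not spell out the hybrid inference algorithm, so the following is only a minimal sketch of how backward chaining might be paired with per-symptom certainty values. It assumes a MYCIN-style combination rule for non-negative certainty factors, an acyclic rule base, and hypothetical names (`RULES`, `combine`, `certainty`); the disorders, symptoms, and weights are illustrative and not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical knowledge base: each goal maps to (premise, certainty weight) pairs.
# Names and weights are illustrative assumptions, not values from the paper.
RULES: Dict[str, List[Tuple[str, float]]] = {
    "disc_herniation": [("leg_pain", 0.6), ("nerve_compression", 0.7)],
    "nerve_compression": [("numbness", 0.5), ("muscle_weakness", 0.4)],
}


def combine(cf_a: float, cf_b: float) -> float:
    """MYCIN-style combination of two non-negative certainty factors."""
    return cf_a + cf_b * (1.0 - cf_a)


def certainty(goal: str, observed: Dict[str, float]) -> float:
    """Backward-chain from a goal down to observed symptoms.

    Returns a certainty in [0, 1]. Assumes the rule base has no cycles.
    """
    if goal in observed:  # directly observed symptom
        return observed[goal]
    total = 0.0
    for premise, weight in RULES.get(goal, []):
        # Establish the premise recursively, then scale by the rule's weight.
        evidence = certainty(premise, observed) * weight
        total = combine(total, evidence)
    return total


if __name__ == "__main__":
    symptoms = {"leg_pain": 1.0, "numbness": 1.0, "muscle_weakness": 0.0}
    print(round(certainty("disc_herniation", symptoms), 2))
```

Working from the goal backwards means only the symptoms relevant to the hypothesis under test are consulted, which is the usual motivation for backward chaining in diagnostic systems.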
The Monitor Model and its Misconceptions: A Clarification
Horizontal (automatic) and vertical (control) processes have been observed and reported for a long time in translation production. Schaeffer and Carl's Monitor Model integrates these two processes into one framework, assuming that priming mechanisms underlie horizontal/automatic processes, while vertical/monitoring processes implement consciously accessible control mechanisms. The Monitor Model has been criticized in various ways and several misconceptions have accumulated over the past years. In this chapter, I update the Monitor Model with additional evidence and argue that it is compatible with an enactivist approach to cognition. I address several misconceptions related to the Monitor Model.
Knowledge-enhanced Neural Machine Reasoning: A Review
Chowdhury, Tanmoy, Ling, Chen, Zhang, Xuchao, Zhao, Xujiang, Bai, Guangji, Pei, Jian, Chen, Haifeng, Zhao, Liang
Knowledge-enhanced neural machine reasoning has garnered significant attention as a cutting-edge yet challenging research area with numerous practical applications. Over the past few years, many studies have leveraged various forms of external knowledge to augment the reasoning capabilities of deep models, tackling challenges such as effective knowledge integration, implicit knowledge mining, and problems of tractability and optimization. However, a comprehensive technical review of the existing knowledge-enhanced reasoning techniques across the diverse range of application domains is still lacking. This survey provides an in-depth examination of recent advancements in the field, introducing a novel taxonomy that categorizes existing knowledge-enhanced methods into two primary categories and four subcategories. We systematically discuss these methods and highlight their correlations, strengths, and limitations. Finally, we elucidate the current application domains and provide insight into promising prospects for future research.
A survey on knowledge-enhanced multimodal learning
Lymperaiou, Maria, Stamou, Giorgos
Multimodal learning has been a field of increasing interest, aiming to combine various modalities in a single joint representation. Especially in the area of visiolinguistic (VL) learning, multiple models and techniques have been developed, targeting a variety of tasks that involve images and text. VL models have reached unprecedented performance by extending the idea of Transformers, so that both modalities can learn from each other. Massive pre-training procedures enable VL models to acquire a certain level of real-world understanding, although many gaps can be identified: the limited comprehension of commonsense, factual, temporal and other everyday knowledge aspects calls into question the extensibility of VL tasks. Knowledge graphs and other knowledge sources can fill those gaps by explicitly providing missing information, unlocking novel capabilities of VL models. At the same time, knowledge graphs enhance explainability, fairness and validity of decision making, issues of utmost importance for such complex implementations. The current survey aims to unify the fields of VL representation learning and knowledge graphs, and provides a taxonomy and analysis of knowledge-enhanced VL models.
TempEL: Linking Dynamically Evolving and Newly Emerging Entities
Zaporojets, Klim, Kaffee, Lucie-Aimee, Deleu, Johannes, Demeester, Thomas, Develder, Chris, Augenstein, Isabelle
In our continuously evolving world, entities change over time and new, previously non-existing or unknown, entities appear. We study how this evolutionary scenario impacts performance on a well-established entity linking (EL) task. For that study, we introduce TempEL, an entity linking dataset that consists of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect both anchor mentions of entities and these target entities' descriptions. By capturing such temporal aspects, our newly introduced TempEL resource contrasts with currently existing entity linking datasets, which are composed of fixed mentions linked to a single static version of a target Knowledge Base (e.g., Wikipedia 2010 for CoNLL-AIDA). Indeed, for each of our collected temporal snapshots, TempEL contains links to entities that are continual, i.e., occur in all of the years, as well as completely new entities that appear for the first time at some point. Thus, we make it possible to quantify the performance of current state-of-the-art EL models for: (i) entities that are subject to changes over time in their Knowledge Base descriptions as well as their mentions' contexts, and (ii) newly created entities that were previously non-existing (e.g., at the time the EL model was trained). Our experimental results show that in terms of temporal performance degradation, (i) continual entities suffer a decrease of up to 3.1% EL accuracy, while (ii) for new entities this accuracy drop is up to 17.9%. This highlights the challenge of the introduced TempEL dataset and opens new research prospects in the area of time-evolving entity disambiguation.
Real vs. Fake AI - How to Spot the Difference
Artificial Intelligence (AI) has come a long way in recent years and is rapidly changing the way we live and work. While AI has the potential to greatly benefit society, it's important to understand that not all AI is created equal. Alan Turing's 1950 paper "Computing Machinery and Intelligence" and its subsequent Turing Test established the fundamental goal and vision of AI. Real artificial intelligence can be divided into several subsets, including machine learning, deep learning, natural language processing, expert systems, robotics, machine vision and speech recognition. These subsets of AI are continuously evolving and expanding as technology advances, creating new and exciting possibilities in the field of AI.
The Construction of Reality in an AI: A Review
AI constructivism as inspired by Jean Piaget, described and surveyed by Frank Guerin, and representatively implemented by Gary Drescher seeks to create algorithms and knowledge structures that enable agents to acquire, maintain, and apply a deep understanding of the environment through sensorimotor interactions. This paper aims to increase awareness of constructivist AI implementations to encourage greater progress toward enabling lifelong learning by machines. It builds on Guerin's 2008 "Learning Like a Baby: A Survey of AI approaches." After briefly recapitulating that survey, it summarizes subsequent progress by the Guerin referents, numerous works not covered by Guerin (or found in other surveys), and relevant efforts in related areas. The focus is on knowledge representations and learning algorithms that have been used in practice viewed through lenses of Piaget's schemas, adaptation processes, and staged development. The paper concludes with a preview of a simple framework for constructive AI being developed by the author that parses concepts from sensory input and stores them in a semantic memory network linked to episodic data.
TwinExplainer: Explaining Predictions of an Automotive Digital Twin
Neupane, Subash, Fernandez, Ivan A., Patterson, Wilson, Mittal, Sudip, Parmar, Milan, Rahimi, Shahram
Vehicles are complex Cyber Physical Systems (CPS) that operate in a variety of environments, and the failure of one or more subsystems, such as the engine, transmission, brakes, and fuel, can result in unscheduled downtime and incur high maintenance or repair costs. In order to prevent these issues, it is crucial to continuously monitor the health of various subsystems and identify abnormal sensor channel behavior. Data-driven Digital Twin (DT) systems are capable of such a task. Current DT technologies utilize various Deep Learning (DL) techniques that are constrained by the lack of justification or explanation for their predictions. The opacity of these systems can influence decision-making and raises user trust concerns. This paper presents a solution to this issue, where the TwinExplainer system, with its three-layered architectural pipeline, explains the predictions of an automotive DT. Such a system can assist automotive stakeholders in understanding the global scale of the sensor channels and how they contribute towards generic DT predictions. TwinExplainer can also visualize explanations for both normal and abnormal local predictions computed by the DT.
Crawling the Internal Knowledge-Base of Language Models
Cohen, Roi, Geva, Mor, Berant, Jonathan, Globerson, Amir
Language models are trained on large volumes of text, and as a result their parameters might contain a significant body of factual knowledge. Any downstream task performed by these models implicitly builds on these facts, and thus it is highly desirable to have means for representing this body of knowledge in an interpretable way. However, there is currently no mechanism for such a representation. Here, we propose to address this goal by extracting a knowledge-graph of facts from a given language model. We describe a procedure for "crawling" the internal knowledge-base of a language model. Specifically, given a seed entity, we expand a knowledge-graph around it. The crawling procedure is decomposed into sub-tasks, realized through specially designed prompts that control for both precision (i.e., that no wrong facts are generated) and recall (i.e., the number of facts generated). We evaluate our approach on graphs crawled starting from dozens of seed entities, and show it yields high precision graphs (82-92%), while emitting a reasonable number of facts per entity.
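The seed-and-expand procedure described above can be pictured as a breadth-first traversal. The sketch below is an assumption about the overall control flow only: `get_relations` and `get_objects` are hypothetical callables standing in for the prompt-based sub-tasks, and a small hand-written dictionary replaces the language model so the example runs on its own.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def crawl(
    seed: str,
    get_relations: Callable[[str], List[str]],
    get_objects: Callable[[str, str], List[str]],
    max_entities: int = 10,
) -> Set[Triple]:
    """Expand a knowledge graph breadth-first around a seed entity."""
    facts: Set[Triple] = set()
    seen = {seed}
    queue = deque([seed])
    while queue:
        subject = queue.popleft()
        for relation in get_relations(subject):
            for obj in get_objects(subject, relation):
                facts.add((subject, relation, obj))
                # Only enqueue unseen entities while under the entity budget.
                if obj not in seen and len(seen) < max_entities:
                    seen.add(obj)
                    queue.append(obj)
    return facts


# Toy stand-in for a language model; a real crawler would issue prompts here.
TOY_KB: Dict[str, Dict[str, List[str]]] = {
    "Alan Turing": {"born_in": ["London"], "field": ["computer science"]},
    "London": {"capital_of": ["England"]},
}


def toy_relations(entity: str) -> List[str]:
    return list(TOY_KB.get(entity, {}))


def toy_objects(entity: str, relation: str) -> List[str]:
    return TOY_KB.get(entity, {}).get(relation, [])


if __name__ == "__main__":
    for triple in sorted(crawl("Alan Turing", toy_relations, toy_objects)):
        print(triple)
```

The `max_entities` budget is an illustrative way to bound the crawl; the abstract does not say how the authors limit expansion.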
A Cohesive Distillation Architecture for Neural Language Models
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size, which prevents research groups without the necessary hardware infrastructure from participating in the development process. This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models. In this context, KD means extracting information about language encoded in a Neural Network and Lexical Knowledge Databases. We developed two methods to test our hypothesis that efficient architectures can gain knowledge from LMs and extract valuable information from lexical sources. First, we present a technique to learn a confident probability distribution for Masked Language Modeling by prediction weighting of multiple teacher networks. Second, we propose a method for Word Sense Disambiguation (WSD) and lexical KD that is general enough to be adapted to many LMs. Our results show that KD with multiple teachers leads to improved training convergence. When using our lexical pre-training method, LM characteristics are not lost, leading to increased performance in Natural Language Understanding (NLU) tasks over the state-of-the-art while adding no parameters. Moreover, the improved semantic understanding of our model increased the task performance beyond WSD and NLU in a real-problem scenario (Plagiarism Detection). This study suggests that sophisticated training methods and network architectures can be superior to scaling trainable parameters. On this basis, we suggest the research area should encourage the development and use of efficient models and weigh the impacts resulting from growing LM size equally against task performance.
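The abstract does not give the weighting scheme used for the multiple teachers. As one possible reading, the sketch below builds a soft target for a masked position by weighting each teacher's distribution by its own peak probability; this confidence heuristic and the function name `weighted_teacher_target` are assumptions for illustration, not the study's method.

```python
from typing import List


def weighted_teacher_target(teacher_probs: List[List[float]]) -> List[float]:
    """Blend teacher distributions over the vocabulary into one soft target.

    Each teacher is weighted by its highest predicted probability
    (an assumed confidence heuristic), and weights are normalized
    so the result is again a probability distribution.
    """
    weights = [max(probs) for probs in teacher_probs]
    total = sum(weights)
    vocab_size = len(teacher_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, teacher_probs)) / total
        for i in range(vocab_size)
    ]


if __name__ == "__main__":
    # Two hypothetical teachers predicting over a three-token vocabulary.
    teachers = [[0.8, 0.1, 0.1], [0.4, 0.3, 0.3]]
    print(weighted_teacher_target(teachers))
```

A student model would then be trained against this blended target, for example with a cross-entropy or KL-divergence loss, in place of a single teacher's output.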