"An ontology defines the terms used to describe and represent an area of knowledge. … Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them."
– from OWL Web Ontology Language Use Cases and Requirements. W3C Recommendation (10 February 2004). Jeff Heflin, editor.
Recent approaches of computer vision utilize deep learning methods as they perform quite well if training and testing domains follow the same underlying data distribution. However, it has been shown that minor variations in the images that occur when using these methods in the real world can lead to unpredictable errors. Transfer learning is the area of machine learning that tries to prevent these errors. Especially, approaches that augment image data using auxiliary knowledge encoded in language embeddings or knowledge graphs (KGs) have achieved promising results in recent years. This survey focuses on visual transfer learning approaches using KGs. KGs can represent auxiliary knowledge either in an underlying graph-structured schema or in a vector-based knowledge graph embedding. Intending to enable the reader to solve visual transfer learning problems with the help of specific KG-DL configurations we start with a description of relevant modeling structures of a KG of various expressions, such as directed labeled graphs, hypergraphs, and hyper-relational graphs. We explain the notion of feature extractor, while specifically referring to visual and semantic features. We provide a broad overview of knowledge graph embedding methods and describe several joint training objectives suitable to combine them with high dimensional visual embeddings. The main section introduces four different categories on how a KG can be combined with a DL pipeline: 1) Knowledge Graph as a Reviewer; 2) Knowledge Graph as a Trainee; 3) Knowledge Graph as a Trainer; and 4) Knowledge Graph as a Peer. To help researchers find evaluation benchmarks, we provide an overview of generic KGs and a set of image processing datasets and benchmarks including various types of auxiliary knowledge. Last, we summarize related surveys and give an outlook about challenges and open issues for future research.
This paper introduces a formal definition of a Cyber-Physical System (CPS) in the spirit of the CPS Framework proposed by the National Institute of Standards and Technology (NIST). It shows that using this definition, various problems related to concerns in a CPS can be precisely formalized and implemented using Answer Set Programming (ASP). These include problems related to the dependency or conflicts between concerns, how to mitigate an issue, and what the most suitable mitigation strategy for a given issue would be. It then shows how ASP can be used to develop an implementation that addresses the aforementioned problems. The paper concludes with a discussion of the potentials of the proposed methodologies.
Besides entity-centric knowledge, usually organized as Knowledge Graph (KG), events are also an essential kind of knowledge in the world, which trigger the spring up of event-centric knowledge representation form like Event KG (EKG). It plays an increasingly important role in many machine learning and artificial intelligence applications, such as intelligent search, question-answering, recommendation, and text generation. This paper provides a comprehensive survey of EKG from history, ontology, instance, and application views. Specifically, to characterize EKG thoroughly, we focus on its history, definitions, schema induction, acquisition, related representative graphs/systems, and applications. The development processes and trends are studied therein. We further summarize perspective directions to facilitate future research on EKG.
Machine learning methods especially deep neural networks have achieved great success but many of them often rely on a number of labeled samples for training. In real-world applications, we often need to address sample shortage due to e.g., dynamic contexts with emerging prediction targets and costly sample annotation. Therefore, low-resource learning, which aims to learn robust prediction models with no enough resources (especially training samples), is now being widely investigated. Among all the low-resource learning studies, many prefer to utilize some auxiliary information in the form of Knowledge Graph (KG), which is becoming more and more popular for knowledge representation, to reduce the reliance on labeled samples. In this survey, we very comprehensively reviewed over $90$ papers about KG-aware research for two major low-resource learning settings -- zero-shot learning (ZSL) where new classes for prediction have never appeared in training, and few-shot learning (FSL) where new classes for prediction have only a small number of labeled samples that are available. We first introduced the KGs used in ZSL and FSL studies as well as the existing and potential KG construction solutions, and then systematically categorized and summarized KG-aware ZSL and FSL methods, dividing them into different paradigms such as the mapping-based, the data augmentation, the propagation-based and the optimization-based. We next presented different applications, including not only KG augmented tasks in Computer Vision and Natural Language Processing (e.g., image classification, text classification and knowledge extraction), but also tasks for KG curation (e.g., inductive KG completion), and some typical evaluation resources for each task. We eventually discussed some challenges and future directions on aspects such as new learning and reasoning paradigms, and the construction of high quality KGs.
We have created a knowledge graph based on major data sources used in ecotoxicological risk assessment. We have applied this knowledge graph to an important task in risk assessment, namely chemical effect prediction. We have evaluated nine knowledge graph embedding models from a selection of geometric, decomposition, and convolutional models on this prediction task. We show that using knowledge graph embeddings can increase the accuracy of effect prediction with neural networks. Furthermore, we have implemented a fine-tuning architecture which adapts the knowledge graph embeddings to the effect prediction task and leads to a better performance. Finally, we evaluate certain characteristics of the knowledge graph embedding models to shed light on the individual model performance.
Wikidata is a frequently updated, community-driven, and multilingual knowledge graph. Hence, Wikidata is an attractive basis for Entity Linking, which is evident by the recent increase in published papers. This survey focuses on four subjects: (1) Which Wikidata Entity Linking datasets exist, how widely used are they and how are they constructed? (2) Do the characteristics of Wikidata matter for the design of Entity Linking datasets and if so, how? (3) How do current Entity Linking approaches exploit the specific characteristics of Wikidata? (4) Which Wikidata characteristics are unexploited by existing Entity Linking approaches? This survey reveals that current Wikidata-specific Entity Linking datasets do not differ in their annotation scheme from schemes for other knowledge graphs like DBpedia. Thus, the potential for multilingual and time-dependent datasets, naturally suited for Wikidata, is not lifted. Furthermore, we show that most Entity Linking approaches use Wikidata in the same way as any other knowledge graph missing the chance to leverage Wikidata-specific characteristics to increase quality. Almost all approaches employ specific properties like labels and sometimes descriptions but ignore characteristics such as the hyper-relational structure. Hence, there is still room for improvement, for example, by including hyper-relational graph embeddings or type information. Many approaches also include information from Wikipedia, which is easily combinable with Wikidata and provides valuable textual information, which Wikidata lacks.
We present an introduction and a review of the Shapes Constraint Language (SHACL), the W3C recommendation language for validating RDF data. A SHACL document describes a set of constraints on RDF nodes, and a graph is valid with respect to the document if its nodes satisfy these constraints. We revisit the basic concepts of the language, its constructs and components and their interaction. We review the different formal frameworks used to study this language and the different semantics proposed. We examine a number of related problems, from containment and satisfiability to the interaction of SHACL with inference rules, and exhibit how different modellings of the language are useful for different problems. We also cover practical aspects of SHACL, discussing its implementations and state of adoption, to present a holistic review useful to practitioners and theoreticians alike.
Narrative cartography is a discipline which studies the interwoven nature of stories and maps. However, conventional geovisualization techniques of narratives often encounter several prominent challenges, including the data acquisition & integration challenge and the semantic challenge. To tackle these challenges, in this paper, we propose the idea of narrative cartography with knowledge graphs (KGs). Firstly, to tackle the data acquisition & integration challenge, we develop a set of KG-based GeoEnrichment toolboxes to allow users to search and retrieve relevant data from integrated cross-domain knowledge graphs for narrative mapping from within a GISystem. With the help of this tool, the retrieved data from KGs are directly materialized in a GIS format which is ready for spatial analysis and mapping. Two use cases - Magellan's expedition and World War II - are presented to show the effectiveness of this approach. In the meantime, several limitations are identified from this approach, such as data incompleteness, semantic incompatibility, and the semantic challenge in geovisualization. For the later two limitations, we propose a modular ontology for narrative cartography, which formalizes both the map content (Map Content Module) and the geovisualization process (Cartography Module). We demonstrate that, by representing both the map content and the geovisualization process in KGs (an ontology), we can realize both data reusability and map reproducibility for narrative cartography.
We review the scholarly contributions that utilise Natural Language Processing (NLP) methods to support the design process. Using a heuristic approach, we collected 223 articles published in 32 journals and within the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research.
The objective of automated Question Answering (QA) systems is to provide answers to user queries in a time efficient manner. The answers are usually found in either databases (or knowledge bases) or a collection of documents commonly referred to as the corpus. In the past few decades there has been a proliferation of acquisition of knowledge and consequently there has been an exponential growth in new scientific articles in the field of biomedicine. Therefore, it has become difficult to keep track of all the information in the domain, even for domain experts. With the improvements in commercial search engines, users can type in their queries and get a small set of documents most relevant for answering their query, as well as relevant snippets from the documents in some cases. However, it may be still tedious and time consuming to manually look for the required information or answers. This has necessitated the development of efficient QA systems which aim to find exact and precise answers to user provided natural language questions in the domain of biomedicine. In this paper, we introduce the basic methodologies used for developing general domain QA systems, followed by a thorough investigation of different aspects of biomedical QA systems, including benchmark datasets and several proposed approaches, both using structured databases and collection of texts. We also explore the limitations of current systems and explore potential avenues for further advancement.