The showy science projects get all the attention in the constant quest to automate everything. That includes gigantic natural language processing models such as OpenAI's GPT-3, which can complete sentences, answer questions, and even write poetry. For those making commercial software, there is a more mundane but perhaps equally valuable task, which is to figure out what facts a machine should have access to and make that actually have value for humans. "We don't apologize for the fact that some of this requires brute force," says Dan Turchin, chief executive and co-founder of PeopleReign, a San Jose, California software startup that is automating the handling of support calls for things such as IT and benefits. His software has compiled, over a period of five years, a kind of encyclopedia of more than five million "domain concepts," structured information relating to things such as employee benefits, requests for computer support, and all manner of other things customers or employees might request, culled from a billion examples such as IT tickets, wikis, chat transcripts, etc.
The use of Semantic Technologies - in particular the Semantic Web - has revealed to be a great tool for describing the cultural heritage domain and artistic practices. However, the panorama of ontologies for musicological applications seems to be limited and restricted to specific applications. In this research, we propose HaMSE, an ontology capable of describing musical features that can assist musicological research. More specifically, HaMSE proposes to address issues that have been affecting musicological research for decades: the representation of music and the relationship between quantitative and qualitative data. To do this, HaMSE allows the alignment between different music representation systems and describes a set of musicological features that can allow the music analysis at different granularity levels.
Recent approaches of computer vision utilize deep learning methods as they perform quite well if training and testing domains follow the same underlying data distribution. However, it has been shown that minor variations in the images that occur when using these methods in the real world can lead to unpredictable errors. Transfer learning is the area of machine learning that tries to prevent these errors. Especially, approaches that augment image data using auxiliary knowledge encoded in language embeddings or knowledge graphs (KGs) have achieved promising results in recent years. This survey focuses on visual transfer learning approaches using KGs. KGs can represent auxiliary knowledge either in an underlying graph-structured schema or in a vector-based knowledge graph embedding. Intending to enable the reader to solve visual transfer learning problems with the help of specific KG-DL configurations we start with a description of relevant modeling structures of a KG of various expressions, such as directed labeled graphs, hypergraphs, and hyper-relational graphs. We explain the notion of feature extractor, while specifically referring to visual and semantic features. We provide a broad overview of knowledge graph embedding methods and describe several joint training objectives suitable to combine them with high dimensional visual embeddings. The main section introduces four different categories on how a KG can be combined with a DL pipeline: 1) Knowledge Graph as a Reviewer; 2) Knowledge Graph as a Trainee; 3) Knowledge Graph as a Trainer; and 4) Knowledge Graph as a Peer. To help researchers find evaluation benchmarks, we provide an overview of generic KGs and a set of image processing datasets and benchmarks including various types of auxiliary knowledge. Last, we summarize related surveys and give an outlook about challenges and open issues for future research.
Business process (BP) analysis represents a first key phase of information system development. It consists in the gathering of domain knowledge and its organization to be later used in the software development, and beyond (e.g., for Business Process Reengineering). The quality of the developed information system largely depends on how the BP analysis has been carried out and the quality of the produced requirement specification documents. Despite the fact that the issue is on the table for decades, business process analysis is still a critical phase of information systems development. One promising strategy is an early and more important involvement of business experts in the BP analysis. This paper presents a methodology that aims at an early involvement of business experts while providing a formal grounding that guarantees the quality of the produced specifications. To this end, we propose the Business Process Analysis Canvas, a knowledge framework organized in eight knowledge sections aimed at supporting the business expert in carrying out the analysis, eventually yielding a BP analysis Ontology.
Ontologies have been known for their semantic representation of knowledge. ontologies cannot automatically evolve to reflect updates that occur in respective domains. To address this limitation, researchers have called for automatic ontology generation from unstructured text corpus. Unfortunately, systems that aim to generate ontologies from unstructured text corpus are domain-specific and require manual intervention. In addition, they suffer from uncertainty in creating concept linkages and difficulty in finding axioms for the same concept. Knowledge Graphs (KGs) has emerged as a powerful model for the dynamic representation of knowledge. However, KGs have many quality limitations and need extensive refinement. This research aims to develop a novel domain-independent automatic ontology generation framework that converts unstructured text corpus into domain consistent ontological form. The framework generates KGs from unstructured text corpus as well as refine and correct them to be consistent with domain ontologies. The power of the proposed automatically generated ontology is that it integrates the dynamic features of KGs and the quality features of ontologies.
This paper introduces a formal definition of a Cyber-Physical System (CPS) in the spirit of the CPS Framework proposed by the National Institute of Standards and Technology (NIST). It shows that using this definition, various problems related to concerns in a CPS can be precisely formalized and implemented using Answer Set Programming (ASP). These include problems related to the dependency or conflicts between concerns, how to mitigate an issue, and what the most suitable mitigation strategy for a given issue would be. It then shows how ASP can be used to develop an implementation that addresses the aforementioned problems. The paper concludes with a discussion of the potentials of the proposed methodologies.
It is commonly acknowledged that the availability of the huge amount of (training) data is one of the most important factors for many recent advances in Artificial Intelligence (AI). However, datasets are often designed for specific tasks in narrow AI sub areas and there is no unified way to manage and access them. This not only creates unnecessary overheads when training or deploying Machine Learning models but also limits the understanding of the data, which is very important for data-centric AI. In this paper, we present our vision about a unified framework for different datasets so that they can be integrated and queried easily, e.g., using standard query languages. We demonstrate this in our ongoing work to create a framework for datasets in Computer Vision and show its advantages in different scenarios.
OWLOOP is an Application Programming Interface (API) for using the Ontology Web Language (OWL) by the means of Object-Oriented Programming (OOP). It is common to design software architectures using the OOP paradigm for increasing their modularity. If the components of an architecture also exploit OWL ontologies for knowledge representation and reasoning, they would require to be interfaced with OWL axioms. Since OWL does not adhere to the OOP paradigm, such an interface often leads to boilerplate code affecting modularity, and OWLOOP is designed to address this issue as well as the associated computational aspects. We present an extension of the OWL-API to provide a general-purpose interface between OWL axioms subject to reasoning and modular OOP objects hierarchies. This manuscript has been submitted to the SoftwareX Elsevier journal on the 12th of January 2021, revised on the 18th of November 2021, accepted on the 14th of December 2021, and published on the 30th of December 2021.
Besides entity-centric knowledge, usually organized as Knowledge Graph (KG), events are also an essential kind of knowledge in the world, which trigger the spring up of event-centric knowledge representation form like Event KG (EKG). It plays an increasingly important role in many machine learning and artificial intelligence applications, such as intelligent search, question-answering, recommendation, and text generation. This paper provides a comprehensive survey of EKG from history, ontology, instance, and application views. Specifically, to characterize EKG thoroughly, we focus on its history, definitions, schema induction, acquisition, related representative graphs/systems, and applications. The development processes and trends are studied therein. We further summarize perspective directions to facilitate future research on EKG.
Machine learning methods especially deep neural networks have achieved great success but many of them often rely on a number of labeled samples for training. In real-world applications, we often need to address sample shortage due to e.g., dynamic contexts with emerging prediction targets and costly sample annotation. Therefore, low-resource learning, which aims to learn robust prediction models with no enough resources (especially training samples), is now being widely investigated. Among all the low-resource learning studies, many prefer to utilize some auxiliary information in the form of Knowledge Graph (KG), which is becoming more and more popular for knowledge representation, to reduce the reliance on labeled samples. In this survey, we very comprehensively reviewed over $90$ papers about KG-aware research for two major low-resource learning settings -- zero-shot learning (ZSL) where new classes for prediction have never appeared in training, and few-shot learning (FSL) where new classes for prediction have only a small number of labeled samples that are available. We first introduced the KGs used in ZSL and FSL studies as well as the existing and potential KG construction solutions, and then systematically categorized and summarized KG-aware ZSL and FSL methods, dividing them into different paradigms such as the mapping-based, the data augmentation, the propagation-based and the optimization-based. We next presented different applications, including not only KG augmented tasks in Computer Vision and Natural Language Processing (e.g., image classification, text classification and knowledge extraction), but also tasks for KG curation (e.g., inductive KG completion), and some typical evaluation resources for each task. We eventually discussed some challenges and future directions on aspects such as new learning and reasoning paradigms, and the construction of high quality KGs.