Ontologies
The SP theory of intelligence: benefits and applications
Information 2013, xx. Received: 26 May 2013; in revised form: 13 December 2013 / Accepted: 13 December 2013 / Published: xx
Abstract: This article describes existing and expected benefits of the SP theory of intelligence, and some potential applications. The theory aims to simplify and integrate ideas across artificial intelligence, mainstream computing, and human perception and cognition, with information compression as a unifying theme. It combines conceptual simplicity with descriptive and explanatory power across several areas of computing and cognition. In the SP machine, an expression of the SP theory which is currently realized in the form of a computer model, there is potential for an overall simplification of computing systems, including software. The SP theory promises deeper insights and better solutions in several areas of application including, most notably, unsupervised learning, natural language processing, autonomous robots, computer vision, intelligent databases, software engineering, information compression, medical diagnosis and big data. There is also potential in areas such as the semantic web, bioinformatics, structuring of documents, the detection of computer viruses, data fusion, new kinds of computer, and the development of scientific theories. The theory promises seamless integration of structures and functions within and between different areas of application. The potential value, worldwide, of these benefits and applications is at least $190 billion each year. Further development would be facilitated by the creation of a high-parallel, open-source version of the SP machine, available to researchers everywhere.
Keywords: artificial intelligence; information compression; unsupervised learning; natural language processing; pattern recognition
1. Introduction
The SP theory of intelligence aims to simplify and integrate concepts across artificial intelligence, mainstream computing and human perception and cognition, with information compression as a unifying theme. This article describes existing and expected benefits of the SP theory and some of its potential applications. The theory is described most fully in [1] and more briefly in an extended overview [2]. This article should be read in conjunction with either or both of those accounts. In brief, the existing and expected benefits of the theory are:
- Conceptual simplicity combined with descriptive and explanatory power.
Exact Query Reformulation over Databases with First-order and Description Logics Ontologies
Franconi, E., Kerhet, V., Ngo, N.
We study a general framework for query rewriting in the presence of an arbitrary first-order logic ontology over a database signature. The framework supports deciding the existence of a safe-range first-order equivalent reformulation of a query in terms of the database signature, and if one exists, it provides an effective approach to construct the reformulation based on interpolation using standard theorem proving techniques (e.g., tableau). Since the reformulation is a safe-range formula, it is effectively executable as an SQL query. Finally, we present a non-trivial application of the framework with ontologies in the very expressive ALCHOIQ description logic, by providing effective means to compute safe-range first-order exact reformulations of queries.
Persistence, Change, and the Integration of Objects and Processes in the Framework of the General Formal Ontology
In this paper we discuss various problems associated with temporal phenomena. These problems include persistence and change, the integration of objects and processes, and truth-makers for temporal propositions. We propose an approach which interprets persistence as a phenomenon emanating from the activity of the mind, and which additionally postulates that persistence ultimately rests on personal identity. The General Formal Ontology (GFO) is a top-level ontology being developed at the University of Leipzig. Top-level ontologies can be roughly divided into 3D-ontologies and 4D-ontologies. GFO is the only top-level ontology used in applications that is a 4D-ontology additionally admitting 3D objects. Objects and processes are integrated in a natural way.
Generating Natural Language Descriptions from OWL Ontologies: the NaturalOWL System
Androutsopoulos, I., Lampouras, G., Galanis, D.
We present NaturalOWL, a natural language generation system that produces texts describing individuals or classes of OWL ontologies. Unlike simpler OWL verbalizers, which typically express a single axiom at a time in controlled, often not entirely fluent natural language primarily for the benefit of domain experts, we aim to generate fluent and coherent multi-sentence texts for end-users. With a system like NaturalOWL, one can publish information in OWL on the Web, along with automatically produced corresponding texts in multiple languages, making the information accessible not only to computer programs and domain experts, but also to end-users. We discuss the processing stages of NaturalOWL, the optional domain-dependent linguistic resources that the system can use at each stage, and why they are useful. We also present trials showing that when the domain-dependent linguistic resources are available, NaturalOWL produces significantly better texts compared to a simpler verbalizer, and that the resources can be created with relatively light effort.
Mapping cognitive ontologies to and from the brain
Schwartz, Yannick, Thirion, Bertrand, Varoquaux, Gaël
Due to the nature of the individual experiments, based on eliciting neural response from a small number of stimuli, this link is incomplete, and unidirectional from the causal point of view. To come to conclusions on the function implied by the activation of brain regions, it is necessary to combine a wide exploration of the various brain functions and some inversion of the statistical inference. Here we introduce a methodology for accumulating knowledge towards a bidirectional link between observed brain activity and the corresponding function. We rely on a large corpus of imaging studies and a predictive engine. Technically, the challenges are to find commonality between the studies without denaturing the richness of the corpus. The key elements that we contribute are labeling the tasks performed with a cognitive ontology, and modeling the long tail of rare paradigms in the corpus. To our knowledge, our approach is the first demonstration of predicting the cognitive content of completely new brain images. To that end, we propose a method that predicts the experimental paradigms across different studies.
Reasoning about Explanations for Negative Query Answers in DL-Lite
Calvanese, D., Ortiz, M., Simkus, M., Stefanoni, G.
In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This also holds for Description Logics, where research has focused on the explanation of both TBox reasoning and, more recently, query answering. Besides explaining the presence of a tuple in a query answer, it is also important to explain why a given tuple is missing. We address the latter problem for instance and conjunctive query answering over DL-Lite ontologies by adopting abductive reasoning; that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks we consider existence and recognition of an explanation, and relevance and necessity of a given assertion for an explanation. We characterize the computational complexity of these problems for arbitrary, subset minimal, and cardinality minimal explanations.
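The abductive idea in the abstract above, looking for ABox additions that force a missing tuple into the answer, can be illustrated with a small brute-force sketch. It is not the authors' algorithm and covers only a heavily simplified fragment, chosen here for illustration: the TBox contains only atomic concept inclusions (no roles, no disjointness), and the query is a conjunction of concept atoms over one individual. All names (`entailed_concepts`, `minimal_explanations`, the toy concepts) are hypothetical.

```python
from itertools import combinations

def entailed_concepts(tbox, abox, individual):
    """Concepts of `individual` obtained by closing the ABox under the inclusions."""
    known = {concept for (concept, ind) in abox if ind == individual}
    changed = True
    while changed:
        changed = False
        for sub, sup in tbox:
            if sub in known and sup not in known:
                known.add(sup)
                changed = True
    return known

def minimal_explanations(tbox, abox, query_concepts, individual, abducibles):
    """Enumerate subset-minimal sets of assertions that make the query hold.

    Candidates are tried in order of increasing size, and any superset of an
    explanation already found is skipped, so every returned set is subset minimal.
    """
    candidates = sorted(
        (c, individual) for c in abducibles if (c, individual) not in abox
    )
    found = []
    for size in range(len(candidates) + 1):
        for extra in combinations(candidates, size):
            extra_set = set(extra)
            if any(e <= extra_set for e in found):
                continue  # not minimal: contains a smaller explanation
            closure = entailed_concepts(tbox, abox | extra_set, individual)
            if query_concepts <= closure:
                found.append(extra_set)
    return found

# Toy ontology (assumed for illustration): Professor is-a Teacher is-a Employee.
tbox = [("Professor", "Teacher"), ("Teacher", "Employee")]
abox = {("Person", "ann")}
# Why is ann not an answer to q(x) = Person(x) AND Employee(x)?
explanations = minimal_explanations(
    tbox, abox, {"Person", "Employee"}, "ann",
    abducibles={"Professor", "Teacher", "Employee"},
)
```

In this toy case each of the three single assertions about `ann` is a subset-minimal (and cardinality-minimal) explanation. The exhaustive enumeration is exponential in the number of abducibles and is meant only to make the definitions concrete.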
Unsupervised Rating Prediction based on Local and Global Semantic Models
Boteanu, Adrian (Worcester Polytechnic Institute) | Chernova, Sonia (Worcester Polytechnic Institute)
Current recommendation engines attempt to answer the same question: given a user with some activity in the system, which entity, be it a restaurant, a book or a movie, should the user visit or buy next? The presumption is that the user would favorably review the item being recommended. The goal of our project is to predict how a user would rate an item they have never rated, which is a generalization of the task recommendation engines perform. Previous work successfully employs machine learning techniques, particularly statistical methods. However, there are some outlier situations which are more difficult to predict, such as new users. In this paper we present a rating prediction approach targeted at entities for which little prior information exists in the database. We put forward and test a number of hypotheses, exploring recommendations based on nearest neighbor-like methods. We adapt existing common sense topic modeling methods to compute similarity measures between users and then use a relatively small set of key users to predict how the target user will rate a given business. We implemented and tested our system for recommending businesses using the Yelp Academic Dataset. We report initial results for topic-based rating predictions, which perform consistently across a broad range of parameters.
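A minimal sketch of the general idea of predicting a rating from a small set of key users weighted by topic similarity is given below. The abstract does not specify the similarity measure or the aggregation rule, so cosine similarity over sparse topic vectors and a similarity-weighted mean are assumptions made here, as are all names and the toy data.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse topic vectors (dict: topic -> weight)."""
    dot = sum(w * v.get(topic, 0.0) for topic, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict_rating(target_topics, key_users, item, default=3.0):
    """Similarity-weighted mean of the key users' ratings for `item`.

    key_users is a list of (topic_vector, ratings) pairs, where ratings maps
    item ids to scores. Falls back to `default` (an assumed neutral rating)
    when no similar key user has rated the item.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for topics, ratings in key_users:
        if item not in ratings:
            continue
        similarity = cosine(target_topics, topics)
        if similarity <= 0.0:
            continue  # ignore key users with no topical overlap
        weighted_sum += similarity * ratings[item]
        total_weight += similarity
    if total_weight == 0.0:
        return default
    return weighted_sum / total_weight

# Toy data (hypothetical): two key users who both rated the same business.
key_users = [
    ({"coffee": 1.0, "brunch": 1.0}, {"cafe_a": 5.0}),
    ({"sushi": 1.0}, {"cafe_a": 1.0}),
]
target = {"coffee": 1.0}
prediction = predict_rating(target, key_users, "cafe_a")
```

Here only the first key user shares a topic with the target, so the prediction follows that user's rating; the second user's rating is ignored because the similarity is zero.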
Entity Type Recognition for Heterogeneous Semantic Graphs
Sleeman, Jennifer (University of Maryland, Baltimore County) | Finin, Tim (University of Maryland, Baltimore County)
We describe an approach to reducing the computational cost of identifying coreferent instances in heterogeneous semantic graphs where the underlying ontologies may not be informative or even known. The problem is similar to coreference resolution in unstructured text, where a variety of linguistic clues and contextual information is used to infer entity types and predict coreference. Semantic graphs, whether in RDF or another formalism, are semi-structured data with very different contextual clues and need different approaches to identify potentially coreferent entities. When their ontologies are unknown, inaccessible or semantically trivial, coreference resolution is difficult. For such cases, we can use supervised machine learning to map entity attributes via dictionaries based on properties from an appropriate background knowledge base to predict instance entity types, aiding coreference resolution. We evaluated the approach in experiments on data from Wikipedia, Freebase and Arnetminer, using DBpedia as the background knowledge base.
Developing Semantic Classifiers for Big Data
Scherl, Richard (Monmouth University)
When the amount of RDF data is very large, it becomes more likely that the triples describing entities will contain errors and may not include the specification of a class from a known ontology. The work presented here explores the utilization of methods from machine learning to develop classifiers for identifying the semantic categorization of entities based upon the property names used to describe the entity. The goal is to develop classifiers that are accurate, but robust to errors and noise. The training data comes from DBpedia, where entities are categorized by type and densely described with RDF properties. The initial experimentation reported here indicates that the approach is promising.
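To make the idea of classifying entities by the property names used to describe them concrete, here is a small sketch. The abstract does not say which learning methods were used, so the multinomial naive Bayes model with add-one smoothing below is an assumption, and the property names and type labels are made-up toy data in the style of DBpedia rather than actual training data.

```python
from collections import Counter, defaultdict
from math import log

def train(examples):
    """Count property names per type. examples: list of (property_names, type_label)."""
    type_counts = Counter()
    prop_counts = defaultdict(Counter)
    vocab = set()
    for props, label in examples:
        type_counts[label] += 1
        for prop in props:
            prop_counts[label][prop] += 1
            vocab.add(prop)
    return type_counts, prop_counts, vocab

def classify(model, props):
    """Return the most probable type for an entity described by `props`."""
    type_counts, prop_counts, vocab = model
    total = sum(type_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in type_counts.items():
        score = log(count / total)  # log prior
        denom = sum(prop_counts[label].values()) + len(vocab)
        for prop in props:
            if prop not in vocab:
                continue  # skip unseen names, e.g. misspelled or noisy properties
            score += log((prop_counts[label][prop] + 1) / denom)  # add-one smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training set (hypothetical property names and types).
examples = [
    (["birthDate", "birthPlace", "occupation"], "Person"),
    (["birthDate", "spouse"], "Person"),
    (["populationTotal", "country", "areaTotal"], "Place"),
    (["country", "elevation"], "Place"),
]
model = train(examples)
person_guess = classify(model, ["birthPlace", "brithDate"])  # second name is a typo
place_guess = classify(model, ["country", "elevation"])
```

Skipping unseen property names is one simple way to tolerate noisy triples; a real system would need to be evaluated on held-out data before drawing any conclusion about accuracy or robustness.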
Semantics for Big Data Integration and Analysis
Knoblock, Craig A. (University of Southern California) | Szekely, Pedro (University of Southern California)
Much of the focus on big data has been on the problem of processing very large sources. There is an equally hard problem of how to normalize, integrate, and transform the data from many sources into the format required to run large-scale analysis and visualization tools. We have previously developed an approach to semi-automatically mapping diverse sources into a shared domain ontology so that they can be quickly combined. In this paper we describe our approach to building and executing integration and restructuring plans to support analysis and visualization tools on very large and diverse datasets.