Ontologies
Mapping paradigm ontologies to and from the brain
Schwartz, Yannick | Thirion, Bertrand | Varoquaux, Gaël
Imaging neuroscience links brain activation maps to behavior and cognition via correlational studies. Because individual experiments elicit neural responses from only a small number of stimuli, this link is incomplete, and unidirectional from the causal point of view. To draw conclusions about the function implied by the activation of brain regions, it is necessary to combine a wide exploration of the various brain functions with some inversion of the statistical inference. Here we introduce a methodology for accumulating knowledge towards a bidirectional link between observed brain activity and the corresponding function. We rely on a large corpus of imaging studies and a predictive engine. Technically, the challenge is to find commonality between the studies without denaturing the richness of the corpus. The key elements that we contribute are labeling the tasks performed with a cognitive ontology, and modeling the long tail of rare paradigms in the corpus. To our knowledge, our approach is the first demonstration of predicting the cognitive content of completely new brain images. To that end, we propose a method that predicts the experimental paradigms across different studies.
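As a rough illustration of the prediction setting described above, the sketch below (ours, not the authors' code; all data, sizes, and labels are synthetic placeholders) trains one linear model per ontology term to predict multi-label cognitive content from activation maps, with held-out maps standing in for images from unseen studies.

```python
# Minimal sketch (not the authors' method): multi-label prediction of
# cognitive-ontology terms from brain activation maps. Data are synthetic;
# in the paper, rows would be statistical maps pooled across many studies
# and labels would be ontology terms describing each experimental paradigm.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_maps, n_voxels, n_terms = 200, 500, 6           # hypothetical sizes
X = rng.normal(size=(n_maps, n_voxels))           # activation maps (flattened)
Y = rng.random(size=(n_maps, n_terms)) < 0.3      # ontology-term labels

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# One linear model per ontology term; held-out maps stand in for
# images from entirely unseen studies.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_tr, Y_tr)
print(clf.predict(X_te[:3]))   # predicted cognitive terms for new maps
```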
Description Logics based Formalization of Wh-Queries
Dasgupta, Sourish | KaPatel, Rupali | Padia, Ankur | Shah, Kushal
The problem of Natural Language Query Formalization (NLQF) is to translate a given user query in natural language (NL) into a formal language such that the formal semantic interpretation is equivalent to the NL interpretation. Formalization of NL queries enables logic-based reasoning during information retrieval, database querying, question answering, etc. Formalization also helps in Web query normalization and indexing, query intent analysis, etc. In this paper we propose a Description Logics based formal methodology for wh-query intent (also called desire) identification and the corresponding formal translation. We evaluated the scalability of our proposed formalism using the Microsoft Encarta 98 query dataset and the OWL-S TC v.4.0 dataset.
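To make the notion of desire identification concrete, here is a hypothetical example in our own notation (the paper defines its own formal apparatus, which may differ):

```latex
% Hypothetical illustration, not taken from the paper. For the wh-query
% "Who directed Titanic?", the wh-word signals that the desire (query
% intent) is a Person; the query body adds a role constraint, giving
% the DL concept:
\[
  \mathsf{Desire} \;\equiv\; \mathsf{Person} \sqcap
  \exists\,\mathsf{directed}.\{\mathsf{Titanic}\}
\]
% Answering then reduces to DL instance retrieval: find all individuals a
% such that the knowledge base entails Desire(a).
```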
The SP theory of intelligence: benefits and applications
This article describes existing and expected benefits of the "SP theory of intelligence", and some potential applications. The theory aims to simplify and integrate ideas across artificial intelligence, mainstream computing, and human perception and cognition, with information compression as a unifying theme. It combines conceptual simplicity with descriptive and explanatory power across several areas of computing and cognition. In the "SP machine" -- an expression of the SP theory which is currently realized in the form of a computer model -- there is potential for an overall simplification of computing systems, including software. The SP theory promises deeper insights and better solutions in several areas of application including, most notably, unsupervised learning, natural language processing, autonomous robots, computer vision, intelligent databases, software engineering, information compression, medical diagnosis and big data. There is also potential in areas such as the semantic web, bioinformatics, structuring of documents, the detection of computer viruses, data fusion, new kinds of computer, and the development of scientific theories. The theory promises seamless integration of structures and functions within and between different areas of application. The potential value, worldwide, of these benefits and applications is at least $190 billion each year. Further development would be facilitated by the creation of a high-parallel, open-source version of the SP machine, available to researchers everywhere.
Persistence, Change, and the Integration of Objects and Processes in the Framework of the General Formal Ontology
In this paper we discuss various problems associated with temporal phenomena. These problems include persistence and change, the integration of objects and processes, and truth-makers for temporal propositions. We propose an approach that interprets persistence as a phenomenon emanating from the activity of the mind, and that additionally postulates that persistence ultimately rests on personal identity. The General Formal Ontology (GFO) is a top-level ontology being developed at the University of Leipzig. Top-level ontologies can be roughly divided into 3D-ontologies and 4D-ontologies. GFO is the only top-level ontology used in applications that is a 4D-ontology while additionally admitting 3D objects. Objects and processes are integrated in a natural way.
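As a toy illustration of the 3D/4D distinction the abstract invokes, the sketch below (our own drastic simplification, not GFO's formal machinery; the names Snapshot and Process are ours) models a 4D process as a temporally extended whole whose parts are instantaneous 3D presentations of an object.

```python
# Toy illustration (our simplification, not GFO's formal apparatus) of the
# 3D/4D distinction: a 4D entity is extended in time and has temporal parts;
# a 3D object is wholly present at each instant, here as a snapshot.
from dataclasses import dataclass

@dataclass
class Snapshot:              # a 3D object as it exists at one instant
    obj_id: str
    time: float
    properties: dict

@dataclass
class Process:               # a 4D whole: the object's temporal parts
    snapshots: list

    def object_at(self, t):
        """The snapshot nearest to time t."""
        return min(self.snapshots, key=lambda s: abs(s.time - t))

# Persistence through change: distinct snapshots, one obj_id, one process.
apple = Process([
    Snapshot("apple#1", 0.0, {"color": "green"}),
    Snapshot("apple#1", 5.0, {"color": "red"}),
])
print(apple.object_at(4.0).properties)   # {'color': 'red'}
```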
Semantics for Big Data Integration and Analysis
Knoblock, Craig A. (University of Southern California) | Szekely, Pedro (University of Southern California)
Much of the focus on big data has been on the problem of processing very large sources. There is an equally hard problem of how to normalize, integrate, and transform the data from many sources into the format required to run large-scale analysis and visualization tools. We have previously developed an approach to semi-automatically mapping diverse sources into a shared domain ontology so that they can be quickly combined. In this paper we describe our approach to building and executing integration and restructuring plans to support analysis and visualization tools on very large and diverse datasets.
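A minimal sketch of the source-to-ontology mapping step (our illustration, with hypothetical schema.org-style terms; the paper's system constructs such mappings semi-automatically and at much larger scale): each source column is assigned an ontology property, and each row is restructured into triples about an instance of an ontology class.

```python
# Minimal sketch (our illustration) of mapping a tabular source into a
# shared domain ontology. The mapping below is hypothetical and hand-
# written; the described system learns such mappings semi-automatically.
csv_rows = [{"nm": "Jane Doe", "org": "USC", "yr": "2013"}]

mapping = {
    "class": "schema:Person",
    "columns": {"nm": "schema:name",
                "org": "schema:affiliation",
                "yr": "schema:startDate"},
}

def to_triples(rows, mapping):
    """Restructure source rows into (subject, predicate, object) triples."""
    for i, row in enumerate(rows):
        subject = f"_:row{i}"
        yield (subject, "rdf:type", mapping["class"])
        for col, prop in mapping["columns"].items():
            yield (subject, prop, row[col])

for triple in to_triples(csv_rows, mapping):
    print(triple)
```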
Unsupervised Rating Prediction based on Local and Global Semantic Models
Boteanu, Adrian (Worcester Polytechnic Institute) | Chernova, Sonia (Worcester Polytechnic Institute)
Current recommendation engines attempt to answer the same question: given a user with some activity in the system, which entity, be it a restaurant, a book, or a movie, should the user visit or buy next? The presumption is that the user would favorably review the item being recommended. The goal of our project is to predict how a user would rate an item he or she has never rated, which is a generalization of the task recommendation engines perform. Previous work successfully employs machine learning techniques, particularly statistical methods. However, some outlier situations are more difficult to predict, such as new users. In this paper we present a rating prediction approach targeted at entities for which little prior information exists in the database. We put forward and test a number of hypotheses, exploring recommendations based on nearest-neighbor-like methods. We adapt existing common sense topic modeling methods to compute similarity measures between users, and then use a relatively small set of key users to predict how the target user will rate a given business. We implemented and tested our system for recommending businesses using the Yelp Academic Dataset. We report initial results for topic-based rating predictions, which perform consistently across a broad range of parameters.
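The following sketch (ours, with made-up numbers; the paper derives user similarity from common sense topic models rather than the toy vectors shown here) shows the nearest-neighbor shape of such a prediction: a similarity-weighted average of ratings from a small set of key users.

```python
# Minimal sketch (not the authors' system): predict a target user's rating
# of a business as a similarity-weighted average over the k most similar
# users, with users represented by hypothetical topic vectors.
import numpy as np

user_topics = {            # user -> topic-distribution vector (made up)
    "alice": np.array([0.7, 0.2, 0.1]),
    "bob":   np.array([0.6, 0.3, 0.1]),
    "carol": np.array([0.1, 0.1, 0.8]),
}
ratings = {"alice": 4.0, "bob": 5.0, "carol": 2.0}  # ratings of one business

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def predict(target_vec, k=2):
    """Similarity-weighted rating from the k nearest 'key' users."""
    sims = sorted(((cosine(target_vec, v), u)
                   for u, v in user_topics.items()), reverse=True)[:k]
    return sum(s * ratings[u] for s, u in sims) / sum(s for s, _ in sims)

new_user = np.array([0.65, 0.25, 0.10])   # a user with little history
print(round(predict(new_user), 2))        # leans toward alice and bob
```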
Developing Semantic Classifiers for Big Data
Scherl, Richard (Monmouth University)
When the amount of RDF data is very large, it becomes more likely that the triples describing entities will contain errors and may not include the specification of a class from a known ontology. The work presented here explores the utilization of methods from machine learning to develop classifiers for identifying the semantic categorization of entities based upon the property names used to describe the entity. The goal is to develop classifiers that are accurate, but robust to errors and noise. The training data comes from DBpedia, where entities are categorized by type and densely described with RDF properties. The initial experimentation reported here indicates that the approach is promising.
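A minimal sketch of the idea (our illustration, with toy stand-ins for DBpedia entities and types): treat the property names describing an entity as a bag of words and train a standard text classifier over them, so that a noisy entity with a missing rdf:type can still be categorized.

```python
# Minimal sketch (our illustration, not the paper's classifiers): predict
# an entity's semantic class from the RDF property names describing it.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Each "document" is the space-joined property names of one entity.
X = ["birthPlace birthDate team position",
     "birthPlace birthDate party office",
     "foundingDate headquarters industry",
     "foundingDate headquarters numberOfEmployees"]
y = ["Athlete", "Politician", "Company", "Company"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(X, y)

# A noisy entity: no rdf:type, and one stray Company-like property.
print(clf.predict(["birthDate team industry"]))   # -> ['Athlete']
```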
Entity Type Recognition for Heterogeneous Semantic Graphs
Sleeman, Jennifer (University of Maryland, Baltimore County) | Finin, Tim (University of Maryland, Baltimore County)
We describe an approach to reducing the computational cost of identifying coreferent instances in heterogeneous semantic graphs where the underlying ontologies may not be informative or even known. The problem is similar to coreference resolution in unstructured text, where a variety of linguistic clues and contextual information are used to infer entity types and predict coreference. Semantic graphs, whether in RDF or another formalism, are semi-structured data with very different contextual clues and need different approaches to identify potentially coreferent entities. When their ontologies are unknown, inaccessible, or semantically trivial, coreference resolution is difficult. For such cases, we can use supervised machine learning to map entity attributes via dictionaries based on properties from an appropriate background knowledge base to predict instance entity types, aiding coreference resolution. We evaluated the approach in experiments on data from Wikipedia, Freebase, and Arnetminer, with DBpedia as the background knowledge base.
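A minimal sketch (ours; in the paper the attribute-to-type dictionaries come from a background knowledge base and the mapping is learned, not hand-written as below) of how predicted entity types can gate coreference comparisons:

```python
# Minimal sketch (our illustration): map an instance's attribute names to
# candidate entity types via a dictionary, then compare only same-type
# pairs as coreference candidates.
from collections import Counter

# Hypothetical attribute->type dictionary (learned from a background KB
# such as DBpedia in the paper; hand-written here for illustration).
attr_types = {
    "birthDate": ["Person"], "almaMater": ["Person"],
    "headquarters": ["Organization"], "foundingDate": ["Organization"],
    "population": ["Place"],
}

def predict_type(attributes):
    """Vote for the entity type best supported by the attribute names."""
    votes = Counter(t for a in attributes for t in attr_types.get(a, []))
    return votes.most_common(1)[0][0] if votes else "Unknown"

e1 = {"birthDate", "almaMater", "name"}
e2 = {"headquarters", "foundingDate", "name"}
t1, t2 = predict_type(e1), predict_type(e2)
print(t1, t2)                              # Person Organization
# Entities of different predicted types need not be compared, which is
# where the computational savings come from.
print("compare for coreference:", t1 == t2)
```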
Unsupervised learning of human activities from overexpressed recognized non-speech sounds
Smidtas, Serge | Peyrot, Magalie
Human activity and the surrounding environment produce sounds, such as, at home, the noise of running water, coughing, or a television. These sounds can be used to determine the activity taking place in the environment. Our objective is to monitor a person's activity, or to characterize his or her environment, through sound analysis with a single low-cost microphone. The purpose is to adapt programs to the activity or environment, or to detect abnormal situations. Patterns that are repeatedly overexpressed in the sequences of recognized sounds, both within and across environments, make it possible to characterize activities such as a person entering the house or a TV program being watched. We first manually annotated 1,500 recognized sounds from the daily-life activity of elderly persons living at home. We then inferred an ontology and enriched the annotation database with a crowdsourced manual annotation of 7,500 sounds to help with the annotation of the most frequent sounds. Using sound-learning algorithms, we defined 50 types covering the most frequent sounds, and used this set of recognizable sounds as a basis for tagging incoming audio. By detecting overexpressed motifs in the sequences of tags, we were able to categorize, using only a single low-cost microphone, complex activities of the daily life of a person at home, such as watching TV, entering the apartment, or holding a phone conversation, as well as detecting unknown activities in the form of repeated tasks performed by users.
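A minimal sketch (ours, with invented tags and thresholds) of the motif-overexpression idea: count n-grams of consecutive sound tags and flag those that occur far more often than a simple independence baseline would predict.

```python
# Minimal sketch (our illustration, not the authors' algorithm): find
# overexpressed motifs in a sequence of recognized sound tags.
from collections import Counter

tags = ["door", "steps", "keys", "door", "steps", "keys",
        "water", "dishes", "door", "steps", "keys", "tv", "tv"]

def ngrams(seq, n):
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def overexpressed(seq, n=3, min_count=2, ratio=3.0):
    """Motifs whose observed count exceeds `ratio` times the count expected
    if tags were independent draws from their unigram frequencies."""
    uni, total = Counter(seq), len(seq)
    out = {}
    for motif, c in Counter(ngrams(seq, n)).items():
        p = 1.0
        for t in motif:
            p *= uni[t] / total
        expected = p * (total - n + 1)
        if c >= min_count and c > ratio * expected:
            out[motif] = c
    return out

# ('door', 'steps', 'keys') recurs far more often than chance: a candidate
# "entering the apartment" activity.
print(overexpressed(tags))
```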