Ontologies
Tasks that Require, or can Benefit from, Matching Blank Nodes
Lantzaki, Christina, Tzitzikas, Yannis
In various domains and cases, we observe the creation and usage of information elements which are unnamed. Such elements do not have a name, or may have a name that is not externally referable (usually meaningless and not persistent over time). This paper discusses why we will never 'escape' from the problem of having to construct mappings between such unnamed elements in information systems. Since unnamed elements nowadays occur very often in the framework of the Semantic Web and Linked Data as blank nodes, the paper describes scenarios that can benefit from methods that compute mappings between the unnamed elements. For each scenario, the corresponding bnode matching problem is formally defined. Based on this analysis, we try to reach a more general formulation of the problem, which can be useful for guiding the required technological advances. To this end, the paper finally discusses methods to realize blank node matching and the implementations that exist, and identifies open issues and challenges.
Realizing RCC8 networks using convex regions
Schockaert, Steven, Li, Sanjiang
RCC8 is a popular fragment of the region connection calculus, in which qualitative spatial relations between regions, such as adjacency, overlap and parthood, can be expressed. While RCC8 is essentially dimensionless, most current applications are confined to reasoning about two-dimensional or three-dimensional physical space. In this paper, however, we are mainly interested in conceptual spaces, which typically are high-dimensional Euclidean spaces in which the meaning of natural language concepts can be represented using convex regions. The aim of this paper is to analyze how the restriction to convex regions constrains the realizability of networks of RCC8 relations. First, we identify all ways in which the set of RCC8 base relations can be restricted to guarantee that consistent networks can be convexly realized in respectively 1D, 2D, 3D, and 4D. Most surprisingly, we find that if the relation 'partially overlaps' is disallowed, all consistent atomic RCC8 networks can be convexly realized in 4D. If instead refinements of the relation 'part of' are disallowed, all consistent atomic RCC8 relations can be convexly realized in 3D. We furthermore show, among others, that any consistent RCC8 network with 2n+1 variables can be realized using convex regions in the n-dimensional Euclidean space.
Presence-absence reasoning for evolutionary phenotypes
Balhoff, James P., Dececchi, T. Alexander, Mabee, Paula M., Lapp, Hilmar
Nearly invariably, phenotypes are reported in the scientific literature in meticulous detail, utilizing the full expressivity of natural language. Often it is particularly these detailed observations (facts) that are of interest, and thus specific to the research questions that motivated observing and reporting them. However, research aiming to synthesize or integrate phenotype data across many studies or even fields is often faced with the need to abstract from detailed observations so as to construct phenotypic concepts that are common across many datasets rather than specific to a few. Yet, observations or facts that would fall under such abstracted concepts are typically not directly asserted by the original authors, usually because they are "obvious" according to common domain knowledge, and thus asserting them would be deemed redundant by anyone with sufficient domain knowledge. For example, a phenotype describing the length of a manual digit for an organism implicitly means that the organism must have had a hand, and thus a forelimb; the presence or absence of a forelimb may have supporting data across a far wider range of taxa than the length of a particular manual digit. Here we describe how within the Phenoscape project we use a pipeline of OWL axiom generation and reasoning steps to infer taxon-specific presence/absence of anatomical entities from anatomical phenotypes. Although presence/absence is but one, seemingly simple, way to abstract phenotypes across data sources, it can nonetheless be powerful for linking genotype to phenotype, and it is particularly relevant for constructing synthetic morphological supermatrices for comparative analysis; in fact presence/absence is one of the prevailing character observation types in published character matrices.
Speeding Up Iterative Ontology Alignment using Block-Coordinate Descent
In domains such as biomedicine, ontologies are prominently utilized for annotating data. Consequently, aligning ontologies facilitates integrating data. Several algorithms exist for automatically aligning ontologies with diverse levels of performance. As alignment applications evolve and exhibit online run time constraints, performing the alignment in a reasonable amount of time without compromising the quality of the alignment is a crucial challenge. A large class of alignment algorithms is iterative and often consumes more time than others in delivering solutions of high quality. We present a novel and general approach for speeding up the multivariable optimization process utilized by these algorithms. Specifically, we use the technique of block-coordinate descent (BCD), which exploits the subdimensions of the alignment problem identified using a partitioning scheme. We integrate this approach into multiple well-known alignment algorithms and show that the enhanced algorithms generate similar or improved alignments in significantly less time on a comprehensive testbed of ontology pairs. Because BCD does not overly constrain how we partition or order the parts, we vary the partitioning and ordering schemes in order to empirically determine the best schemes for each of the selected algorithms. As biomedicine represents a key application domain for ontologies, we introduce a comprehensive biomedical ontology testbed for the community in order to evaluate alignment algorithms. Because biomedical ontologies tend to be large, default iterative techniques find it difficult to produce a good quality alignment within a reasonable amount of time. We align a significant number of ontology pairs from this testbed using BCD-enhanced algorithms. Our contributions represent an important step toward making a significant class of alignment techniques computationally feasible.
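As a rough illustration of the block-coordinate descent idea named in this abstract, the sketch below minimizes a toy quadratic objective by cycling through blocks of variables and improving one block at a time while the others stay fixed. It is not the authors' alignment algorithm: the quadratic objective, the fixed two-block partition, and all function names (`objective`, `update_block`, `block_coordinate_descent`) are assumptions chosen only to make the technique concrete.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def objective(A: Matrix, b: Vector, x: Vector) -> float:
    """Toy objective f(x) = 0.5 * x^T A x - b^T x (A symmetric positive definite)."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    lin = sum(b[i] * x[i] for i in range(n))
    return 0.5 * quad - lin


def update_block(A: Matrix, b: Vector, x: Vector, block: List[int],
                 inner_sweeps: int = 5) -> None:
    """Approximately minimize f over the variables in `block`, others held fixed.

    Each coordinate in the block is set to its exact one-dimensional minimizer;
    a few sweeps over the block approximate the block-wise minimum.
    """
    n = len(x)
    for _ in range(inner_sweeps):
        for i in block:
            rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - rest) / A[i][i]


def block_coordinate_descent(A: Matrix, b: Vector, blocks: List[List[int]],
                             outer_iters: int = 50) -> Vector:
    """Cycle through the blocks in the given order for a fixed number of passes."""
    x = [0.0] * len(b)
    for _ in range(outer_iters):
        for block in blocks:
            update_block(A, b, x, block)
    return x


# Hypothetical 4-variable problem, partitioned into two blocks of two variables.
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0, 4.0]
x = block_coordinate_descent(A, b, blocks=[[0, 1], [2, 3]])
```

In this toy setting the partition and the block order are plain arguments, which mirrors the abstract's point that BCD does not overly constrain how the parts are chosen or ordered.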
$OntoMath^{PRO}$ Ontology: A Linked Data Hub for Mathematics
Nevzorova, Olga, Zhiltsov, Nikita, Kirillovich, Alexander, Lipachev, Evgeny
In this paper, we present an ontology of mathematical knowledge concepts that covers a wide range of the fields of mathematics and introduces a balanced representation between comprehensive and sensible models. We demonstrate the applications of this representation in information extraction, semantic search, and education. We argue that the ontology can be a core of future integration of math-aware data sets in the Web of Data and, therefore, provide mappings onto relevant datasets, such as DBpedia and ScienceWISE.
An Ontology for Open 311 Data
Nalchigar, Soroosh (University of Toronto) | Fox, Mark S. (University of Toronto)
A major challenge in the analysis of city data is the integration of data from different sources. This paper defines an ontology, called Open 311 Ontology, that provides a unified terminology and a reference model for representing the 311 data. We illustrate how the ontology can be used to map and integrate data from multiple cities, and for answering competency questions.
The D-SCRIBE Process for Building a Scalable Ontology
Schloss, Robert (IBM T. J. Watson Research Center) | Uceda-Sosa, Rosario (IBM T. J. Watson Research Center) | Srivastava, Biplav (IBM Research - India)
In this paper, we describe the D-SCRIBE process used to build ontologies that are expected to have significant domain expansion after their initial introduction and whose coverage of concepts needs to be validated for a series of related applications. This process has been used to build SCRIBE, a very modular, ambitious ontology for information about events triggered by either humans or nature, the response activities of agencies that provide public services in cities using resources and assets (land parcels, buildings, vehicles, equipment), and their communication (requests, work orders, sensor reports). SCRIBE reuses concepts from previously existing ontologies and data exchange standards, and D-SCRIBE retains traceability to these source influences.
Foundation Ontologies Requirements for Global City Indicators
Fox, Mark S. (University of Toronto)
City Indicators are metrics used to measure city performance. Global City Indicators, as developed by the Global Cities Institute at the University of Toronto, are metrics that have been agreed to by over 250 cities worldwide and have been approved as ISO 37120. The definitions of the indicators exist only in written form. The purpose of this research is to provide an ontology for representing the definition of these indicators and their instantiation by cities worldwide so that they can be shared across the Semantic Web. This paper describes the requirements for the ontology and provides an example of its use.
An Ontology for Ecological Urbanism. SUM+Ecology
Cormenzana, Berta (BCN Ecologia) | Fabregas, Ferran (BCN Ecologia) | Marinescu, Maria-Cristina (Barcelona Supercomputing Center) | Marrero, Monica (Barcelona Supercomputing Center) | Rueda, Salvador (BCN Ecologia) | Uceda-Sosa, Rosario (IBM Research)
As the complexity and abundance of city data increase, reusable semantic models that can integrate heterogeneous data sources in a lightweight manner enable a holistic view of the city data, which is key to Urban Ecology. Our multi-disciplinary team has built an ontology for Urban Ecology that not only captures a field-validated urban model and certification process, but also explores the reuse of semantic models and their interaction with domain experts.
Semantically Integrating Biomedical Databases to Support Inference
Livingston, Kevin M. (University of Colorado) | Bada, Michael (University of Colorado) | Baumgartner, William A. (University of Colorado) | Hunter, Lawrence E. (University of Colorado)
We have built KaBOB (Knowledge Base of Biomedicine) by integrating information from over 20 existing biomedical data sources about humans and seven major model organisms. The knowledge base is modeled in OWL and grounded in 14 prominent OBOs (Open Biomedical Ontologies). It is comprised of over 419 million RDF triples. Queries can be posed to KaBOB in terms of biomedical concepts and abstractions, instead of requiring knowledge of source-specific encodings and terminology.