Materializing Inferred and Uncertain Knowledge in RDF Datasets

AAAI Conferences

There is a growing need for efficient and scalable semantic web queries that handle inference. There is also a growing interest in representing uncertainty in semantic web knowledge bases. In this paper, we present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. We propose a system for materializing and storing inferred knowledge using this schema. We show experimental results that demonstrate that our solution drastically improves the performance of inference queries. We also propose a solution for materializing uncertain information and probabilities using multiple bit vectors and thresholds.

Materializing and Persisting Inferred and Uncertain Knowledge in RDF Datasets

AAAI Conferences

As the semantic web grows in popularity and enters the mainstream of computer technology, RDF (Resource Description Framework) datasets are becoming larger and more complex. Advanced semantic web ontologies, especially in medicine and science, are developing. As more complex ontologies are developed, there is a growing need for efficient queries that handle inference. In areas such as research, it is vital to be able to perform queries that retrieve not just facts but also inferred knowledge and uncertain information. OWL (Web Ontology Language) defines rules that govern provable inference in semantic web datasets. In this paper, we detail a database schema using bit vectors that is designed specifically for RDF datasets. We introduce a framework for materializing and storing inferred triples. Our bit vector schema enables storage of inferred knowledge without a query performance penalty. Inference queries are simplified and performance is improved. Our evaluation results demonstrate that our inference solution is more scalable and efficient than the current state-of-the-art. There are also standards being developed for representing probabilistic reasoning within OWL ontologies. We specify a framework for materializing uncertain information and probabilities using these ontologies. We define a multiple vector schema for representing probabilities and classifying uncertain knowledge using thresholds. This solution increases the breadth of information that can be efficiently retrieved.

Cross-Lingual Entity Linking for Web Tables

AAAI Conferences

This paper studies the problem of linking string mentions from web tables in one language to the corresponding named entities in a knowledge base written in another language, which we call the cross-lingual table linking task. We present a joint statistical model to simultaneously link all mentions that appear in one table. The framework is based on neural networks, aiming to bridge the language gap by vector space transformation and a coherence feature that captures the correlations between entities in one table. Experimental results report that our approach improves the accuracy of cross-lingual table linking by a relative gain of 12.1%. Detailed analysis of our approach also shows a positive and important gain brought by the joint framework and coherence feature.

Dynamic Bayesian Networks with Deterministic Latent Tables

Neural Information Processing Systems

The application of latent/hidden variable Dynamic Bayesian Networks isconstrained by the complexity of marginalising over latent variables. For this reason either small latent dimensions or Gaussian latentconditional tables linearly dependent on past states are typically considered in order that inference is tractable. We suggest an alternative approach in which the latent variables are modelled using deterministic conditional probability tables.

An Orthonormal Basis for Entailment

AAAI Conferences

Entailment is a logical relationship in which the truth of a proposition implies the truth of another proposition. The ability to detect entailment has applications in IR, QA, and many other areas. This study uses the vector space model to explore the relationship between cohesion and entailment. A Latent Semantic Analysis space is used to measure cohesion between propositions using vector similarity.