There is a growing need for scalable semantic web repositories which support inference and provide efficient queries. There is also a growing interest in representing uncertain knowledge in semantic web datasets and ontologies. In this paper, I present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. I propose a system for materializing and storing inferred knowledge using this schema. I show experimental results that demonstrate that this solution simplifies inference queries and drastically improves results. I also propose and describe a solution for materializing and persisting uncertain information and probabilities. Thresholds and bit vectors are used to provide efficient query access to this uncertain knowledge. My goal is to provide a semantic web repository that supports knowledge inference, uncertainty reasoning, and Bayesian networks, without sacrificing performance or scalability.
The most widely accepted defining feature of the semantic web is machine-usable content. By this definition, the semantic web is already manifest in shopping agents that automatically access and use web content to find the lowest air fares or book prices. However, where are the semantics? Most people regard the semantic web as a vision, not a reality -- so shopping agents should not "count." To use web content, machines need to know what to do when they encounter it, which, in turn, requires the machine to know what the content means (that is, its semantics). The challenge of developing the semantic web is how to put this knowledge into the machine. The manner in which it is done is at the heart of the confusion about the semantic web. The goal of this article is to clear up some of this confusion. I explain that shopping agents work in the complete absence of any explicit account of the semantics of web content because the meaning of the web content that the agents are expected to encounter can be determined by the human programmers who hardwire it into the web application software. I therefore regard shopping agents as a degenerate case of the semantic web. I note various shortcomings of this approach. I conclude by presenting some ideas about how the semantic web will likely evolve.
To appear in Theory and Practice of Logic Programming (TPLP), 2008. We are researching the interaction between the rule and the ontology layers of the Semantic Web, by comparing two options: 1) using OWL and its rule extension SWRL to develop an integrated ontology/rule language, and 2) layering rules on top of an ontology with RuleML and OWL. Toward this end, we are developing the SWORIER system, which enables efficient automated reasoning on ontologies and rules, by translating all of them into Prolog and adding a set of general rules that properly capture the semantics of OWL. We have also enabled the user to make dynamic changes on the fly, at run time. This work addresses several of the concerns expressed in previous work, such as negation, complementary classes, disjunctive heads, and cardinality, and it discusses alternative approaches for dealing with inconsistencies in the knowledge base. In addition, for efficiency, we implemented techniques called extensionalization, avoiding reanalysis, and code minimization.
Identity relations are at the foundation of many logic-based knowledge representations. We argue that the traditional notion of equality, is unsuited for many realistic knowledge representation settings. The classical interpretation of equality is too strong when the equality statements are re-used outside their original context. On the Semantic Web, equality statements are used to interlink multiple descriptions of the same object, using owl:sameAs assertions. And indeed, many practical uses of owl:sameAs are known to violate the formal Leibniz-style semantics. We provide a more flexible semantics to identity by assigning meaning to the subrelations of an identity relation in terms of the predicates that are used in a knowledge-base. Using those indiscernability-predicates, we define upper and lower approximations of equality in the style of rought-set theory, resulting in a quality-measure for identity relations.
The Semantic Web language RDF was designed to unambiguously define and use ontologies to encode data and knowledge on the Web. Many people find it difficult, however, to write complex RDF statements and queries because doing so requires familiarity with the appropriate ontologies and the terms they define. We describe a system that suggests appropriate RDF terms given semantically related English words and general domain and context information. We use the Swoogle Semantic Web search engine to provide RDF term and namespace statistics, the WorldNet lexical ontology to find semantically related words, and a naïve Bayes classifier to suggest terms. A customized graph data structure of related namespaces is constructed from Swoogle's database to speed up the classifier model learning and prediction time.