Rough Set Semantics for Identity on the Web

AAAI Conferences

Identity relations are at the foundation of the Linked Open Data initiative and on the Semantic Web in gen- eral. They allow the interlinking of alternative descrip- tions of the same thing. However, many practical uses of owl:sameAs are known to violate its formal seman- tics. We propose a method that assigns meaning to (the subrelations of) an identity relation using the predicates of the dataset schema. Applications of this approach include automated suggestions for asserting/retracting identity pairs and quality assessment. We also describe an experimental design for this approach.


The sameAs Problem: A Survey on Identity Management in the Web of Data

arXiv.org Artificial Intelligence

In a decentralised knowledge representation system such as the W eb of Data, it is common and indeed desirable for different knowledge graphs to overlap. Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Whilst the deductive value of such identity statements can be extremely useful in enhancing various knowledge-based systems, incorrect use of identity can have wide-ranging effects in a global knowledge space like the W eb of Data. With several works already proven that identity in the W eb is broken, this survey investigates the current state of this "sameAs problem". An open discussion highlights the main weaknesses suffered by solutions in the literature, and draws open challenges to be faced in the future.


On Kinds of Indiscernibility in Logic and Metaphysics

arXiv.org Artificial Intelligence

Using the Hilbert-Bernays account as a spring-board, we first define four ways in which two objects can be discerned from one another, using the non-logical vocabulary of the language concerned. (These definitions are based on definitions made by Quine and Saunders.) Because of our use of the Hilbert-Bernays account, these definitions are in terms of the syntax of the language. But we also relate our definitions to the idea of permutations on the domain of quantification, and their being symmetries. These relations turn out to be subtle---some natural conjectures about them are false. We will see in particular that the idea of symmetry meshes with a species of indiscernibility that we will call `absolute indiscernibility'. We then report all the logical implications between our four kinds of discernibility. We use these four kinds as a resource for stating four metaphysical theses about identity. Three of these theses articulate two traditional philosophical themes: viz. the principle of the identity of indiscernibles (which will come in two versions), and haecceitism. The fourth is recent. Its most notable feature is that it makes diversity (i.e. non-identity) weaker than what we will call individuality (being an individual): two objects can be distinct but not individuals. For this reason, it has been advocated both for quantum particles and for spacetime points. Finally, we locate this fourth metaphysical thesis in a broader position, which we call structuralism. We conclude with a discussion of the semantics suitable for a structuralist, with particular reference to physical theories as well as elementary model theory.


Beek

AAAI Conferences

Identity relations are at the foundation of many logic-based knowledge representations. We argue that the traditional notion of equality, is unsuited for many realistic knowledge representation settings. The classical interpretation of equality is too strong when the equality statements are re-used outside their original context. On the Semantic Web, equality statements are used to interlink multiple descriptions of the same object, using owl:sameAs assertions. And indeed, many practical uses of owl:sameAs are known to violate the formal Leibniz-style semantics. We provide a more flexible semantics to identity by assigning meaning to the subrelations of an identity relation in terms of the predicates that are used in a knowledge-base. Using those indiscernability-predicates, we define upper and lower approximations of equality in the style of rought-set theory, resulting in a quality-measure for identity relations.


Not Quite the Same: Identity Constraints for the Web of Linked Data

AAAI Conferences

Linked Data is based on the idea that information from different sources can flexibly be connected to enable novel applications that individual datasets do not support on their own. This hinges upon the existence of links between datasets that would otherwise be isolated. The most notable form, sameAs links, are intended to express that two identifiers are equivalent in all respects. Unfortunately, many existing ones do not reflect such genuine identity. This study provides a novel method to analyse this phenomenon, based on a thorough theoretical analysis, as well as a novel graph-based method to resolve such issues to some extent. Our experiments on a representative Web-scale set of sameAs links from the Web of Data show that our method can identify and remove hundreds of thousands of constraint violations. Linked Data is based on the idea that information from different sources can flexibly be connected to enable novel applications that individual datasets do not support on their own. This hinges upon the existence of links between datasets that would otherwise be isolated. The most notable form, sameAs links, are intended to express that two identifiers are equivalent in all respects. Unfortunately, many existing ones do not reflect such genuine identity. This study provides a novel method to analyse this phenomenon, based on a thorough theoretical analysis, as well as a novel graph-based method to resolve such issues to some extent. Our experiments on a representative Web-scale set of sameAs links from the Web of Data show that our method can identify and remove hundreds of thousands of constraint violations.