Ontologies
Discovering Beaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains
Walk, Simon, Singer, Philipp, Strohmaier, Markus, Tudorache, Tania, Musen, Mark A., Noy, Natalya F.
Biomedical taxonomies, thesauri and ontologies, such as the International Classification of Diseases (ICD) as a taxonomy or the National Cancer Institute Thesaurus as an OWL-based ontology, play a critical role in acquiring, representing and processing information about human health. With increasing adoption and relevance, biomedical ontologies have also significantly increased in size. For example, the 11th revision of the ICD, which is currently under active development by the WHO, contains nearly 50,000 classes representing a vast variety of different diseases and causes of death. This evolution in terms of size was accompanied by an evolution in the way ontologies are engineered. Because no single individual has the expertise to develop such large-scale ontologies, ontology-engineering projects have evolved from small-scale efforts involving just a few domain experts to large-scale projects that require effective collaboration between dozens or even hundreds of experts, practitioners and other stakeholders. Understanding how these stakeholders collaborate will enable us to improve editing environments that support such collaborations. We uncover how large ontology-engineering projects, such as the ICD in its 11th revision, unfold by analyzing usage logs of five different biomedical ontology-engineering projects of varying sizes and scopes using Markov chains. We discover intriguing interaction patterns (e.g., which properties users subsequently change) that suggest that large collaborative ontology-engineering projects are governed by a few general principles that determine and drive development. From our analysis, we identify commonalities and differences between different projects that have implications for project managers, ontology editors, developers and contributors working on collaborative ontology-engineering projects and tools in the biomedical domain.
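As a rough illustration of what analyzing usage logs with Markov chains can look like, the following minimal sketch estimates first-order transition probabilities between successively edited properties. The session data, property names and function name are hypothetical, and the first-order model is an assumption; the abstract does not say which chain order or estimation procedure the authors use.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate first-order Markov transition probabilities from action sequences."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Count each (current action, next action) pair within a session.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    probs = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts so each row sums to 1.
        probs[state] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical change-log sessions: each list is the sequence of
# properties that one user edited, in order.
sessions = [
    ["title", "definition", "synonym"],
    ["title", "definition", "definition"],
    ["title", "synonym"],
]
probs = transition_probabilities(sessions)
```

In this toy log, an edit to `title` is followed by an edit to `definition` in two of three cases. Rows with one dominant successor are the kind of "beaten path" the title alludes to.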
Module Extraction in Expressive Ontology Languages via Datalog Reasoning
Armas Romero, Ana, Kaminski, Mark, Cuenca Grau, Bernardo, Horrocks, Ian
Module extraction is the task of computing a (preferably small) fragment M of an ontology T that preserves a class of entailments over a signature of interest S. Extracting modules of minimal size is well-known to be computationally hard, and often algorithmically infeasible, especially for highly expressive ontology languages. Thus, practical techniques typically rely on approximations, where M provably captures the relevant entailments, but is not guaranteed to be minimal. Existing approximations ensure that M preserves all second-order entailments of T w.r.t. S, which is a stronger condition than is required in many applications, and may lead to unnecessarily large modules in practice. In this paper we propose a novel approach in which module extraction is reduced to a reasoning problem in datalog. Our approach generalises existing approximations in an elegant way. More importantly, it allows extraction of modules that are tailored to preserve only specific kinds of entailments, and thus are often significantly smaller. Our evaluation on a wide range of ontologies confirms the feasibility and benefits of our approach in practice.
Ontology Re-Engineering: A Case Study from the Automotive Industry
Rychtyckyj, Nestor (Ford Motor Company) | Raman, Venkatesh (Ford Motor Company) | Sankaranarayanan, Baskaran (Indian Institute of Technology Madras) | Kumar, P. Sreenivasa (Indian Institute of Technology Madras) | Khemani, Deepak (Indian Institute of Technology Madras)
For over twenty-five years Ford has been utilizing an AI-based system to manage process planning for vehicle assembly at our assembly plants around the world. The scope of the AI system, known originally as the Direct Labor Management System and now as the Global Study Process Allocation System (GSPAS), has increased over the years to include additional functionality on Ergonomics and Powertrain Assembly (Engine and Transmission plants). The knowledge about Ford’s manufacturing processes is contained in an ontology originally developed using the KL-ONE representation language and methodology. To preserve the viability of the GSPAS ontology and to make it easily usable for other applications within Ford, we needed to re-engineer and convert the KL-ONE ontology into a semantic web OWL/RDF format. In this paper, we will discuss the process by which we re-engineered the existing GSPAS KL-ONE ontology and deployed semantic web technology in our application.
Effectiveness of Automatic Translations for Cross-Lingual Ontology Mapping
Abu Helou, Mamoun, Palmonari, Matteo, Jarrar, Mustafa
Accessing or integrating data lexicalized in different languages is a challenge. Multilingual lexical resources play a fundamental role in reducing the language barriers to map concepts lexicalized in different languages. In this paper we present a large-scale study on the effectiveness of automatic translations to support two key cross-lingual ontology mapping tasks: the retrieval of candidate matches and the selection of the correct matches for inclusion in the final alignment. We conduct our experiments using four different large gold standards, each one consisting of a pair of mapped wordnets, to cover four different families of languages. We categorize concepts based on their lexicalization (type of words, synonym richness, position in a subconcept graph) and analyze their distributions in the gold standards. Leveraging this categorization, we measure several aspects of translation effectiveness, such as word-translation correctness, word sense coverage, synset and synonym coverage. Finally, we thoroughly discuss several findings of our study, which we believe are helpful for the design of more sophisticated cross-lingual mapping algorithms.
Test-Driven Development of ontologies (extended version)
Keet, C. Maria, Lawrynowicz, Agnieszka
Emerging ontology authoring methods to add knowledge to an ontology focus on ameliorating the validation bottleneck. Verifying a newly added axiom is still a matter of trying it and seeing what the reasoner says, because a systematic testbed for ontology authoring is missing. We sought to address this by introducing the approach of test-driven development (TDD) for ontology authoring. We specify 36 generic tests, as TBox queries and TBox axioms tested through individuals, and structure their inner workings in an 'open box' way; together the tests cover the OWL 2 DL language features. This is implemented as a Protege plugin so that one can perform a TDD test as a black box test. We evaluated the two test approaches on their performance. The TBox queries were faster, and that effect is more pronounced the larger the ontology is. We provide a general sequence of a TDD process for ontology engineering as a foundation for a TDD methodology.
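The test-first cycle described above can be sketched on a toy ontology that contains only subclass axioms. This is an illustrative assumption, not one of the 36 tests and not the Protege plugin: the reachability check stands in for a real reasoner, and all class names and function names are hypothetical.

```python
def entails_subclass(axioms, sub, sup):
    """Return True if 'sub is a subclass of sup' follows from the asserted
    subclass axioms by reflexivity and transitivity (a stand-in for a reasoner)."""
    seen, stack = set(), [sub]
    while stack:
        cls = stack.pop()
        if cls == sup:
            return True
        if cls in seen:
            continue
        seen.add(cls)
        # Follow every asserted superclass of the current class.
        stack.extend(b for (a, b) in axioms if a == cls)
    return False


# Hypothetical toy ontology: each pair (A, B) asserts "A is a subclass of B".
axioms = {("Mammal", "Animal")}

# Step 1: write the test first; it is expected to fail on the current ontology.
test = ("Lion", "Animal")
before = entails_subclass(axioms, *test)

# Step 2: add the new axiom.
axioms.add(("Lion", "Mammal"))

# Step 3: re-run the same test; it should now pass.
after = entails_subclass(axioms, *test)
```

Here `before` is `False` and `after` is `True`, mirroring the fail-then-pass rhythm of test-driven development in software engineering.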
Pay-As-You-Go Description Logic Reasoning by Coupling Tableau and Saturation Procedures
Steigmiller, Andreas, Glimm, Birte
Nowadays, saturation-based reasoners for the OWL EL profile of the Web Ontology Language are able to handle large ontologies such as SNOMED very efficiently. However, it is currently unclear how saturation-based reasoning procedures can be extended to very expressive Description Logics such as SROIQ--the logical underpinning of the current and second iteration of the Web Ontology Language. Tableau-based procedures, on the other hand, are not limited to specific Description Logic languages or OWL profiles, but even highly optimised tableau-based reasoners might not be efficient enough to handle large ontologies such as SNOMED. In this paper, we present an approach for tightly coupling tableau- and saturation-based procedures that we implement in the OWL DL reasoner Konclude. Our detailed evaluation shows that this combination significantly improves the reasoning performance for a wide range of ontologies.
PAGOdA: Pay-As-You-Go Ontology Query Answering Using a Datalog Reasoner
Zhou, Yujiao, Cuenca Grau, Bernardo, Nenov, Yavor, Kaminski, Mark, Horrocks, Ian
Answering conjunctive queries over ontology-enriched datasets is a core reasoning task for many applications. Query answering is, however, computationally very expensive, which has led to the development of query answering procedures that sacrifice either expressive power of the ontology language, or the completeness of query answers in order to improve scalability. In this paper, we describe a hybrid approach to query answering over OWL 2 ontologies that combines a datalog reasoner with a fully-fledged OWL 2 reasoner in order to provide scalable 'pay-as-you-go' performance. The key feature of our approach is that it delegates the bulk of the computation to the datalog reasoner and resorts to expensive OWL 2 reasoning only as necessary to fully answer the query. Furthermore, although our main goal is to efficiently answer queries over OWL 2 ontologies and data, our technical results are very general and our approach is applicable to first-order knowledge representation languages that can be captured by rules allowing for existential quantification and disjunction in the head; our only assumption is the availability of a datalog reasoner and a fully-fledged reasoner for the language of interest, both of which are used as 'black boxes'. We have implemented our techniques in the PAGOdA system, which combines the datalog reasoner RDFox and the OWL 2 reasoner HermiT. Our extensive evaluation shows that PAGOdA succeeds in providing scalable pay-as-you-go query answering for a wide range of OWL 2 ontologies, datasets and queries.
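One way to picture the delegation described above is a cheap reasoner that supplies a set of certain answers and a superset of all possible answers, so that the expensive reasoner only has to check the candidates in between. The sketch below assumes this bound-based design, which the abstract does not spell out; the function names and data are hypothetical, and no RDFox or HermiT API is used.

```python
def pay_as_you_go_answers(lower_bound, upper_bound, full_check):
    """Combine cheap bounds with an expensive check on the remaining gap.

    lower_bound: answers the cheap reasoner guarantees are correct.
    upper_bound: a superset of all correct answers from the cheap reasoner.
    full_check:  expensive black-box test, called only for gap candidates.
    """
    answers = set(lower_bound)
    gap = set(upper_bound) - answers
    # Sort the gap only to make the order of expensive calls deterministic.
    answers.update(c for c in sorted(gap) if full_check(c))
    return answers


# Hypothetical data: the expensive check records every call it receives.
calls = []


def expensive_check(candidate):
    calls.append(candidate)
    return candidate == "c"


result = pay_as_you_go_answers({"a", "b"}, {"a", "b", "c", "d"}, expensive_check)
```

In this toy run the expensive check is called for `c` and `d` only. If the two bounds coincide, it is never called at all, which is the sense in which the cost is paid only as needed.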
Information retrieval in folktales using natural language processing
Recognising literary characters in various narrative texts is challenging both from the literary and technical perspective. From the literary viewpoint, the meaning of the term "character" leaves room for various interpretations. From the technical perspective, literary texts contain a lot of data about emotions, social life or inner life of the characters, while they are very thin on technical, straightforward messages. Inferring the character type from literary texts might pose problems even to human readers [4]. Interactions between literary characters contain rich social networks.
SWISH: SWI-Prolog for Sharing
Wielemaker, Jan, Lager, Torbjörn, Riguzzi, Fabrizio
Recently, a new type of programming interface based on web technology has emerged, for example JSFiddle, IPython Notebook and R-studio. Web technology enables cloud-based solutions, embedding in tutorial web pages, attractive rendering of results, web-scale cooperative development, etc. This article describes SWISH, a web front-end for Prolog. A public website exposes SWI-Prolog using SWISH, which is used to run small Prolog programs for demonstration, experimentation and education. We connected SWISH to the ClioPatria semantic web toolkit, where it allows for collaborative development of programs and queries related to a dataset as well as performing maintenance tasks on the running server, and we embedded SWISH in the Learn Prolog Now! online Prolog book.
MARTHA Speaks: Implementing Theory of Mind for More Intuitive Communicative Acts
Gmytrasiewicz, Piotr (University of Illinois at Chicago) | Moe, George Herbert (Illinois Mathematics and Science Academy) | Moreno, Adolfo (University of Illinois at Chicago)
The theory of mind is an important human capability that allows us to understand and predict the goals, intents, and beliefs of other individuals. We present an approach to designing intelligent communicative agents based on modeling theories of mind. This can be tricky because other agents may also have their own theories of mind of the first agent, meaning that these mental models are naturally nested in layers. So, to look for intuitive communicative acts, we recursively apply a planning algorithm in each of these nested layers, looking for possible plans of action as well as their hypothetical consequences, which include the reactions of other agents; we propose that truly intelligent communicative acts are the ones which produce a state of maximum decision theoretic utility according to the entire theory of mind. We implement these ideas using Java and OpenCyc in an attempt to create an assistive AI we call MARTHA. We demonstrate MARTHA's capabilities with two motivating examples: helping the user buy a sandwich and helping the user search for an activity. We see that, in addition to being a personal assistant, MARTHA can be extended to other assistive fields, such as finance, research, and government.