Goto

Collaborating Authors

 Ontologies


Determining Multifunctional Genes and Diseases in Human Using Gene Ontology

arXiv.org Artificial Intelligence

GO has been diagnostics and drug discovery. In this paper, we further extensively used to compute the similarity between genes our previous study on gene-disease relationship (details in section 3) [19, 20]. In this work, we use the specifically with the multifunctional genes. We investigate functional annotations of a gene from the Gene Ontology the multifunctional gene-disease relationship based on the Annotation (GOA) databases to compute the shortest published molecular function annotations of genes from distance (path length) between the Molecular Function the Gene Ontology which is the most comprehensive (mf) GO terms annotating the gene.


PatientEG Dataset: Bringing Event Graph Model with Temporal Relations to Electronic Medical Records

arXiv.org Artificial Intelligence

Medical activities, such as diagnoses, medicine treatments, and laboratory tests, as well as temporal relations between these activities are the basic concepts in clinical research. However, existing relational data model on electronic medical records (EMRs) lacks explicit and accurate semantic definitions of these concepts. It leads to the inconvenience of query construction and the inefficiency of query execution where multi-table join queries are frequently required. In this paper, we propose a patient event graph (PatientEG) model to capture the characteristics of EMRs. We respectively define five types of medical entities, five types of medical events and five types of temporal relations. Based on the proposed model, we also construct a PatientEG dataset with 191,294 events, 3,429 distinct entities, and 545,993 temporal relations using EMRs from Shanghai Shuguang hospital. To help to normalize entity values which contain synonyms, hyponymies, and abbreviations, we link them with the Chinese biomedical knowledge graph. With the help of PatientEG dataset, we are able to conveniently perform complex queries for clinical research such as auxiliary diagnosis and therapeutic effectiveness analysis. In addition, we provide a SPARQL endpoint to access PatientEG dataset and the dataset is also publicly available online. Also, we list several illustrative SPARQL queries on our website.


SMILK, linking natural language and data from the web

arXiv.org Artificial Intelligence

As part of the SMILK Joint Lab, we studied the use of Natural Language Processing to: (1) enrich knowledge bases and link data on the web, and conversely (2) use this linked data to contribute to the improvement of text analysis and the annotation of textual content, and to support knowledge extraction. The evaluation focused on brand-related information retrieval in the field of cosmetics. This article describes each step of our approach: the creation of ProVoc, an ontology to describe products and brands; the automatic population of a knowledge base mainly based on ProVoc from heterogeneous textual resources; and the evaluation of an application which that takes the form of a browser plugin providing additional knowledge to users browsing the web.


More Effective Ontology Authoring with Test-Driven Development

arXiv.org Artificial Intelligence

Faculty of Computing, Poznan University of Technology, Poland, agnieszka.lawrynowicz@cs.put.poznan.pl Abstract Ontology authoring is a complex process, where commonly the automated reasoner is invoked for verification of newly introduced changes, therewith amounting to a time-consuming test-last approach. Test-Driven Development (TDD) for ontology authoring is a recent test-first approach that aims to reduce authoring time and increase authoring efficiency. Current TDD testing falls short on coverage of OWL features and possible test outcomes, the rigorous foundation thereof, and evaluations to ascertain its effectiveness. We aim to address these issues in one instantiation of TDD for ontology authoring. We first propose a succinct, logic-based model of TDD testing and present novel TDD algorithms so as to cover also any OWL 2 class expression for the TBox and for the principal ABox assertions, and prove their correctness. The algorithms use methods from the OWL API directly such that reclassification is not necessary for test execution, therewith reducing ontology authoring time. The algorithms were implemented in TDDonto2, a Protégé plugin. TDDonto2 was evaluated on editing efficiency and by users. The editing efficiency study demonstrated that it is faster than a typical ontology authoring interface, especially for medium size and large ontologies. The user evaluation demonstrated that modellers make significantly less errors with TDDonto2 compared to the standard Protégé interface and complete their tasks better using less time. Thus, the results indicate that Test-Driven Development is a promising approach in an ontology development methodology. Keywords:Ontology Engineering, Test-Driven Development, OWL 1. Introduction Ontology engineering is facilitated by methods and methodologies, andtooling support for them. The methodologies are mostly information system-like, high-level directions, such as variants on waterfall and lifecycle development [1, 2], although more recently, notions of Agile development are being ported to the ontology development setting, e.g., [3, 4], including testing in some form [5, 6, 7, 8].


Improved Knowledge Graph Embedding using Background Taxonomic Information

arXiv.org Machine Learning

Knowledge graphs are used to represent relational information in terms of triples. To enable learning about domains, embedding models, such as tensor factorization models, can be used to make predictions of new triples. Often there is background taxonomic information (in terms of subclasses and subproperties) that should also be taken into account. We show that existing fully expressive (a.k.a. universal) models cannot provably respect subclass and subproperty information. We show that minimal modifications to an existing knowledge graph completion method enables injection of taxonomic information. Moreover, we prove that our model is fully expressive, assuming a lower-bound on the size of the embeddings. Experimental results on public knowledge graphs show that despite its simplicity our approach is surprisingly effective.


Efficient Concept Induction for Description Logics

arXiv.org Artificial Intelligence

Concept Induction refers to the problem of creating complex Description Logic class descriptions (i.e., TBox axioms) from instance examples (i.e., ABox data). In this paper we look particularly at the case where both a set of positive and a set of negative instances are given, and complex class expressions are sought under which the positive but not the negative examples fall. Concept induction has found applications in ontology engineering, but existing algorithms have fundamental performance issues in some scenarios, mainly because a high number of invokations of an external Description Logic reasoner is usually required. In this paper we present a new algorithm for this problem which drastically reduces the number of reasoner invokations needed. While this comes at the expense of a more limited traversal of the search space, we show that our approach improves execution times by up to several orders of magnitude, while output correctness, measured in the amount of correct coverage of the input instances, remains reasonably high in many cases. Our approach thus should provide a strong alternative to existing systems, in particular in settings where other systems are prohibitively slow.


Project Rosetta: A Childhood Social, Emotional, and Behavioral Developmental Ontology

arXiv.org Artificial Intelligence

There is a wide array of existing instruments used to assess childhood behavior and development for the evaluation of social, emotional and behavioral disorders. Many of these instruments either focus on one diagnostic category or encompass a broad set of childhood behaviors. We built an extensive ontology of the questions associated with key features that have diagnostic relevance for child behavioral conditions, such as Autism Spectrum Disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), and anxiety, by incorporating a subset of existing child behavioral instruments and categorizing each question into clinical domains. Each existing question and set of question responses were then mapped to a new unique Rosetta question and set of answer codes encompassing the semantic meaning and identified concept(s) of as many existing questions as possible. This resulted in 1274 existing instrument questions mapping to 209 Rosetta questions creating a minimal set of questions that are comprehensive of each topic and subtopic. This resulting ontology can be used to create more concise instruments across various ages and conditions, as well as create more robust overlapping datasets for both clinical and research use.


Constructing Ontology-Based Cancer Treatment Decision Support System with Case-Based Reasoning

arXiv.org Artificial Intelligence

Decision support is a probabilistic and quantitative method designed for modeling problems in situations with ambiguity. Computer technology can be employed to provide clinical decision support and treatment recommendations. The problem of natural language applications is that they lack formality and the interpretation is not consistent. Conversely, ontologies can capture the intended meaning and specify modeling primitives. Disease Ontology (DO) that pertains to cancer's clinical stages and their corresponding information components is utilized to improve the reasoning ability of a decision support system (DSS). The proposed DSS uses Case-Based Reasoning (CBR) to consider disease manifestations and provides physicians with treatment solutions from similar previous cases for reference. The proposed DSS supports natural language processing (NLP) queries. The DSS obtained 84.63% accuracy in disease classification with the help of the ontology.


SWRL2SPIN: A tool for transforming SWRL rule bases in OWL ontologies to object-oriented SPIN rules

arXiv.org Artificial Intelligence

Semantic Web Rule Language (SWRL) combines OWL (Web Ontology Language) ontologies with Horn Logic rules of the Rule Markup Language (RuleML) family. Being supported by ontology editors, rule engines and ontology reasoners, it has become a very popular choice for developing rule-based applications on top of ontologies. However, SWRL is probably not go-ing to become a WWW Consortium standard, prohibiting industrial acceptance. On the other hand, SPIN (SPARQL Inferencing Notation) has become a de-facto industry standard to rep-resent SPARQL rules and constraints on Semantic Web models, building on the widespread acceptance of SPARQL (SPARQL Protocol and RDF Query Language). In this paper, we ar-gue that the life of existing SWRL rule-based ontology applications can be prolonged by con-verting them to SPIN. To this end, we have developed the SWRL2SPIN tool in Prolog that transforms SWRL rules into SPIN rules, considering the object-orientation of SPIN, i.e. linking rules to the appropriate ontology classes and optimizing them, as derived by analysing the rule conditions.


Consequence-Based Reasoning for Description Logics with Disjunctions and Number Restrictions

Journal of Artificial Intelligence Research

Classification of description logic (DL) ontologies is a key computational problem in modern data management applications, so considerable effort has been devoted to the development and optimisation of practical reasoning calculi. Consequence-based calculi combine ideas from hypertableau and resolution in a way that has proved very effective in practice. However, existing consequence-based calculi can handle either Horn DLs (which do not support disjunction) or DLs without number restrictions. In this paper, we overcome this important limitation and present the first consequence-based calculus for deciding concept subsumption in the DL ALCHIQ+. Our calculus runs in exponential time assuming unary coding of numbers, and on ELH ontologies it runs in polynomial time. The extension to disjunctions and number restrictions is technically involved: we capture the relevant consequences using first-order clauses, and our inference rules adapt paramodulation techniques from first-order theorem proving. By using a well-known preprocessing step, the calculus can also decide concept subsumptions in SRIQ---a rich DL that covers all features of OWL 2 DL apart from nominals and datatypes. We have implemented our calculus in a new reasoner called Sequoia. We present the architecture of our reasoner and discuss several novel and important implementation techniques such as clause indexing and redundancy elimination. Finally, we present the results of an extensive performance evaluation, which revealed Sequoia to be competitive with existing reasoners. Thus, the calculus and the techniques we present in this paper provide an important addition to the repertoire of practical implementation techniques for description logic reasoning.