Ontologies


An innovative solution for breast cancer textual big data analysis

arXiv.org Machine Learning

The digitalization of stored information in hospitals now allows for the exploitation of medical data in text format, in the form of electronic health records (EHRs), initially gathered for purposes other than epidemiology. Manual search and analysis of such data are tedious. In recent years, natural language processing (NLP) tools have been highlighted as a way to automate the extraction of information contained in EHRs, structure it, and perform statistical analysis on the structured information. The main difficulty with existing approaches is their reliance on synonym or ontology dictionaries, which are mostly available in English only and do not include local or custom notations. In this work, a team composed of oncologists as domain experts and data scientists develops a custom NLP-based system to process and structure textual clinical reports of patients suffering from breast cancer. The tool relies on the combination of standard text mining techniques and an advanced synonym detection method. It allows for a global analysis by retrieving indicators such as medical history, tumor characteristics, therapeutic responses, recurrences and prognosis. The versatility of the method makes it easy to obtain new indicators, thus opening the way for retrospective studies with a substantial reduction in manual work. With no need for biomedical annotators or pre-defined ontologies, this language-agnostic method reached good extraction accuracy for several concepts of interest, according to a comparison with a manually structured file, without requiring any existing corpus with local or new notations.


WNtags: A Web-Based Tool For Image Labeling And Retrieval With Lexical Ontologies

arXiv.org Artificial Intelligence

The ever-growing number of image documents available on the Internet continuously motivates research into better annotation models and more efficient retrieval methods. Formal knowledge representation of objects and events in pictures, their interactions, and the complexity of their context is no longer an option for a quality image repository, but a necessity. We present an ontology-based online image annotation tool, WNtags, and demonstrate its usefulness in several typical multimedia retrieval tasks using the International Affective Picture System, an emotionally annotated image database. WNtags is built around the WordNet lexical ontology but considers the Suggested Upper Merged Ontology as the preferred labeling formalism. WNtags uses sets of weighted WordNet synsets as high-level image semantic descriptors, and query matching is performed with word stemming and node distance metrics. We also outline our near-term plans to expand image content description with induced affect, as in stimuli for research on human emotion and attention.
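As a rough illustration of matching a query against weighted tags using stemming and a node distance metric, here is a minimal Python sketch. It is not the WNtags implementation: the toy hypernym tree `PARENT`, the crude `naive_stem` function, and the scoring rule (weight divided by one plus path distance) are all assumptions made for this example, and real WordNet synsets are not used.

```python
from typing import Dict, List, Optional

# Hypothetical toy hypernym tree (child -> parent); not real WordNet data.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "puppy": "dog",
    "vehicle": "entity",
    "car": "vehicle",
}


def naive_stem(word: str) -> str:
    """Crude stand-in for a real stemmer: lowercase and strip simple plurals."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]
    return w


def ancestors(node: str) -> List[str]:
    """Return the chain from a node up to the root, starting with the node."""
    chain = []
    current: Optional[str] = node
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def node_distance(a: str, b: str) -> int:
    """Number of edges on the path between a and b via their closest common ancestor."""
    depth_from_a = {n: d for d, n in enumerate(ancestors(a))}
    for depth_from_b, n in enumerate(ancestors(b)):
        if n in depth_from_a:
            return depth_from_a[n] + depth_from_b
    raise ValueError("nodes do not share a root")


def score(tags: Dict[str, float], query: str) -> float:
    """Sum tag weights, discounted by distance from the stemmed query term."""
    term = naive_stem(query)
    if term not in PARENT:
        return 0.0
    return sum(w / (1 + node_distance(tag, term)) for tag, w in tags.items())


def rank(query: str, images: Dict[str, Dict[str, float]]) -> List[str]:
    """Return image ids ordered from best to worst match."""
    return sorted(images, key=lambda img: score(images[img], query), reverse=True)


# Hypothetical image descriptors: tag -> weight.
IMAGES = {
    "img1": {"dog": 0.8, "car": 0.2},
    "img2": {"cat": 1.0},
}
```

Calling `rank("dogs", IMAGES)` stems the query to `dog` and orders the images by their distance-discounted tag weights. A production system would replace the toy tree with a real lexical resource and a proper stemmer.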


Semantic Development and Integration of Standards for Adoption and Interoperability

IEEE Computer

Semantic applications can help commercial applications perform quickly and reliably by improving ecosystem interoperability. Converting and integrating current standards specifications to OWL models could support the adoption of semantic models, as well as machine-processable standards compliance and data interoperability.


The Data Complexity of Description Logic Ontologies

arXiv.org Artificial Intelligence

We analyze the data complexity of ontology-mediated querying where the ontologies are formulated in a description logic (DL) of the ALC family and queries are conjunctive queries, positive existential queries, or acyclic conjunctive queries. Our approach is non-uniform in the sense that we aim to understand the complexity of each single ontology instead of for all ontologies formulated in a certain language. While doing so, we quantify over the queries and are interested, for example, in the question whether all queries can be evaluated in polynomial time w.r.t. a given ontology. Our results include a PTime/coNP-dichotomy for ontologies of depth one in the description logic ALCFI, the same dichotomy for ALC- and ALCI-ontologies of unrestricted depth, and the non-existence of such a dichotomy for ALCF-ontologies. For the latter DL, we additionally show that it is undecidable whether a given ontology admits PTime query evaluation. We also consider the connection between PTime query evaluation and rewritability into (monadic) Datalog.


Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples

arXiv.org Artificial Intelligence

Most people do not interact with Semantic Web data directly. Unless they have the expertise to understand the underlying technology, they need textual or visual interfaces to help them make sense of it. We explore the problem of generating natural language summaries for Semantic Web data. This is non-trivial, especially in an open-domain context. To address this problem, we explore the use of neural networks. Our system encodes the information from a set of triples into a vector of fixed dimensionality and generates a textual summary by conditioning the output on the encoded vector. We train and evaluate our models on two corpora of loosely aligned Wikipedia snippets and DBpedia and Wikidata triples with promising results.


Position Paper: Rational Behavior Model (RBM) and Human-Robot Ethical Constraints Using Mission Execution Ontology (MEO)

AAAI Conferences

Autonomous systems can be ethically supervised by humans without constant communications. Adding constraints such as no-fly zones, time limitations, permission prerequisites etc. to mission orders allows operators to legally and ethically control mobile systems that have the potential for deliberate (or unintentional) lethal force. Ethical control can be practically achieved by providing parsable (and ethically validatable) orders to diverse unmanned systems.
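To make the idea of machine-checkable mission constraints concrete, the following Python sketch validates an order against no-fly zones, a time limit, and permission prerequisites. It is only an illustration under stated assumptions: the classes `NoFlyZone`, `MissionOrder`, and `Constraints`, the rectangular zone model, and the `validate` function are hypothetical and are not taken from the Mission Execution Ontology, which the source describes only at a high level.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class NoFlyZone:
    """Hypothetical axis-aligned rectangular exclusion area."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x: float, y: float) -> bool:
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


@dataclass
class MissionOrder:
    """Hypothetical parsed mission order."""
    waypoints: List[Tuple[float, float]]
    duration_s: float
    required_permissions: Set[str] = field(default_factory=set)


@dataclass
class Constraints:
    """Hypothetical operator-supplied constraints."""
    no_fly_zones: List[NoFlyZone]
    max_duration_s: float
    granted_permissions: Set[str]


def validate(order: MissionOrder, constraints: Constraints) -> List[str]:
    """Return a list of human-readable violations; an empty list means the order passes."""
    violations: List[str] = []

    # Only waypoints are checked here; the legs between them are not.
    for x, y in order.waypoints:
        if any(zone.contains(x, y) for zone in constraints.no_fly_zones):
            violations.append(f"waypoint ({x}, {y}) lies inside a no-fly zone")

    if order.duration_s > constraints.max_duration_s:
        violations.append("mission duration exceeds the time limit")

    missing = order.required_permissions - constraints.granted_permissions
    if missing:
        violations.append("missing permissions: " + ", ".join(sorted(missing)))

    return violations
```

An order would be released to the vehicle only when `validate` returns an empty list. Checking the path segments between waypoints, rather than the waypoints alone, would be a natural extension.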


Onboarding to Enterprise Knowledge Graphs - DATAVERSITY

@machinelearnbot

Enterprise Knowledge Graph vendors are working hard to find their place in the heart of businesses, helping them do more with and get more out of their mountains of data. Recently, for example, Stardog has adapted its leading Knowledge Graph platform to be "FIBO-aware," mapping to the Financial Industry Business Ontology (FIBO) semantic standards out-of-the-box. GraphPath launched what it says is the first Knowledge-Graph-as-a-Service (KGaaS) platform. And Maana, with its Knowledge Graph-centered Knowledge Platform, has been talking up its partnerships with clients like Shell to drive digital transformation efforts. As part of these efforts, work is underway to make it easier for businesses to adopt these solutions – for experts like data engineers who will manage the graphs, of course, but also for the business users who will consume data from them via different applications that developers create.


Social Participation Ontology: community documentation, enhancements and use examples

arXiv.org Artificial Intelligence

Participatory democracy is advancing in virtually all governments, especially in South America, which exhibits a mixed culture and social predisposition. This article presents the "Social Participation Ontology" (OPS, from the Brazilian name Ontologia de Participação Social), implemented in compliance with the Web Ontology Language standard (OWL) for fostering social participation, especially in virtual platforms. The entities and links of OPS were defined based on an extensive collaboration of specialists. It is shown that OPS is instrumental for information retrieval from the contents of the portal, both in terms of the actors (at various levels) and in terms of mechanisms and activities. Significantly, OPS is linked to other OWL ontologies as an upper ontology and via FOAF and BFO as higher upper ontologies, which yields sound organization of and access to knowledge and data. In order to illustrate the usefulness of OPS, we present results on ontological expansion and integration with other ontologies and data. Ongoing work involves further adoption of OPS by the official Brazilian federal portal for social participation and by NGOs, and further linkage to other ontologies for social participation.


DAGGER: A sequential algorithm for FDR control on DAGs

arXiv.org Machine Learning

We propose a top-down algorithm for multiple testing on directed acyclic graphs (DAGs), where nodes represent hypotheses and edges specify a partial ordering in which hypotheses must be tested. The procedure is guaranteed to reject a sub-DAG with bounded false discovery rate (FDR) while satisfying the logical constraint that a rejected node's parents must also be rejected. It is designed for sequential testing settings, in which the DAG structure is known a priori but the p-values are obtained selectively (for example, when experiments are conducted sequentially). The algorithm is also applicable in non-sequential settings, in which all p-values can be calculated in advance (such as variable/model selection). Our DAGGER algorithm, shorthand for Greedily Evolving Rejections on DAGs, allows for independence, positive or arbitrary dependence of the p-values, and is guaranteed to work on two different types of DAGs: (a) intersection DAGs in which all nodes are intersection hypotheses, with parents being supersets of children, or (b) general DAGs in which all nodes may be elementary hypotheses. The DAGGER procedure has the appealing property that it specializes to known algorithms in the special cases of trees and line graphs, and simplifies to the classic Benjamini-Hochberg procedure when the DAG has no edges. We explore the empirical performance of DAGGER using simulations, as well as a real dataset corresponding to a gene ontology DAG, showing that it performs favorably in terms of time and power.
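For readers unfamiliar with the special case mentioned above, the sketch below shows the classic Benjamini-Hochberg step-up procedure in Python. It covers only the no-edge case and is not the DAGGER algorithm itself; the function name `benjamini_hochberg` and the default `alpha` of 0.05 are choices made for this example.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a rejection flag per hypothesis, in the original input order.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= k * alpha / n, and reject the k hypotheses with the smallest p-values.
    """
    n = len(p_values)
    if n == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(n), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value is under its threshold.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / n:
            cutoff = rank

    rejected = [False] * n
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a hypothesis whose own p-value misses its threshold can still be rejected when a larger-ranked p-value passes. For example, with p-values 0.03 and 0.04 at `alpha=0.05`, the smaller one exceeds its threshold of 0.025, yet both are rejected because 0.04 is below the threshold of 0.05 at rank 2.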


a16z Podcast: The Taxonomy of Collective Knowledge – Andreessen Horowitz

@machinelearnbot

What do disease diagnostics, language learning, and image recognition have in common? All depend on the organization of collective intelligence: data ontologies. In this episode of the a16z Podcast, guests Luis von Ahn, founder of reCaptcha and Duolingo, Jay Komarneni, founder of HumanDX, a16z General Partner Vijay Pande, and a16z Partner Malinka Walaliyadde break down what data ontologies are, from the philosophical (Wittgenstein and Wikipedia!) to the practical (a doctor identifying a diagnosis), particularly as they apply to the field of healthcare and diagnosis. In fact, data ontologies not only enable human computation but also allow us to map out, structure, and scale knowledge creation online, providing order to how we organize massive amounts of information so that humans and machines can coordinate in a way that both understand.