Ontologies
SPARQL is the new King of all Data Scientist's tools
Inspired by the development of semantic technologies in recent years, the traditional methodology of designing, publishing and consuming statistical datasets in the field of statistical analysis is evolving into so-called "Linked Statistical Data", which associates semantics with dimensions, attributes and observation values based on Linked Data design principles. The representation of datasets is no longer a combination of magic words and numbers. Everything becomes meaningful when URIs take their place as dereferenceable resources, which in turn establishes the relations between resources implicitly and automatically. Different datasets are no longer isolated, and all datasets share a globally, uniquely and uniformly defined structure. At this point, it is time to start building data-oriented applications and services with traditional statistical computing languages such as R, while benefiting from the omnipotent semantic power of the SPARQL query language.
Marketing vs. Machine: Are the Bots Coming for Your Job? [UML]
The Salesforce announcement of Einstein this week -- impressive as it was -- reminded me that marketers sometimes use terms like machine learning, artificial intelligence (AI) and even automation interchangeably. Businesses have long been infatuated with the word intelligence: business intelligence, relationship intelligence, and media intelligence are all overused terms from the last decade, for example. In my mind, all of these descriptors are a stretch. Intelligence means something very specific: it's the disposition, composition and strength of an enemy. Yet business executives love war analogies and here we are mashing these terms together again. We like to use the term AI because it sounds more sophisticated.
Senior Cognitive Expert/siliconarmada.com
In this role, you'll be part of our European consulting team that is helping clients to design and deliver innovative solutions based on Cognitive Computing approaches, in particular based on IBM WATSON technology. We're looking for experienced professionals who have proven expertise in one or more of the areas of Artificial Intelligence, Natural Language Processing, Semantic Technologies, Information Retrieval or Machine Learning. You'll provide advisory and implementation expertise to our clients including: use case and business case development for Cognitive Computing solutions; proof of concept execution to prove the value of Cognitive Computing use cases; solution outline and design of Cognitive Computing systems; as well as supporting business development activities. You'll have strong experience in designing and building innovative solutions based on the above technologies, but you will also have the expertise and architectural mindset to relate and integrate such solutions with existing client system infrastructures, such as e.g. Proven hands-on experience in conducting analyses on unstructured as well as structured / semi-structured data and in working with state-of-the-art technologies in Cognitive Computing will be expected. Examples of such technologies include but are not limited to: Natural Language Processing or Information Retrieval (e.g.
An Evolutionary Algorithm to Learn SPARQL Queries for Source-Target-Pairs: Finding Patterns for Human Associations in DBpedia
Hees, Jörn, Bauer, Rouven, Folz, Joachim, Borth, Damian, Dengel, Andreas
Efficient usage of the knowledge provided by the Linked Data community is often hindered by the need for domain experts to formulate the right SPARQL queries to answer questions. For new questions they have to decide which datasets are suitable and in which terminology and modelling style to phrase the SPARQL query. In this work we present an evolutionary algorithm to help with this challenging task. Given a training list of source-target node-pair examples our algorithm can learn patterns (SPARQL queries) from a SPARQL endpoint. The learned patterns can be visualised to form the basis for further investigation, or they can be used to predict target nodes for new source nodes. Amongst others, we apply our algorithm to a dataset of several hundred human associations (such as "circle - square") to find patterns for them in DBpedia. We show the scalability of the algorithm by running it against a SPARQL endpoint loaded with > 7.9 billion triples. Further, we use the resulting SPARQL queries to mimic human associations with a Mean Average Precision (MAP) of 39.9 % and a Recall@10 of 63.9 %.
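The two figures quoted at the end of the abstract, MAP and Recall@10, are standard ranking metrics. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code. The function names, the toy predictions and the assumption that each ranked list contains no duplicates are all illustrative.

```python
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant targets that appear among the top-k predictions."""
    if not relevant:
        return 0.0
    hits = sum(1 for node in ranked[:k] if node in relevant)
    return hits / len(relevant)


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of precision@rank over the ranks at which a relevant target occurs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, node in enumerate(ranked, start=1):
        if node in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def evaluate(predictions: Dict[str, List[str]],
             ground_truth: Dict[str, Set[str]],
             k: int = 10) -> Dict[str, float]:
    """Average both metrics over all source nodes in the ground truth."""
    sources = list(ground_truth)
    n = len(sources)
    map_score = sum(
        average_precision(predictions.get(s, []), ground_truth[s]) for s in sources
    ) / n
    recall = sum(
        recall_at_k(predictions.get(s, []), ground_truth[s], k) for s in sources
    ) / n
    return {"MAP": map_score, "Recall@k": recall}


# Hypothetical toy data: ranked target predictions per source node.
predictions = {
    "circle": ["square", "round", "ring"],  # correct target at rank 1
    "dog": ["bone", "cat"],                 # correct target at rank 2
}
ground_truth = {"circle": {"square"}, "dog": {"cat"}}

scores_top1 = evaluate(predictions, ground_truth, k=1)
scores_top10 = evaluate(predictions, ground_truth, k=10)
print(scores_top1, scores_top10)
```

With a single correct target per source node, as in a source-target pair, average precision reduces to the reciprocal of the rank at which that target is predicted.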
Datalog+- Ontology Consolidation
Deagustini, Cristhian Ariel D., Martinez, Maria Vanina, Falappa, Marcelo A., Simari, Guillermo R.
Knowledge bases in the form of ontologies are receiving increasing attention because they allow a clear representation of the available knowledge, which includes both the knowledge itself and the constraints imposed on it by the domain or the users. In particular, Datalog± ontologies are attractive because of their decidability and their ability to deal with the massive amounts of data found in real-world environments; however, as is the case with many other ontological languages, their application in collaborative environments often leads to inconsistency-related issues. In this paper we introduce the notion of incoherence regarding Datalog± ontologies, in terms of satisfiability of sets of constraints, and show how under specific conditions incoherence leads to inconsistent Datalog± ontologies. The main contribution of this work is a novel approach to restore both consistency and coherence in Datalog± ontologies. The proposed approach is based on kernel contraction, and restoration is performed by the application of incision functions that select formulas to delete. Nevertheless, instead of working over minimal incoherent/inconsistent sets encountered in the ontologies, our operators produce incisions over non-minimal structures called clusters. We present a construction for consolidation operators, along with the properties expected to be satisfied by them. Finally, we establish the relation between the construction and the properties by means of a representation theorem. Although this proposal is presented for the consolidation of Datalog± ontologies, these operators can be applied to other types of ontological languages, such as Description Logics, making them apt to be used in collaborative environments like the Semantic Web.
RDF Database Systems: Triples Storage and SPARQL Query Processing: Olivier Curé, Guillaume Blin: 9780127999579: Amazon.com: Books
Olivier Curé is an associate professor of computer science at the Université Paris-Est in France and is researching at the CNRS LIGM lab. He holds a Ph.D. in Artificial Intelligence from the Université de Paris V, France and has published three book chapters, eight journal articles and more than 50 papers in international, peer-reviewed conferences in the fields of databases, semantic web and ontologies. Professor Curé has organized workshops including Ambient Data Integration (ADI) at On the Move (OTM) conference in 2008, 2009 and 2010. He has received three cooperative research grants to work with the Database and Information System research team of Pr. In 2013, Professor Curé received a grant for a France-Stanford collaboration to conduct research with Stanford's BioMedical Informatics Research (BMIR) laboratory.
Computing Repairs of Inconsistent DL-Programs over EL Ontologies
Eiter, Thomas, Fink, Michael, Stepanova, Daria
Description Logic (DL) ontologies and non-monotonic rules are two prominent Knowledge Representation (KR) formalisms with complementary features that are essential for various applications. Nonmonotonic Description Logic (DL) programs combine these formalisms, thus providing support for rule-based reasoning on top of DL ontologies using a well-defined query interface represented by so-called DL-atoms. Unfortunately, interaction of the rules and the ontology may incur inconsistencies such that a DL-program lacks answer sets (i.e., models), and thus yields no information. This issue is addressed by recently defined repair answer sets, for whose computation an effective practical algorithm was proposed for DL-Lite A ontologies; it reduces a repair computation to constraint matching based on so-called support sets. However, the algorithm exploits particular features of DL-Lite A and cannot be readily applied to repairing DL-programs over other prominent DLs like EL. Compared to DL-Lite A, support sets in EL may be neither small nor few in number, and completeness of the algorithm may need to be given up when the support information is bounded. We thus provide an approach for computing repairs for DL-programs over EL ontologies based on partial (incomplete) support families. The latter are constructed using datalog query rewriting techniques as well as ontology approximation based on logical difference between EL-terminologies. We show how the maximal size and number of support sets for a given DL-atom can be estimated by analyzing the properties of a support hypergraph, which characterizes a relevant set of TBox axioms needed for query derivation. We present a declarative implementation of the repair approach and experimentally evaluate it on a set of benchmark problems; the promising results witness practical feasibility of our repair approach.
A Hybrid Approach to Query Answering under Expressive Datalog+/-
Milani, Mostafa, Cali, Andrea, Bertossi, Leopoldo
Datalog+/- is a family of ontology languages that combine good computational properties with high expressive power. Datalog+/- languages are provably able to capture the most relevant Semantic Web languages. In this paper we consider the class of weakly-sticky (WS) Datalog+/- programs, which allow for certain useful forms of joins in rule bodies as well as extending the well-known class of weakly-acyclic TGDs. So far, only non-deterministic algorithms were known for answering queries on WS Datalog+/- programs. We present novel deterministic query answering algorithms under WS Datalog+/-. In particular, we propose: (1) a bottom-up grounding algorithm based on a query-driven chase, and (2) a hybrid approach based on transforming a WS program into a so-called sticky one, for which query rewriting techniques are known. We discuss how our algorithms can be optimized and effectively applied for query answering in real-world scenarios.
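To make the term "bottom-up" more concrete, the sketch below shows naive bottom-up (forward-chaining) evaluation for plain Datalog. It is deliberately simpler than what the abstract describes: it is not the authors' query-driven chase, and it assumes that every head variable also occurs in the rule body, so it handles none of the existential rules that characterise Datalog+/-. The tuple-based representation, the `?` prefix for variables and all names are illustrative assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

Atom = Tuple[str, ...]          # (predicate, term1, term2, ...)
Rule = Tuple[Atom, List[Atom]]  # (head, body)


def is_var(term: str) -> bool:
    """Illustrative convention: variables are written with a leading '?'."""
    return term.startswith("?")


def match(pattern: Atom, fact: Atom,
          binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend `binding` so that `pattern` equals `fact`, or return None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if new.setdefault(p, f) != f:  # clashes with an earlier binding
                return None
        elif p != f:                       # constant mismatch
            return None
    return new


def body_bindings(body: List[Atom], facts: Set[Atom]) -> List[Dict[str, str]]:
    """All variable bindings satisfying every body atom (nested-loop join)."""
    bindings: List[Dict[str, str]] = [{}]
    for atom in body:
        bindings = [b2 for b in bindings for fact in facts
                    if (b2 := match(atom, fact, b)) is not None]
    return bindings


def bottom_up(rules: List[Rule], facts: Set[Atom]) -> Set[Atom]:
    """Apply all rules repeatedly until no new fact is derived."""
    derived = set(facts)
    while True:
        new_facts: Set[Atom] = set()
        for head, body in rules:
            for b in body_bindings(body, derived):
                # Assumes every head variable is bound by the body.
                fact = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if fact not in derived:
                    new_facts.add(fact)
        if not new_facts:
            return derived
        derived |= new_facts


# Hypothetical example: transitive closure of an edge relation.
facts = {("edge", "a", "b"), ("edge", "b", "c")}
rules = [
    (("path", "?x", "?y"), [("edge", "?x", "?y")]),
    (("path", "?x", "?z"), [("edge", "?x", "?y"), ("path", "?y", "?z")]),
]
result = bottom_up(rules, facts)
print(sorted(result))
```

Without existential variables the set of derivable facts is finite, so the loop always terminates. Rules with existential variables in the head can make the chase infinite, which is why restricted classes such as the weakly-sticky programs mentioned above are studied.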
BMC Bioinformatics
Precision medicine [1] has become a highly promising methodology for clinical medicine. It relies heavily on rich biomedical knowledge and on information about individual patients, such as genetic content, living habits, environmental factors, etc. [2]. The US National Academy of Sciences claims in a 2011 research report that a biomedical knowledge network based on biological data and knowledge is necessary for precision medicine [3]. How to compute relatedness between concepts and how to discover valuable information and implicit knowledge effectively and efficiently from such hybrid (both structural and non-structural) knowledge networks is of paramount importance to the realization of precision medicine, and a huge challenge facing the biomedical research community. It is generally agreed that the knowledge network should include all the knowledge sources, information systems and repositories in biomedicine available today and in the future, spanning the whole spectrum of structural and non-structural information and knowledge. One important type of knowledge source is the ontology.