 Ontologies


Graph-Sparse LDA: A Topic Model with Structured Sparsity

AAAI Conferences

Topic modeling is a powerful tool for uncovering latent structure in many domains, including medicine, finance, and vision. The goals for the model vary depending on the application: sometimes the discovered topics are used for prediction or another downstream task. In other cases, the content of the topic may be of intrinsic scientific interest. Unfortunately, even when one uses modern sparse techniques, discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that uses knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.


Towards Knowledge-Driven Annotation

AAAI Conferences

While the Web of data is attracting increasing interest and growing rapidly in size, multimedia documents are still the main carriers of information on the surface Web. Semantic annotation of texts is one of the main processes intended to facilitate meaning-based information exchange between computational agents. However, such annotation faces several challenges, such as the heterogeneity of natural language expressions, the heterogeneity of document structure, and context dependencies. While a broad range of annotation approaches rely mainly or partly on the target textual context to disambiguate the extracted entities, in this paper we present an approach that relies mainly on formalized knowledge expressed in RDF datasets to categorize and disambiguate noun phrases. In the proposed method, we represent the reference knowledge bases as co-occurrence matrices and the disambiguation problem as a 0-1 Integer Linear Programming (ILP) problem. The proposed approach is unsupervised and can be ported to any RDF knowledge base. The system implementing this approach, called KODA, shows very promising results w.r.t. state-of-the-art annotation tools in cross-domain experiments.


Structured Embedding via Pairwise Relations and Long-Range Interactions in Knowledge Base

AAAI Conferences

We consider the problem of embedding entities and relations of knowledge bases into low-dimensional continuous vector spaces (distributed representations). Unlike most existing approaches, which are primarily efficient for modelling pairwise relations between entities, we attempt to explicitly model both pairwise relations and long-range interactions between entities, by interpreting them as linear operators on the low-dimensional embeddings of the entities. Therefore, in this paper we introduce Path-Ranking to capture the long-range interactions of the knowledge graph while preserving its pairwise relations; we call the approach 'structured embedding via pairwise relation and long-range interactions' (referred to as SePLi). Compared with state-of-the-art models, SePLi achieves better embedding performance.


Instance-Driven Ontology Evolution in DL-Lite

AAAI Conferences

The development and maintenance of large and complex ontologies are often time-consuming and error-prone. Thus, automated ontology learning and evolution have attracted intensive research interest. In data-centric applications where ontologies are designed from the data or automatically learnt from it, when new data instances are added that contradict the ontology, it is often desirable to incrementally revise the ontology according to the added data. In description logics, this problem can be intuitively formulated as the operation of TBox contraction, i.e., rational elimination of certain axioms from the logical consequences of a TBox, performed with respect to an ABox. In this paper we introduce a model-theoretic approach to such a contraction problem by using an alternative semantic characterisation of DL-Lite TBoxes. We show that entailment checking (without necessarily first computing the contraction result) is in coNP, which does not shift the corresponding complexity in propositional logic, and the problem is tractable when the size of the new data is bounded.


Answering Conjunctive Queries over EL Knowledge Bases with Transitive and Reflexive Roles

AAAI Conferences

Answering conjunctive queries (CQs) over EL knowledge bases (KBs) with complex role inclusions is PSPACE-hard and in PSPACE in certain cases; however, if complex role inclusions are restricted to role transitivity, a tight upper complexity bound has so far been unknown. Furthermore, the existing algorithms cannot handle reflexive roles, and they are not practicable. Finally, the problem is tractable for acyclic CQs and ELH, and NP-complete for unrestricted CQs and ELHO KBs. In this paper we complete the complexity landscape of CQ answering for several important cases. In particular, we present a practicable NP algorithm for answering CQs over ELHOs KBs, a logic containing all of OWL 2 EL, but with complex role inclusions restricted to role transitivity. Our preliminary evaluation suggests that the algorithm can be suitable for practical use. Moreover, we show that, even for a restricted class of so-called arborescent acyclic queries, CQ answering over EL KBs becomes NP-hard in the presence of either transitive or reflexive roles. Finally, we show that answering arborescent CQs over ELHO KBs is tractable, whereas answering acyclic CQs is NP-hard.


Incremental Update of Datalog Materialisation: the Backward/Forward Algorithm

AAAI Conferences

Datalog-based systems often materialise all consequences of a datalog program and the data, allowing users' queries to be evaluated directly in the materialisation. This process, however, can be computationally intensive, so most systems update the materialisation incrementally when input data changes. We argue that existing solutions, such as the well-known Delete/Rederive (DRed) algorithm, can be inefficient in cases when facts have many alternate derivations. As a possible remedy, we propose a novel Backward/Forward (B/F) algorithm that tries to reduce the amount of work by a combination of backward and forward chaining. In our evaluation, the B/F algorithm was several orders of magnitude more efficient than the DRed algorithm on some inputs, and it was never significantly less efficient.



Inference Graphs: Combining Natural Deduction and Subsumption Inference in a Concurrent Reasoner

AAAI Conferences

There are very few reasoners that combine natural deduction and subsumption reasoning, and none that do so while supporting concurrency. Inference Graphs are a graph-based inference mechanism using an expressive first-order logic, capable of subsumption and natural deduction reasoning using concurrency. Evaluation of concurrency characteristics on a combined natural deduction and subsumption reasoning problem has shown linear speedup with the number of processors.


Ontology-Based Information Extraction with a Cognitive Agent

AAAI Conferences

Machine reading is a relatively new field that features computer programs designed to read flowing text and extract fact assertions expressed by the narrative content. This task involves two core technologies: natural language processing (NLP) and information extraction (IE). In this paper we describe a machine reading system that we have developed within a cognitive architecture. We show how we have integrated into the framework several levels of knowledge for a particular domain, ideas from cognitive semantics and construction grammar, plus tools from prior NLP and IE research. The result is a system that is capable of reading and interpreting complex and fairly idiosyncratic texts in the family history domain. We describe the architecture and performance of the system. After presenting the results from several evaluations that we have carried out, we summarize possible future directions.


Extracting Bounded-Level Modules from Deductive RDF Triplestores

AAAI Conferences

We present a novel semantics for extracting bounded-level modules from RDF ontologies and databases augmented with safe inference rules, à la Datalog. Dealing with a recursive rule language poses challenging issues for defining the module semantics, and also makes module extraction algorithmically unsolvable in some cases. Our results include a set of module extraction algorithms compliant with the novel semantics. Experimental results show that the resulting framework is effective in extracting expressive modules from RDF datasets with formal guarantees, whilst controlling their succinctness.