Cimiano, Philipp
Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks
Plenz, Moritz, Opitz, Juri, Heinisch, Philipp, Cimiano, Philipp, Frank, Anette
Arguments often do not make explicit how a conclusion follows from its premises. To compensate for this lack, we enrich arguments with structured background knowledge to support knowledge-intense argumentation tasks. We present a new unsupervised method for constructing Contextualized Commonsense Knowledge Graphs (CCKGs) that selects contextually relevant knowledge from large knowledge graphs (KGs) efficiently and at high quality. Our work goes beyond context-insensitive knowledge extraction heuristics by computing semantic similarity between KG triplets and textual arguments. Using these triplet similarities as weights, we extract contextualized knowledge paths that connect a conclusion to its premise, while maximizing similarity to the argument. We combine multiple paths into a CCKG that we optionally prune to reduce noise and raise precision. Intrinsic evaluation of the quality of our graphs shows that our method is effective for (re)constructing human explanation graphs. Manual evaluations in a large-scale knowledge selection setup confirm high recall and precision of implicit commonsense knowledge (CSK) in the CCKGs. Finally, we demonstrate the effectiveness of CCKGs in a knowledge-intense argument quality rating task, outperforming strong baselines and rivaling a GPT-3 based system.
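The path extraction step described in the abstract above can be illustrated with a minimal sketch. It assumes that each KG triplet already carries a similarity score to the argument and that an edge costs 1 minus that score, so the cheapest path is the one most similar to the argument. The cost function, the function name `best_path`, and the toy triplets and scores are all illustrative assumptions, not the authors' implementation.

```python
import heapq

def best_path(triples, source, target):
    """Dijkstra over an undirected graph built from (head, relation, tail, sim).

    Assumption for this sketch: edge cost = 1 - similarity, so paths made of
    triplets that are similar to the argument are preferred.
    """
    graph = {}
    for head, relation, tail, sim in triples:
        cost = 1.0 - sim
        graph.setdefault(head, []).append((tail, relation, cost))
        graph.setdefault(tail, []).append((head, relation, cost))

    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, _relation, step in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None  # no connection between the two concepts

# Toy triplets with made-up similarity scores to some argument.
toy_triples = [
    ("exercise", "causes", "fitness", 0.9),
    ("fitness", "causes", "health", 0.8),
    ("exercise", "related_to", "gym", 0.3),
    ("gym", "related_to", "health", 0.2),
]
total_cost, path = best_path(toy_triples, "exercise", "health")
print(path, round(total_cost, 2))
```

In this toy graph the route through "fitness" is chosen over the route through "gym" because its triplets have higher similarity scores and therefore lower cost.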
Evaluating Architectural Choices for Deep Learning Approaches for Question Answering over Knowledge Bases
Hakimov, Sherzod, Jebbara, Soufian, Cimiano, Philipp
The task of answering natural language questions over knowledge bases has received wide attention in recent years. Various deep learning architectures have been proposed for this task. However, architectural design choices are typically neither systematically compared nor evaluated under the same conditions. In this paper, we contribute to a better understanding of the impact of architectural design choices by evaluating four different architectures under the same conditions. We address the task of answering simple questions, which consists of predicting the subject and predicate of a triple given a question. In order to provide a fair comparison of different architectures, we evaluate them under the same strategy for inferring the subject, and compare different architectures for inferring the predicate. The architecture for inferring the subject is based on a standard LSTM model trained to recognize the span of the subject in the question and on a linking component that links the subject span to an entity in the knowledge base. The architectures for predicate inference are based on i) a standard softmax classifier ranging over all predicates as output, ii) a model that predicts a low-dimensional encoding of the property given entity representation and question, iii) a model that learns to score a pair of subject and predicate given the question, as well as iv) a model based on the well-known FastText model. The comparison of architectures shows that FastText provides better results than other architectures.
Towards Action Representation within the Framework of Conceptual Spaces: Preliminary Results
Beyer, Oliver (CITEC Bielefeld University) | Griffiths, Sascha (CITEC, Bielefeld University) | Cimiano, Philipp (CITEC, Bielefeld University)
We propose an approach for the representation of actions based on the conceptual spaces framework developed by Gärdenfors (2004). Action categories are regarded as properties in the sense of Gärdenfors (2011) and are understood as convex regions in action space. Action categories are mainly described by a force signature that represents the forces that act upon a main trajector involved in the action. This force signature is approximated via a representation that specifies the time-indexed position of the trajector relative to several landmarks. We also present a computational approach to extract such representations from video data. We present results on the Motionese dataset consisting of videos of parents demonstrating actions on objects to their children. We evaluate the representations on a clustering and a classification task showing that, while our representations seem to be reasonable, only a handful of actions can be discriminated reliably.
A Systematic Investigation of Blocking Strategies for Real-Time Classification of Social Media Content into Events
Reuter, Timo (CITEC, Universität Bielefeld) | Cimiano, Philipp (CITEC, Universität Bielefeld)
Events play a prominent role in our lives, such that many social media documents describe or are related to some event. Organizing social media documents with respect to events thus seems a promising approach to better manage and organize the ever-increasing amount of user-generated content in social media applications. It would support the navigation of data by events or allow one to get notified about new postings related to the events one is interested in, just to name two applications. A challenge is to automate this process so that incoming documents can be assigned to their corresponding event without any user intervention. We present a system that is able to classify a stream of social media data into a growing and evolving set of events. In order to scale up to the data sizes and data rates in social media applications, the use of a candidate retrieval or blocking step is crucial to reduce the number of events that are considered as potential candidates to which the incoming data point could belong. In this paper we present and experimentally compare different blocking strategies along their cost vs. effectiveness tradeoff. We show that using a blocking strategy that selects the 60 closest events with respect to upload time, we reach F-Measures of about 85.1% while being able to process the incoming documents within 32ms on average. We thus provide a principled approach for scaling up the classification of social media documents into events and for processing the incoming stream of documents in real time.
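The blocking idea in the abstract above (compare an incoming document only against the k events closest in upload time) can be sketched as follows. The `Event` fields, the Jaccard tag overlap used as the similarity, the threshold, and all function names are hypothetical stand-ins chosen for illustration; the abstract does not specify the classifier or features that the system actually uses.

```python
from dataclasses import dataclass

@dataclass
class Event:
    event_id: str
    mean_upload_time: float  # hypothetical field: average upload time in seconds
    tags: frozenset          # hypothetical field: tags seen for this event

def block_by_upload_time(events, upload_time, k):
    """Blocking step: keep only the k events closest in upload time."""
    return sorted(events, key=lambda e: abs(e.mean_upload_time - upload_time))[:k]

def jaccard(a, b):
    """Stand-in similarity (assumption): overlap of two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def assign(events, upload_time, tags, k, threshold):
    """Score only the blocked candidates; return an event id or None."""
    candidates = block_by_upload_time(events, upload_time, k)
    best = max(candidates, key=lambda e: jaccard(e.tags, tags), default=None)
    if best is not None and jaccard(best.tags, tags) >= threshold:
        return best.event_id
    return None  # in a full system this would open a new event

events = [
    Event("A", 100.0, frozenset({"concert", "berlin"})),
    Event("B", 200.0, frozenset({"football"})),
    Event("C", 10000.0, frozenset({"concert", "berlin", "live"})),
]
match = assign(events, 150.0, frozenset({"concert", "berlin"}), k=2, threshold=0.5)
print(match)
```

With k=2 the distant event "C" is never scored, which is the source of the cost savings; the price is that a true match outside the block would be missed, which is the cost vs. effectiveness tradeoff the abstract refers to.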
Scalable Event-Based Clustering of Social Media Via Record Linkage Techniques
Reuter, Timo (CITEC, University of Bielefeld) | Cimiano, Philipp (CITEC, University of Bielefeld) | Drumond, Lucas (University of Hildesheim) | Buza, Krisztian (University of Hildesheim) | Schmidt-Thieme, Lars (University of Hildesheim)
We tackle the problem of grouping content available in social media applications such as Flickr, YouTube, Panoramio etc. into clusters of documents describing the same event. This task has been referred to as event identification before. We present a new formalization of the event identification task as a record linkage problem and show that this formulation leads to a principled and highly efficient solution to the problem. We present results on two datasets derived from Flickr (last.fm and upcoming), comparing the results in terms of Normalized Mutual Information and F-Measure with respect to several baselines, showing that a record linkage approach outperforms all baselines as well as a state-of-the-art system. We demonstrate that our approach can scale to large amounts of data, reducing the processing time considerably compared to a state-of-the-art approach. The scalability is achieved by applying an appropriate blocking strategy and relying on a Single Linkage clustering algorithm which avoids the exhaustive computation of pairwise similarities.
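The combination of blocking and single linkage mentioned in the abstract above can be illustrated with a small sketch. Single linkage at a fixed threshold is equivalent to taking connected components over all pairs judged similar, so a union-find structure suffices, and blocking restricts which pairs are compared at all. The blocking key (capture date), the word-overlap test, and the toy records are assumptions for illustration only and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations

def single_linkage_with_blocking(records, block_key, similar):
    """Cluster records by linking similar pairs, comparing only within blocks."""
    parent = list(range(len(records)))

    def find(i):
        # Find the cluster representative, shortening the chain as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Blocking: group record indices by a cheap key.
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[block_key(rec)].append(idx)

    # Pairwise comparison happens only inside each block.
    for members in blocks.values():
        for i, j in combinations(members, 2):
            if similar(records[i], records[j]):
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())

# Toy records: (capture date, title). Both fields are made up.
records = [
    ("2011-05-01", "rock concert"),
    ("2011-05-01", "rock concert berlin"),
    ("2011-05-01", "wedding"),
    ("2011-06-10", "rock concert"),
]

def share_two_words(a, b):
    return len(set(a[1].split()) & set(b[1].split())) >= 2

clusters = single_linkage_with_blocking(records, lambda r: r[0], share_two_words)
print(clusters)
```

Records 0 and 3 have identical titles but fall into different blocks, so they are never compared; whether that is desirable depends entirely on how well the blocking key matches the notion of an event.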