Suggesting New Plot Elements for an Interactive Story

AAAI Conferences

We present a system that uses evolutionary optimization to suggest new story-world events that, if added to an existing interactive story, would most improve the average interactive experience, according to author-supplied criteria. In doing so, we aim to apply some of the ideas from drama-managed storytelling, such as authorial aesthetic control, in an unguided setting more akin to emergent storytelling: rather than guiding or directing a player towards an experience in line with an author's aesthetic goals, the storyworld is augmented with new content in a way that will tend to align with an author's goals, even if the player is not guided. In this paper, we present an offline system, and demonstrate its robustness to a number of variations in authorial criteria and player-model assumptions. This is intended to lay the groundwork for a future system that would generate new content online, allowing for interactive stories larger than those explicitly written by the author.


Corpus Annotation in Service of Intelligent Narrative Technologies

AAAI Conferences

Annotated corpora have stimulated great advances in the language sciences. The time is ripe to bring that same stimulation, and consequent benefits, to computational approaches to narrative. I describe an effort to construct a corpus of semantically annotated stories. I outline the structure of the corpus, a structure which colloquially can be described as a "handful of handfuls." One handful of the corpus has already been constructed, viz., 18k words of Russian folktales. There are two handfuls under construction: legal cases focused on the area of probable cause, and stories from Islamist Extremist Jihadists. Four more handfuls are being planned: folktales from Chinese, English, and a West Asian culture, and stories of international conventional and cyber conflicts. There are numerous additional handfuls under discussion. The main focus of the corpus so far has been on textual materials that are annotated for their surface semantics using conventional annotation tools and techniques; nonetheless, there are numerous novel dimensions along which the corpus might grow and become useful for different communities. In particular I propose for discussion the outlines of a few novel sources, annotation schemes, and collection methodologies that could potentially make the corpus of great use to the interactive narrative or narrative generation communities.


Towards a Computational Model of Narrative Visualization

AAAI Conferences

The task of narrative visualization has been the subject of increasing interest in recent years. Much like data visualization, narrative visualization offers users an informative and aesthetically pleasing perspective on “storydata.” Automatically creating visual representations of narratives poses significant computational challenges due to the complex affective and causal elements, among other things, that must be realized in visualizations. In addition, narratives that are composed by novice writers pose additional challenges due to the disfluencies stemming from ungrammatical text. In this paper, we introduce the NARRATIVE THEATRE, a narrative visualization system under development in our laboratory that generates narrative visualizations from middle school writers’ text. The NARRATIVE THEATRE consists of a rich writing interface, a robust natural language processor, a narrative reasoner, and a storyboard generator. We discuss design issues bearing on narrative visualization, introduce the NARRATIVE THEATRE, and describe narrative corpora that have been collected to study narrative visualization. We conclude with a discussion of a narrative visualization research agenda.


Characterizing and Improving Generalized Belief Propagation Algorithms on the 2D Edwards-Anderson Model

arXiv.org Artificial Intelligence

We study the performance of different message passing algorithms in the two-dimensional Edwards-Anderson model. We show that the standard Belief Propagation (BP) algorithm converges only at high temperature to a paramagnetic solution. Then, we test a Generalized Belief Propagation (GBP) algorithm, derived from a Cluster Variational Method (CVM) at the plaquette level. We compare its performance with BP and with other algorithms derived under the same approximation: Double Loop (DL) and a two-way message passing algorithm (HAK). The plaquette-CVM approximation improves on BP in at least three ways: the quality of the paramagnetic solution at high temperatures, a better (lower) estimate for the critical temperature, and the fact that the GBP message passing algorithm also converges to non-paramagnetic solutions. The lack of convergence of the standard GBP message passing algorithm at low temperatures seems to be related to the implementation details and not to the appearance of long-range order. In fact, we prove that a gauge invariance of the constrained CVM free energy can be exploited to derive a new message passing algorithm which converges at even lower temperatures. Throughout its region of convergence this new algorithm is faster than HAK and DL by some orders of magnitude.


Learning Sentence-internal Temporal Relations

arXiv.org Artificial Intelligence

In this paper we propose a data-intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects.


Discovering patterns of correlation and similarities in software project data with the Circos visualization tool

arXiv.org Artificial Intelligence

Software cost estimation based on multivariate data from completed projects requires the building of efficient models. These models essentially describe relations in the data, either on the basis of correlations between variables or of similarities between the projects. The continuous growth of the amount of data gathered and the need to perform preliminary analysis in order to discover patterns able to drive the building of reasonable models lead researchers towards intelligent and time-saving tools which can effectively describe data and their relationships. The goal of this paper is to suggest an innovative visualization tool, widely used in bioinformatics, which represents relations in data in an aesthetic and intelligent way. In order to illustrate the capabilities of the tool, we use a well-known dataset from software engineering projects.


Identifying relationships between drugs and medical conditions: winning experience in the Challenge 2 of the OMOP 2010 Cup

arXiv.org Machine Learning

There is a growing interest in using longitudinal observational databases to detect drug safety signals. In this paper we present a novel method, which we used online during the OMOP Cup. We consider homogeneous ensembling, which is based on random re-sampling (also known as bagging), as the main innovation compared to the previous publications in the related field. This study is based on a very large simulated database of 10 million patient records, which was created by the Observational Medical Outcomes Partnership (OMOP). In contrast to the traditional classification problem, the given data are unlabelled. The objective of this study is to discover hidden associations between drugs and conditions. The main idea of the approach, which we used during the OMOP Cup, is to compare the numbers of observed and expected patterns. This comparison may be organised in several different ways, and the outcomes (base learners) may be quite different as well. It is proposed to construct the final decision function as an ensemble of the base learners. Our method was recognised formally by the Organisers of the OMOP Cup as a top performing method for the Challenge N2.
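
The idea of comparing observed and expected pattern counts, and averaging the result over random re-samples, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the record layout (a pair of drug and condition sets per patient), the observed/expected ratio under independence, and the names `association_scores` and `bagged_scores` are all hypothetical.

```python
import random
from collections import Counter


def association_scores(records):
    """Observed/expected co-occurrence ratio for each (drug, condition) pair.

    `records` is a list of (drugs, conditions) tuples, each a set of labels.
    The expected count assumes drugs and conditions occur independently
    (an assumption made for this sketch).
    """
    n = len(records)
    drug_counts, cond_counts, pair_counts = Counter(), Counter(), Counter()
    for drugs, conds in records:
        drug_counts.update(drugs)
        cond_counts.update(conds)
        pair_counts.update((d, c) for d in drugs for c in conds)
    scores = {}
    for (d, c), observed in pair_counts.items():
        expected = drug_counts[d] * cond_counts[c] / n
        scores[(d, c)] = observed / expected
    return scores


def bagged_scores(records, n_rounds=25, seed=0):
    """Average the base scores over bootstrap re-samples (bagging).

    A pair that never co-occurs in a given re-sample contributes 0
    for that round.
    """
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_rounds):
        # Bootstrap: draw len(records) records with replacement.
        sample = [rng.choice(records) for _ in records]
        for pair, score in association_scores(sample).items():
            totals[pair] += score
    return {pair: total / n_rounds for pair, total in totals.items()}


if __name__ == "__main__":
    toy = [({"A"}, {"X"})] * 5 + [({"B"}, {"Y"})] * 5
    print(association_scores(toy))
    print(bagged_scores(toy))
```

A ratio well above 1 suggests that a pair co-occurs more often than independence would predict; averaging over re-samples is meant to damp the variance of any single estimate.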


Linearized Additive Classifiers

arXiv.org Artificial Intelligence

We revisit the additive model learning literature and adapt a penalized spline formulation due to Eilers and Marx [4] to train additive classifiers efficiently. We also propose two new embeddings based on two classes of orthogonal bases with orthogonal derivatives, which can also be used to efficiently learn additive classifiers. This paper follows the popular theme in the current literature where kernel SVMs are learned much more efficiently using an approximate embedding and a linear machine. In this paper we show that spline bases are especially well suited for learning additive models because of their sparsity structure and the ease of computing the embedding, which enables one to train these models in an online manner, without incurring the memory overhead of precomputing and storing the embeddings. We show interesting connections between the B-Spline basis and the histogram intersection kernel and show that for a particular choice of regularization and degree of the B-Splines, our proposed learning algorithm closely approximates the histogram intersection kernel SVM. This enables one to learn additive models with almost no memory overhead compared to a fast linear solver, such as LIBLINEAR, while being only 5-6× slower on average. On two large scale image classification datasets, MNIST and Daimler Chrysler pedestrians, the proposed additive classifiers are as accurate as the kernel SVM, while being two orders of magnitude faster to train.
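
To make the link between an explicit embedding and the histogram intersection kernel concrete, here is a minimal sketch. It uses a simple unary-style piecewise-linear encoding, which is an assumption chosen for illustration and not the spline embedding proposed in the paper; the names `embed_feature`, `embed` and `intersection_kernel`, and the restriction to features in [0, 1], are hypothetical. On grid points that are multiples of 1/n_bins the dot product of two embeddings equals min(x, y) exactly; elsewhere it is only an approximation.

```python
import math


def embed_feature(x, n_bins):
    """Unary-style encoding of a scalar x in [0, 1] into n_bins coordinates."""
    scale = 1.0 / math.sqrt(n_bins)
    # Coordinate j is 1 when x is past bin j, 0 before it, linear in between.
    return [scale * min(1.0, max(0.0, x * n_bins - j)) for j in range(n_bins)]


def embed(vector, n_bins):
    """Concatenate the per-feature encodings (additive structure)."""
    out = []
    for x in vector:
        out.extend(embed_feature(x, n_bins))
    return out


def intersection_kernel(u, v):
    """Histogram intersection kernel: sum of coordinate-wise minima."""
    return sum(min(a, b) for a, b in zip(u, v))


if __name__ == "__main__":
    u, v = [0.25, 0.5], [0.75, 0.25]
    phi_u, phi_v = embed(u, 4), embed(v, 4)
    linear = sum(a * b for a, b in zip(phi_u, phi_v))
    print(linear, intersection_kernel(u, v))
```

Because the kernel value is recovered by a plain dot product, a linear solver trained on the embedded vectors behaves like an additive kernel machine, which is the general theme the abstract describes.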


Well-Definedness and Efficient Inference for Probabilistic Logic Programming under the Distribution Semantics

arXiv.org Artificial Intelligence

The distribution semantics is one of the most prominent approaches for the combination of logic programming and probability theory. Many languages follow this semantics, such as Independent Choice Logic, PRISM, pD, Logic Programs with Annotated Disjunctions (LPADs) and ProbLog. When a program contains function symbols, the distribution semantics is well-defined only if the set of explanations for a query is finite and so is each explanation. Well-definedness is usually either explicitly imposed or is achieved by severely limiting the class of allowed programs. In this paper we identify a larger class of programs for which the semantics is well-defined together with an efficient procedure for computing the probability of queries. Since LPADs offer the most general syntax, we present our results for them, but our results are applicable to all languages under the distribution semantics. We present the algorithm "Probabilistic Inference with Tabling and Answer subsumption" (PITA) that computes the probability of queries by transforming a probabilistic program into a normal program and then applying SLG resolution with answer subsumption. PITA has been implemented in XSB and tested on six domains: two with function symbols and four without. The execution times are compared with those of ProbLog, cplint and CVE. PITA was almost always able to solve larger problems in a shorter time, on domains with and without function symbols.
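
As a small illustration of what "the probability of a query" means when the set of explanations is finite, the sketch below sums the weights of all worlds in which at least one explanation holds. It is a brute-force enumeration for independent probabilistic facts, assumed here for clarity; it is not PITA, which relies on tabling and answer subsumption. The function name `query_probability` and the burglary/earthquake facts are hypothetical.

```python
from itertools import product


def query_probability(fact_probs, explanations):
    """Probability that at least one explanation is true.

    `fact_probs` maps each independent probabilistic fact to its probability.
    `explanations` is a finite list of finite sets of facts; an explanation
    holds in a world when all of its facts are true there.
    """
    facts = sorted(fact_probs)
    total = 0.0
    # Enumerate every truth assignment (world) over the facts.
    for values in product([True, False], repeat=len(facts)):
        world = dict(zip(facts, values))
        weight = 1.0
        for fact in facts:
            p = fact_probs[fact]
            weight *= p if world[fact] else 1.0 - p
        if any(all(world[f] for f in expl) for expl in explanations):
            total += weight
    return total


if __name__ == "__main__":
    probs = {"burglary": 0.1, "earthquake": 0.2}
    print(query_probability(probs, [{"burglary"}, {"earthquake"}]))
```

Explanations may overlap, so their probabilities cannot simply be added; enumeration avoids double counting at the price of exponential cost, which is one reason efficient inference procedures matter.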


Autonomous Agents Coordination: Action Languages meet CLP(FD) and Linda

arXiv.org Artificial Intelligence

Representing and reasoning in multi-agent domains are two of the most active research areas in multi-agent system (MAS) research. The literature in this area is extensive, and it provides a plethora of logics for representing and reasoning about various aspects of MAS domains, e.g., [20, 14, 24, 22, 12]. A large number of the logics proposed in the literature have been designed to specifically focus on particular aspects of the problem of modeling MAS, often justified by a specific application scenario. This makes them suitable to address specific subsets of the general features required to model real-world MAS domains. The task of generalizing some of these existing proposals to create a uniform and comprehensive framework for modeling several different aspects of MAS domains is an open problem. Although we do not dispute the possibility of extending several of these existing proposals in various directions, the task does not seem easy. Similarly, a variety of multi-agent programming platforms have been proposed, mostly in the style of multi-agent programming languages, like Jason [3], ConGolog [9], 3APL [7], GOAL [8], but with limited planning capabilities. Our effort in this paper is focused on the development of a novel action language for multi-agent systems.