Grammars & Parsing
Toward Humanlike Task-Based Dialogue Processing for Human Robot Interaction
Scheutz, Matthias (Tufts University) | Cantrell, Rehj (Indiana University) | Schermerhorn, Paul (Indiana University)
Many human social exchanges and coordinated activities critically involve dialogue interactions. Hence, we need to develop natural humanlike dialogue processing mechanisms for future robots if they are to interact with humans in natural ways. In this article we discuss the challenges of designing such flexible dialogue-based robotic systems. We report results from data we collected in human interaction experiments in the context of a search task and show how we can use these results to build more flexible robotic architectures that are starting to address the challenges of task-based humanlike natural language dialogues on robots.
Image Parsing with Stochastic Scene Grammar
This paper proposes a parsing algorithm for scene understanding which includes four aspects: computing 3D scene layout, detecting 3D objects (e.g. furniture), detecting 2D faces (windows, doors, etc.), and segmenting background. In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative “+” relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive “-” relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations. (ii) Sampling: It jumps between alternative structures (clusters) in each layer of the hierarchy to find the most probable configuration (represented by a parse tree). In our experiment, we demonstrate the superiority of our algorithm over existing methods on a public dataset. In addition, our approach achieves richer structures in the parse tree.
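To make the three production rule types concrete, here is a minimal Python sketch of a toy grammar with AND, OR, and SET rules, expanded top-down into a parse tree. It illustrates forward sampling from such a grammar only; it is not the paper's hierarchical cluster sampling inference and omits the contextual relations. The symbols, probabilities, and the function names `sample` and `leaves` are all assumptions made for this example.

```python
import random

# Hypothetical toy grammar; symbols and probabilities are illustrative only.
GRAMMAR = {
    "scene":      ("AND", ["foreground", "background"]),  # decompose into parts
    "foreground": ("SET", "box", 3),                      # 0..3 copies of "box"
    "box":        ("AND", ["face", "face"]),
    "face":       ("OR", [("window", 0.4), ("door", 0.6)]),  # pick one sub-type
}


def sample(symbol, rng):
    """Expand a symbol top-down into a nested (symbol, children) parse tree."""
    if symbol not in GRAMMAR:
        return (symbol, [])  # terminal symbol
    rule = GRAMMAR[symbol]
    kind = rule[0]
    if kind == "AND":
        children = [sample(child, rng) for child in rule[1]]
    elif kind == "OR":
        options, weights = zip(*rule[1])
        choice = rng.choices(options, weights=weights, k=1)[0]
        children = [sample(choice, rng)]
    else:  # SET: a variable-size ensemble of the same entity
        count = rng.randint(0, rule[2])
        children = [sample(rule[1], rng) for _ in range(count)]
    return (symbol, children)


def leaves(tree):
    """Collect the terminal symbols of a parse tree, left to right."""
    symbol, children = tree
    if symbol not in GRAMMAR:
        return [symbol]
    return [leaf for child in children for leaf in leaves(child)]


tree = sample("scene", random.Random(0))
print(leaves(tree))
```

Each call produces one scene configuration; an inference algorithm such as the one described above would instead search over these configurations for the one that best explains an image.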
How to Generate Cloze Questions from Definitions: A Syntactic Approach
Gates, Donna Marie (Carnegie Mellon University)
This paper discusses the implementation and evaluation of automatically generated cloze questions in the style of the definitions found in Collins COBUILD English language learner’s dictionary. The definitions and the cloze questions are used in an automated reading tutor to help second and third grade students learn new vocabulary. A parser provides syntactic phrase structure trees for the definitions. With these parse trees as input, a pattern matching program uses a set of syntactic patterns to extract the phrases that make up the cloze question answers and distracters.
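As a rough illustration of pattern matching over a parse tree, the Python sketch below blanks out the first noun phrase that sits directly under a verb phrase and returns it as the answer. The tree encoding (nested tuples), the single VP > NP pattern, the example sentence, and the names `find_child_of` and `make_cloze` are assumptions for this example, not the patterns or sentences used in the paper.

```python
# Hypothetical parse tree: internal nodes are (label, [children]),
# leaves are (part-of-speech tag, word).
TREE = ("S", [
    ("NP", [("DT", "A"), ("NN", "carpenter")]),
    ("VP", [("VBZ", "builds"),
            ("NP", [("JJ", "wooden"), ("NN", "furniture")])]),
])


def words(node):
    """Return the words covered by a node, left to right."""
    _, body = node
    if isinstance(body, str):
        return [body]
    return [w for child in body for w in words(child)]


def find_child_of(node, parent_label, child_label):
    """Find the first child_label node that is a direct child of a parent_label node."""
    label, body = node
    if isinstance(body, str):
        return None
    if label == parent_label:
        for child in body:
            if child[0] == child_label:
                return child
    for child in body:
        found = find_child_of(child, parent_label, child_label)
        if found is not None:
            return found
    return None


def make_cloze(tree, parent_label="VP", child_label="NP", blank="_____"):
    """Return (question, answer) by blanking the matched phrase, or None if no match."""
    target = find_child_of(tree, parent_label, child_label)
    if target is None:
        return None

    def render(node):
        if node is target:
            return [blank]
        _, body = node
        if isinstance(body, str):
            return [body]
        return [w for child in body for w in render(child)]

    return " ".join(render(tree)), " ".join(words(target))


print(make_cloze(TREE))
```

Distracters could be drawn the same way, by applying the pattern to the parse trees of other definitions and collecting the matched phrases.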
Using Automatic Question Generation to Evaluate Questions Generated by Children
Chen, Wei (Carnegie Mellon University) | Mostow, Jack (Carnegie Mellon University) | Aist, Gregory (Iowa State University)
This paper shows that automatically generated questions can help classify children’s spoken responses to a reading tutor teaching them to generate their own questions. We use automatic question generation to model and classify children’s prompted spoken questions about stories. On distinguishing complete and incomplete questions from irrelevant speech and silence, a language model built from automatically generated questions outperforms a trigram language model that does not exploit the structure of questions.
Explorations in ACT-R Based Cognitive Modeling — Chunks, Inheritance, Production Matching and Memory in Language Analysis
Ball, Jerry T. (Air Force Research Laboratory)
Our research team has been working on the development of a language analysis model (Ball, 2011; Ball, Heiberg & Silber, 2007) within the ACT-R cognitive architecture (Anderson, 2007) since 2002 (Ball, 2004). The focus is on development of a general-purpose, large-scale, functional model (Ball, 2008; Ball et al., 2010) that adheres to well-established cognitive constraints on human language processing (HLP) as realized by ACT-R.
According to Baddeley, "The episodic buffer is assumed to be a limited-capacity temporary storage system that is capable of integrating information from a variety of sources…the buffer provides not only a mechanism for modeling the environment, but also for creating new cognitive representations" (ibid, p. 421). A key empirical result which motivated Baddeley to introduce the episodic buffer after 25
A New Approach to Ranking Over-Generated Questions
McConnell, Claire Cooper (University of Pennsylvania) | Mannem, Prashanth (International Institute of Information Technology) | Prasad, Rashmi (University of Wisconsin-Milwaukee) | Joshi, Aravind (University of Pennsylvania)
We discuss several improvements to the Question Generation Shared Task Evaluation Challenge (QGSTEC) system developed at the University of Pennsylvania in 2010. In addition to enhancing the question generation rules, we have implemented two new components to improve the ranking process. We use topic scoring, a technique developed for summarization, to identify important information for questioning, and language model probabilities to measure grammaticality. Preliminary experiments show that our approach is feasible.
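The idea of ranking candidate questions by combining a topic score with a language model score can be sketched as follows. This is a simplified Python illustration under stated assumptions: the topic score here is just the fraction of question tokens found in a given topic word set, the language model is an add-one-smoothed unigram model, and the two are combined with a hypothetical `weight` parameter. None of these choices are taken from the described system.

```python
import math
from collections import Counter


def topic_score(question, topic_words):
    """Fraction of question tokens that are topic words (assumed scoring scheme)."""
    tokens = question.lower().split()
    if not tokens:
        return 0.0
    return sum(token in topic_words for token in tokens) / len(tokens)


def make_unigram_lm(corpus_sentences):
    """Build an add-one-smoothed unigram model; returns an average log-probability scorer."""
    counts = Counter(w for s in corpus_sentences for w in s.lower().split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words

    def avg_logprob(sentence):
        tokens = sentence.lower().split()
        if not tokens:
            return float("-inf")
        logp = sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)
        return logp / len(tokens)

    return avg_logprob


def rank(questions, topic_words, avg_logprob, weight=0.5):
    """Sort questions from best to worst by a weighted sum of the two scores."""
    def score(question):
        return (weight * topic_score(question, topic_words)
                + (1 - weight) * avg_logprob(question))
    return sorted(questions, key=score, reverse=True)


corpus = ["what is a parser", "what does a grammar describe"]
lm = make_unigram_lm(corpus)
candidates = ["zzz qqq xxx", "what is a grammar"]
print(rank(candidates, {"grammar", "parser"}, lm))
```

A practical system would likely use a higher-order language model and a topic scoring method from the summarization literature; the weighted sum is only one of several ways to combine the two signals.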
Towards a Model of Question Generation for Promoting Creativity in Novice Writers
Goth, Julius (North Carolina State University)
Automated question generation has been explored for a broad range of tasks. However, an important task for which limited work on question generation has been undertaken is writing support. Writing support systems, particularly for novice writers who are acquiring the fundamentals of writing, can scaffold the complex processes that bear on writing. Novice writers face significant challenges in creative writing. Their stories often lack the expressive prose that characterizes texts produced by their expert writer counterparts. A story that is composed by a novice writer may also lack a compelling plot, may not effectively utilize a story’s setting, characters, and props, and may describe events that play out in an unpredictable or confusing order. We propose an automatic question generation framework that is designed to stimulate the cognitive processes associated with creative writing. The framework utilizes semantic role labeling and discourse parsing applied to the initial drafts of the writer’s passage to generate questions to promote creativity.
The Strong Story Hypothesis and the Directed Perception Hypothesis
Winston, Patrick Henry (Massachusetts Institute of Technology)
I ask why humans are smarter than other primates, and I hypothesize that an important part of the answer lies in what I call the Strong Story Hypothesis, which holds that story telling and understanding have a central role in human intelligence. Next, I introduce another hypothesis, the Directed Perception Hypothesis, which holds that we derive much of our commonsense, including the commonsense required in story understanding, by deploying our perceptual apparatus on real and imagined events. Then, after discussing methodology, I describe the representations and methods embodied in the Genesis system, a story-understanding system that analyzes stories ranging from precis of Shakespeare's plots to descriptions of conflicts in cyberspace. The Genesis system works with short story summaries, provided in English, together with low-level commonsense rules and higher-level reflection patterns, likewise expressed in English. Using only a small collection of commonsense rules and reflection patterns, Genesis demonstrates several story-understanding capabilities, such as determining that both Macbeth and the 2007 Russia-Estonia Cyberwar involve revenge, even though neither the word revenge nor any of its synonyms are mentioned. Finally, I describe Rao's Visio-Spatial Reasoning System, a system that recognizes activities such as approaching, jumping, and giving, and answers commonsense questions posed by Genesis.
The Location of Words: Evidence from Generation and Spatial Description
McDonald, David D. (Smart Information Flow Technologies (SIFT))
Language processing architectures today are rarely designed to provide psychologically plausible accounts of their representations and algorithms. Engineering decisions dominate. This has led to words being seen as an incidental part of the architecture: the repository of all of language’s idiosyncratic aspects. Drawing on a body of past and ongoing research by myself and others, I have concluded that this view of words is wrong. Words are actually present at the most abstract, pre-linguistic levels of the NLP architecture, and there are phenomena in language use that are best accounted for by assuming that concepts are words.
The Story Workbench: An Extensible Semi-Automatic Text Annotation Tool
Finlayson, Mark Alan (Massachusetts Institute of Technology)
Text annotations are of great use to researchers in the language sciences, and much effort has been invested in creating annotated corpora for a wide variety of purposes. Unfortunately, software support for these corpora tends to be quite limited: it is usually ad-hoc, poorly designed and documented, or not released for public use. I describe an annotation tool, the Story Workbench, which provides a generic platform for text annotation. It is free, open-source, cross-platform, and user friendly. It provides a number of common text annotation operations, including representations (e.g., tokens, sentences, parts of speech), functions (e.g., generation of initial annotations by algorithm, checking annotation validity by rule, fully manual manipulation of annotations) and tools (e.g., distributing texts to annotators via version control, merging doubly-annotated texts into a single file). The tool is extensible at many different levels, admitting new representations, algorithms, and tools. I enumerate ten important features and illustrate how they support the annotation process at three levels: (1) annotation of individual texts by a single annotator, (2) double-annotation of texts by two annotators and an adjudicator, and (3) annotation scheme development. The Story Workbench is scheduled for public release in March 2012.