Grammars & Parsing
Using Closed Captions as Supervision for Video Activity Recognition
Gupta, Sonal (Stanford University) | Mooney, Raymond J. (University of Texas at Austin)
Recognizing activities in real-world videos is a difficult problem exacerbated by background clutter, changes in camera angle & zoom, and rapid camera movements. Large corpora of labeled videos can be used to train automated activity recognition systems, but this requires expensive human labor and time. This paper explores how closed captions that naturally accompany many videos can act as weak supervision that allows automatically collecting "labeled" data for activity recognition. We show that such an approach can improve activity retrieval in soccer videos. Our system requires no manual labeling of video clips and needs minimal human supervision. We also present a novel caption classifier that uses additional linguistic information to determine whether a specific comment refers to an ongoing activity. We demonstrate that combining linguistic analysis and automatically trained activity recognizers can significantly improve the precision of video retrieval.
Forest-Based Semantic Role Labeling
Xiong, Hao (Chinese Academy of Sciences) | Mi, Haitao (Chinese Academy of Sciences) | Liu, Yang (Chinese Academy of Sciences) | Liu, Qun (Chinese Academy of Sciences)
Parsing plays an important role in semantic role labeling (SRL) because most SRL systems infer semantic relations from 1-best parses. Therefore, parsing errors inevitably lead to labeling mistakes. To alleviate this problem, we propose to use a packed forest, which compactly encodes all parses for a sentence. We design an algorithm to exploit exponentially many parses to learn semantic relations efficiently. Experimental results on the CoNLL-2005 shared task show that using forests achieves an absolute improvement of 1.2% in terms of F1 score over using 1-best parses and 0.6% over using 50-best parses.
Machine Reading: A "Killer App" for Statistical Relational AI
Poon, Hoifung (University of Washington) | Domingos, Pedro (University of Washington)
Machine reading aims to automatically extract knowledge from text. It is a long-standing goal of AI and holds the promise of revolutionizing Web search and other fields. In this paper, we analyze the core challenges of machine reading and show that statistical relational AI is particularly well suited to address these challenges. We then propose a unifying approach to machine reading in which statistical relational AI plays a central role. Finally, we demonstrate the promise of this approach by presenting OntoUSP, an end-to-end machine reading system that builds on recent advances in statistical relational AI and greatly outperforms state-of-the-art systems in a task of extracting knowledge from biomedical abstracts and answering questions.
Learning Probabilistic Hierarchical Task Networks to Capture User Preferences
Li, Nan, Cushing, William, Kambhampati, Subbarao, Yoon, Sungwook
We propose automatically learning probabilistic Hierarchical Task Networks (pHTNs) in order to capture a user's preferences on plans, by observing only the user's behavior. HTNs are a common choice of representation for a variety of purposes in planning, including work on learning in planning. Our contributions are (a) learning structure and (b) representing preferences. In contrast, prior work employing HTNs considers learning method preconditions (instead of structure) and representing domain physics or search control knowledge (rather than preferences). Initially we will assume that the observed distribution of plans is an accurate representation of user preference, and then generalize to the situation where feasibility constraints frequently prevent the execution of preferred plans. In order to learn a distribution on plans we adapt an Expectation-Maximization (EM) technique from the discipline of (probabilistic) grammar induction, taking the perspective of task reductions as productions in a context-free grammar over primitive actions. To account for the difference between the distributions of possible and preferred plans we subsequently modify this core EM technique, in short, by rescaling its input.
Using Local Alignments for Relation Recognition
Katrenko, S., Adriaans, P. W., van Someren, M.
Aiming at accurate recognition of relations, we introduce local alignment kernels and explore various possibilities of using them for this task. We give a definition of a local alignment (LA) kernel based on the Smith-Waterman score as a sequence similarity measure and proceed with a range of possibilities for computing similarity between elements of sequences. We show how distributional similarity measures obtained from unlabeled data can be incorporated into the learning task as semantic knowledge. Our experiments suggest that the LA kernel yields promising results on various biomedical corpora outperforming two baselines by a large margin. Additional series of experiments have been conducted on the data sets of seven general relation types, where the performance of the LA kernel is comparable to the current state-of-the-art results.
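As a rough illustration of the Smith-Waterman score that the abstract names as the basis of the LA kernel, the sketch below computes the classic best local alignment score between two sequences. It is not the kernel from the paper: the function name `smith_waterman` and the scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for the example, and a linear gap penalty is used for simplicity.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment here
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    # Sequences can be strings or lists of tokens.
    print(smith_waterman("xxabcxx", "yyabcyy"))
```

Because the elements are compared only with `==`, the same function works on lists of word tokens; replacing the exact-match test with a graded similarity between elements is one way to picture the "range of possibilities for computing similarity between elements of sequences" mentioned above.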
How to correctly prune tropical trees
Loddo, Jean-Vincent, Saiu, Luca
We present tropical games, a generalization of combinatorial min-max games based on tropical algebras. Our model breaks the traditional symmetry of rational zero-sum games where players have exactly opposed goals (min vs. max), is more widely applicable than min-max, and also supports a form of pruning, although it is less effective than alpha-beta. In fact, min-max games may be seen as particular cases where both the game and its dual are tropical: when the dual of a tropical game is also tropical, the power of alpha-beta is completely recovered. We formally develop the model and prove that the tropical pruning strategy is correct, then conclude by showing how the problem of approximated parsing can be modeled as a tropical game, profiting from pruning.
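For readers unfamiliar with the baseline the abstract compares against, here is a minimal sketch of classical alpha-beta pruning on a min-max game tree. It does not implement the tropical pruning strategy of the paper; the tree encoding (nested lists with numeric leaves) and the function name `alphabeta` are assumptions made for the example.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Evaluate a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node  # leaf: its payoff
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the min player will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the max player already has a better option elsewhere
    return value


if __name__ == "__main__":
    tree = [[3, 5], [2, 9], [0, 7]]  # max node over three min nodes
    print(alphabeta(tree))
```

In this tree, once the first min node has returned 3, the leaves 9 and 7 are never visited, because the second and third min nodes are already bounded above by 2 and 0 respectively.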
Training a Multilingual Sportscaster: Using Perceptual Context to Learn Language
Chen, D. L., Kim, J., Mooney, R. J.
We present a novel framework for learning to interpret and generate language using only perceptual context as supervision. We demonstrate its capabilities by developing a system that learns to sportscast simulated robot soccer games in both English and Korean without any language-specific prior knowledge. Training employs only ambiguous supervision consisting of a stream of descriptive textual comments and a sequence of events extracted from the simulation trace. The system simultaneously establishes correspondences between individual comments and the events that they describe while building a translation model that supports both parsing and generation. We also present a novel algorithm for learning which events are worth describing. Human evaluations of the generated commentaries indicate they are of reasonable quality and in some cases even on par with those produced by humans for our limited domain.
Improving Relevancy Accessing Linked Opinion Data
Galitsky, Boris (University of Girona) | Rosa, Josep Lluis de la (University of Girona) | Dobrocsi, Gábor (University of Miskolc)
We introduce a search engine and information retrieval system for providing access to linked opinion data. Generalization of syntactic parse trees is introduced as a similarity measure between the subjects of textual opinions, so that they can be linked on the fly. We also present an information extraction algorithm for automatically summarizing web pages in the format of Google sponsored links. We outline the usability of the implemented system, the integrated opinion delivery environment (IODE).
Preprocessing Legal Text: Policy Parsing and Isomorphic Intermediate Representation
Waterman, K. Krasnow (Massachusetts Institute of Technology)
One of the most significant challenges in achieving digital privacy is incorporating privacy policy directly in computer systems. While rule systems have long existed, translating privacy laws, regulations, policies, and contracts into processor-amenable forms is slow and difficult because the legal text is scattered, run-on, and unstructured, antithetical to the lean and logical forms of computer science. We are using and developing intermediate isomorphic forms as a Rosetta Stone-like tool to accelerate the translation process, in the hope of supporting future domain-specific Natural Language Processing technology. This report describes our experience, thoughts about how to improve the form, and discoveries about the form and logic of the legal text that will affect the successful development of a rules tool to implement real-world complex privacy policies.
Syntactic Topic Models
Boyd-Graber, Jordan, Blei, David M.
The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency parsed corpora where sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.