Grammars & Parsing
Grammar Variational Autoencoder
Kusner, Matt J., Paige, Brooks, Hernández-Lobato, José Miguel
Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. Crucially, state-of-the-art methods often produce outputs that are not valid. We make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. We propose a variational autoencoder which encodes and decodes directly to and from these parse trees, ensuring the generated outputs are always valid. Surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. We demonstrate the effectiveness of our learned models by showing their improved performance in Bayesian optimization for symbolic regression and molecular synthesis.
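The idea of decoding to a parse tree so that outputs are always valid can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy grammar, the FALLBACK table, and the names decode and random_logits are assumptions made for the example, and random scores stand in for a trained decoder network. At each step only the production rules whose left-hand side matches the nonterminal on top of the stack are eligible, and all other rules are masked out.

```python
import random

# Toy context-free grammar for arithmetic expressions (hypothetical example).
GRAMMAR = {
    "S": [["S", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "S", ")"], ["x"], ["1"], ["2"]],
}
# One entry per production rule, so a decoder can emit one score per rule.
RULES = [(lhs, rhs) for lhs, prods in GRAMMAR.items() for rhs in prods]
# Assumed fallback once the step budget runs out: one rule per nonterminal
# that is guaranteed to terminate (S -> T -> F -> x).
FALLBACK = {"S": ["T"], "T": ["F"], "F": ["x"]}


def decode(logits_per_step, max_steps=50):
    """Expand nonterminals left to right, masking rules that do not apply."""
    stack = ["S"]   # symbols still to be processed; top of the stack is the end
    output = []
    steps = 0
    while stack:
        symbol = stack.pop()
        if symbol not in GRAMMAR:
            output.append(symbol)   # terminal: emit it
            continue
        if steps < max_steps:
            logits = logits_per_step(steps)
            # Mask: only rules with a matching left-hand side are eligible.
            valid = [i for i, (lhs, _) in enumerate(RULES) if lhs == symbol]
            best = max(valid, key=lambda i: logits[i])
            rhs = RULES[best][1]
        else:
            rhs = FALLBACK[symbol]
        steps += 1
        # Push the right-hand side reversed so its first symbol is popped next.
        stack.extend(reversed(rhs))
    return "".join(output)


def random_logits(seed):
    """Stand-in for a trained decoder: one random score per rule per step."""
    rng = random.Random(seed)
    return lambda step: [rng.gauss(0.0, 1.0) for _ in RULES]
```

Calling decode(random_logits(0)) yields a string built only from grammar rules, so it is a well-formed expression whatever the scores are. The step budget with a terminating fallback is one simple way to guarantee a finite output in this sketch.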
Essential Arts & Culture: Parsing Measure S, 'Fun Home' inspires genuflection, SCI-Arc goes to Mexico
The award-winning show inspired by a singular graphic memoir. Plus: SCI-Arc in Mexico City, Oscar-nominated films that emerged from important plays, and a longtime curator leaves the downtown gallery he helped establish. I'm Carolina A. Miranda, staff writer for the Los Angeles Times, and I'm in your inbox with a weekly digest of everything culture: On March 7, Los Angeles will head to the polls to vote on a development measure that could affect the profile of the city. Measure S (formerly known as the Neighborhood Integrity Initiative) seeks to put a two-year moratorium on development projects that require an amendment to the city's general plan, among other factors. Times architecture critic Christopher Hawthorne parses the measure and its backers, whose roots lie in anti-growth initiatives from the 1980s -- and whose vision of Los Angeles seems to lie squarely in the 1960s.
A Dependency-Based Neural Reordering Model for Statistical Machine Translation
Hadiwinoto, Christian (National University of Singapore) | Ng, Hwee Tou (National University of Singapore)
In machine translation (MT) that involves translating between two languages with significant differences in word order, determining the correct word order of translated words is a major challenge. The dependency parse tree of a source sentence can help to determine the correct word order of the translated words. In this paper, we present a novel reordering approach utilizing a neural network and dependency-based embeddings to predict whether the translations of two source words linked by a dependency relation should remain in the same order or should be swapped in the translated sentence. Experiments on Chinese-to-English translation show that our approach yields a statistically significant improvement of 0.57 BLEU points on benchmark NIST test sets, compared to our prior state-of-the-art statistical MT system that uses sparse dependency-based reordering features.
Natural Language Dialogue for Building and Learning Models and Structures
Perera, Ian (Institute for Human and Machine Cognition) | Allen, James F. (Institute for Human and Machine Cognition and University of Rochester) | Galescu, Lucian (Institute for Human and Machine Cognition) | Teng, Choh Man (Institute for Human and Machine Cognition) | Burstein, Mark (SIFT) | Friedman, Scott (SIFT) | McDonald, David (SIFT) | Rye, Jeffrey (SIFT)
We demonstrate an integrated system for building and learning models and structures in both a real and virtual environment. The system combines natural language understanding, planning, and methods for composition of basic concepts into more complicated concepts. The user and the system interact via natural language to jointly plan and execute tasks involving building structures, with clarifications and demonstrations to teach the system along the way. We use the same architecture for building and simulating models of biology, demonstrating the general-purpose nature of the system where domain-specific knowledge is concentrated in sub-modules with the basic interaction remaining domain-independent. These capabilities are supported by our work on semantic parsing, which generates knowledge structures to be grounded in a physical representation, and composed with existing knowledge to create a dynamic plan for completing goals. Prior work on learning from natural language demonstrations enables learning of models from very few demonstrations, and features are extracted from definitions in natural language. We believe this architecture for interaction opens up a wide possibility of human-computer interaction and knowledge transfer through natural language.
Semantic Parsing with Neural Hybrid Trees
Susanto, Raymond Hendy (Singapore University of Technology and Design) | Lu, Wei (Singapore University of Technology and Design)
We propose a neural graphical model for parsing natural language sentences into their logical representations. The graphical model is based on hybrid tree structures that jointly represent both sentences and semantics. Learning and decoding are done using efficient dynamic programming algorithms. The model is trained under a discriminative setting, which allows us to incorporate a rich set of features. Hybrid tree structures have been shown to achieve state-of-the-art results on standard semantic parsing datasets. In this work, we propose a novel model that incorporates a rich, nonlinear featurization by a feedforward neural network. The error signals are computed with respect to the conditional random fields (CRFs) objective using an inside-outside algorithm, and are then backpropagated to the neural network. We demonstrate that by combining the strengths of the exact global inference in the hybrid tree models and the power of neural networks to extract high-level features, our model is able to achieve new state-of-the-art results on standard benchmark datasets across different languages.
Efficient Clinical Concept Extraction in Electronic Medical Records
Guo, Yufan (IBM Research - Almaden) | Kakrania, Deepika (IBM Research - Almaden) | Baldwin, Tyler (IBM Research - Almaden) | Syeda-Mahmood, Tanveer (IBM Research - Almaden)
Automatic identification of clinical concepts in electronic medical records (EMR) is useful not only in forming a complete longitudinal health record of patients, but also in recovering missing codes for billing, reducing costs, finding more accurate clinical cohorts for clinical trials, and enabling better clinical decision support. Existing systems for clinical concept extraction are mostly knowledge-driven, relying on exact match retrieval from original or lemmatized reports, and very few of them are scaled up to handle large volumes of complex, diverse data. In this demonstration we will showcase a new system for real-time detection of clinical concepts in EMR. The system features a large vocabulary of over 5.6 million concepts. It achieves high precision and recall, with good tolerance to typos through the use of a novel prefix indexing and subsequence matching algorithm, along with a recursive negation detector based on efficient, deep parsing. Our system has been tested on over 12.9 million reports of more than 200 different types, collected from 800,000+ patients. A comparison with the state of the art shows that it outperforms previous systems in addition to being the first system to scale to such large collections.
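The abstract does not describe its prefix indexing and subsequence matching algorithm, so the following is only a rough sketch of how such a typo-tolerant lookup might work, under stated assumptions: concepts are bucketed by their first few characters, and a candidate is accepted when its longest common subsequence with the query covers a large enough share of the longer string. The class PrefixIndex, the function lcs_length, and the parameters prefix_len and min_ratio are all hypothetical names chosen for the example.

```python
from collections import defaultdict


def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


class PrefixIndex:
    """Buckets vocabulary entries by their first prefix_len characters."""

    def __init__(self, vocabulary, prefix_len=3):
        self.prefix_len = prefix_len
        self.buckets = defaultdict(list)
        for concept in vocabulary:
            self.buckets[concept[:prefix_len].lower()].append(concept)

    def lookup(self, term, min_ratio=0.8):
        """Return concepts sharing the term's prefix, best match first."""
        term = term.lower()
        scored = []
        for concept in self.buckets.get(term[:self.prefix_len], []):
            # Share of the longer string covered by the common subsequence.
            ratio = lcs_length(term, concept.lower()) / max(len(term), len(concept))
            if ratio >= min_ratio:
                scored.append((ratio, concept))
        return [concept for _, concept in sorted(scored, key=lambda rc: -rc[0])]
```

For example, with index = PrefixIndex(["pneumonia", "pneumothorax", "hypertension"]), the misspelled query "pneumnia" is compared only against the two concepts starting with "pne", and only "pneumonia" passes the threshold. A limitation of this particular sketch is that a typo inside the prefix itself sends the query to the wrong bucket.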
Semantic Proto-Role Labeling
Teichert, Adam (Johns Hopkins University) | Poliak, Adam (Johns Hopkins University) | Durme, Benjamin Van (Johns Hopkins University) | Gormley, Matthew R. (Carnegie Mellon University)
The semantic function tags of Bonial, Stowe, and Palmer (2013) and the ordinal, multi-property annotations of Reisinger et al. (2015) draw inspiration from Dowty's semantic proto-role theory. We approach proto-role labeling as a multi-label classification problem and establish strong results for the task by adapting a successful model of traditional semantic role labeling. We achieve a proto-role micro-averaged F1 of 81.7 using gold syntax and explore joint and conditional models of proto-roles and categorical roles. In comparing the effect of Bonial, Stowe, and Palmer's tags to PropBank ArgN-style role labels, we are surprised that neither annotation greatly improves proto-role prediction; however, we observe that ArgN models benefit greatly from observed syntax and from observed or modeled proto-roles, while our models of the semantic function tags do not.
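For readers unfamiliar with the metric, micro-averaged F1 for a multi-label task pools true positives, false positives, and false negatives over all instances and labels before computing precision and recall. The sketch below shows the standard calculation; the function name micro_f1 and the property labels in the usage note are illustrative choices, and this is not the authors' evaluation script.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over parallel lists of gold and predicted label sets."""
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        tp += len(gold_labels & pred_labels)   # labels correctly predicted
        fp += len(pred_labels - gold_labels)   # labels predicted but not gold
        fn += len(gold_labels - pred_labels)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With gold = [{"volition", "awareness"}, {"change_of_state"}] and predicted = [{"volition"}, {"change_of_state", "awareness"}], the pooled counts are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all equal 2/3.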
Natural Language Acquisition and Grounding for Embodied Robotic Systems
Alomari, Muhannad (University of Leeds) | Duckworth, Paul (University of Leeds) | Hogg, David C. (University of Leeds) | Cohn, Anthony G. (University of Leeds)
We present a cognitively plausible novel framework capable of learning the grounding in visual semantics and the grammar of natural language commands given to a robot in a table top environment. The input to the system consists of video clips of a manually controlled robot arm, paired with natural language commands describing the action. No prior knowledge is assumed about the meaning of words, or the structure of the language, except that there are different classes of words (corresponding to observable actions, spatial relations, and objects and their observable properties). The learning process automatically clusters the continuous perceptual spaces into concepts corresponding to linguistic input. A novel relational graph representation is used to build connections between language and vision. As well as the grounding of language to perception, the system also induces a set of probabilistic grammar rules. The knowledge learned is used to parse new commands involving previously unseen objects.
Cross-View People Tracking by Scene-Centered Spatio-Temporal Parsing
Xu, Yuanlu (University of California, Los Angeles) | Liu, Xiaobai (San Diego State University) | Qin, Lei (Chinese Academy of Sciences) | Zhu, Song-Chun (University of California, Los Angeles)
In this paper, we propose a Spatio-temporal Attributed Parse Graph (ST-APG) to integrate semantic attributes with trajectories for cross-view people tracking. Given videos from multiple cameras with overlapping fields of view (FOV), our goal is to parse the videos and organize the trajectories of all targets into a scene-centered representation. We leverage rich semantic attributes of humans, e.g., facing directions, postures and actions, to enhance cross-view tracklet associations, in addition to the appearance and geometry features frequently used in the literature. In particular, the facing direction of a human in 3D, once detected, often coincides with his/her moving direction or trajectory. Similarly, the actions of humans, once recognized, provide strong cues for distinguishing one subject from the others. The inference is solved by iteratively grouping tracklets with cluster sampling and estimating people's semantic attributes by dynamic programming. In experiments, we validate our method on one public dataset and create another new dataset that records people's daily life in public, e.g., food court, office reception and plaza, each of which includes 3-4 cameras. We evaluate the proposed method on these challenging videos and achieve promising multi-view tracking results.
Multi-Path Feedback Recurrent Neural Networks for Scene Parsing
Jin, Xiaojie (National University of Singapore) | Chen, Yunpeng (National University of Singapore) | Jie, Zequn (National University of Singapore) | Feng, Jiashi (National University of Singapore) | Yan, Shuicheng (National University of Singapore)
In this paper, we consider the scene parsing problem and propose a novel Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene images. MPF-RNN can enhance the capability of RNNs in modeling long-range context information at multiple levels and better distinguish pixels that are easy to confuse. Unlike feedforward CNNs and RNNs with only a single feedback connection, MPF-RNN propagates the contextual features learned at the top layer through multiple weighted recurrent connections to learn bottom features. To train MPF-RNN better, we propose a new strategy that considers accumulative loss at multiple recurrent steps to improve the performance of MPF-RNN on parsing small objects. With these two novel components, MPF-RNN has achieved significant improvement over strong baselines (VGG16 and Res101) on five challenging scene parsing benchmarks, including traditional SiftFlow, Barcelona, CamVid, Stanford Background as well as the recently released large-scale ADE20K.