Grammars & Parsing
A Language for Function Signature Representations
Recent work in natural language processing has looked at learning text-to-code translation models using parallel pairs of text and code samples from example source code libraries (for a review, see Neubig (2016)). In particular, Richardson and Kuhn (2017a,b) and Richardson et al. (2018) look at learning to translate short text descriptions to function signature representations as a first step towards modeling the semantics of function documentation. Example pairs of docstring and function signature representations are shown in Figure 1; using such pairs, the goal is to learn a general model that can robustly translate a given description of a function to a formal representation of that function. Initially, these datasets were proposed as a synthetic resource for studying semantic parser induction (Mooney, 2007), or for building models that learn to translate text to formal meaning representations from parallel data (see Richardson et al. (2017) for a proposal on using these datasets for the inverse problem of data-to-text generation). To date, we have built around 45 API datasets across 11 popular programming languages (e.g., Python, Java, C, Scheme, Haskell, PHP) and 7 natural languages (see Richardson (2017)), each using an ad hoc rendering of the target function signature representations. In this brief note, we define a unified syntax for expressing these representations, as well as a systematic mapping into first-order logic and a small subject domain model. In doing this, we aim to answer the following question: what do these function signatures that are being learned actually mean, and how can they be used for solving more complex natural language understanding problems (for a similar idea, see Bos (2016))?
By recasting the learned representations in terms of classical logic, the hope is that our datasets will in particular be made more accessible to studies on natural language based program synthesis (Raza et al., 2015) and natural language programming more generally. In what follows, we first define a general syntax for these representations, then discuss the mapping into logic and the various applications that motivate our particular approach and subject domain model.
Reference-less Measure of Faithfulness for Grammatical Error Correction
Evaluation in Monolingual Translation, and particularly in Grammatical Error Correction (GEC), is a challenging research field, largely due to the difficulty of integrating different types of rewriting operations into a single measure and to the vast number of valid outputs (Tetreault and Chodorow, 2008; Madnani et al., 2011; Chodorow et al., 2012; Bryant and Ng, 2015). These difficulties have recently motivated a number of proposals for new, improved reference-based measures (RBMs) (Dahlmeier and Ng, 2012; Felice and Briscoe, 2015; Napoles et al., 2015). Nevertheless, the size and heterogeneity of the space of valid outputs per sentence often prohibit obtaining a reference set that covers this space well, thereby limiting the applicability of RBMs (Bryant and Ng, 2015).
Coding-theorem Like Behaviour and Emergence of the Universal Distribution from Resource-bounded Algorithmic Probability
Zenil, Hector, Badillo, Liliana, Hernández-Orozco, Santiago, Hernández-Quiroz, Francisco
Previously referred to as 'miraculous' in the scientific literature because of its powerful properties and its wide application as an optimal solution to the problem of induction/inference, (approximations to) Algorithmic Probability (AP) and the associated Universal Distribution are (or should be) of the greatest importance in science. Here we investigate the emergence, the rates of emergence and convergence, and the Coding-theorem like behaviour of AP in Turing-subuniversal models of computation. We investigate empirical distributions of computing models in the Chomsky hierarchy. We introduce measures of algorithmic probability and algorithmic complexity based upon resource-bounded computation, in contrast to previously thoroughly investigated distributions produced from the output distribution of Turing machines. This approach allows for numerical approximations to algorithmic (Kolmogorov-Chaitin) complexity-based estimations at each of the levels of a computational hierarchy. We demonstrate that all these estimations are correlated in rank and that they converge both in rank and values as a function of computational power, despite fundamental differences between computational models. In the context of natural processes that operate below the Turing universal level because of finite resources and physical degradation, the investigation of natural biases stemming from algorithmic rules may shed light on the distribution of outcomes. We show that up to 60% of the simplicity/complexity bias in distributions produced even by the weakest of the computational models can be accounted for by Algorithmic Probability in its approximation to the Universal Distribution.
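As a rough illustration of the Coding-theorem like relation mentioned above, complexity can be estimated from output frequency as K(s) ≈ -log2 m(s), where m(s) is the fraction of enumerated programs that produce s. The sketch below is not one of the paper's models: it assumes a toy resource-bounded model (the 256 elementary cellular automaton rules run for a fixed number of steps on a small ring), and the names `run_eca`, `output_distribution`, and `coding_theorem_complexity` are hypothetical.

```python
import math
from collections import Counter


def run_eca(rule, width=7, steps=3):
    """Toy bounded 'program' (an assumption, not a model from the paper):
    run elementary cellular automaton `rule` from a single live cell on a
    ring of `width` cells for `steps` steps and return the final row."""
    cells = [0] * width
    cells[width // 2] = 1
    for _ in range(steps):
        cells = [
            # Wolfram numbering: the neighbourhood (left, centre, right)
            # selects one bit of the rule number.
            (rule >> (cells[(i - 1) % width] * 4
                      + cells[i] * 2
                      + cells[(i + 1) % width])) & 1
            for i in range(width)
        ]
    return "".join(map(str, cells))


def output_distribution(programs, run):
    """Empirical output distribution m(s): the fraction of programs whose
    output is s."""
    counts = Counter(run(p) for p in programs)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def coding_theorem_complexity(distribution):
    """Coding-theorem style estimate: K(s) is approximated by -log2 m(s)."""
    return {s: -math.log2(p) for s, p in distribution.items()}


if __name__ == "__main__":
    dist = output_distribution(range(256), run_eca)
    estimates = coding_theorem_complexity(dist)
    # List the five outputs with the lowest estimated complexity.
    for s in sorted(estimates, key=estimates.get)[:5]:
        print(s, round(estimates[s], 3))
```

Outputs produced by many programs receive low estimated complexity, which is the simplicity bias the abstract refers to; comparing such estimates across models of different power would require enumerating each model separately.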
Solving Bongard Problems with a Visual Language and Pragmatic Reasoning
Depeweg, Stefan, Rothkopf, Constantin A., Jäkel, Frank
More than 50 years ago Bongard introduced 100 visual concept learning problems as a testbed for intelligent vision systems. These problems are now known as Bongard problems. Although they are well known in the cognitive science and AI communities, only moderate progress has been made towards building systems that can solve a substantial subset of them. In the system presented here, visual features are extracted through image processing and then translated into a symbolic visual vocabulary. We introduce a formal language that allows representing complex visual concepts based on this vocabulary. Using this language and Bayesian inference, complex visual concepts can be induced from the examples that are provided in each Bongard problem. Contrary to other concept learning problems, the examples from which concepts are induced are not random in Bongard problems; instead, they are carefully chosen to communicate the concept, hence requiring pragmatic reasoning. Taking pragmatic reasoning into account, we find good agreement between the concepts with high posterior probability and the solutions formulated by Bongard himself. While this approach is far from solving all Bongard problems, it solves the largest fraction yet.
Keisuke Sakaguchi: Robust Text Correction for Grammar and Fluency
Keisuke Sakaguchi Title: "Robust Text Correction for Grammar and Fluency" Abstract: Robustness has always been a desirable property for natural language processing. In many cases, NLP models (e.g., parsing) and downstream applications (e.g., MT) perform poorly when the input contains noise such as spelling errors, grammatical errors, and disfluency. In this talk, I will present three recent results on error correction models: character, word, and sentence level respectively. For character level, I propose semi-character recurrent neural network, which is motivated by a finding in Psycholinguistics, called Cmabrigde Uinervtisy (Cambridge University) effect. For word-level robustness, I propose an error-repair dependency parsing algorithm for ungrammatical texts.
code2vec: Learning Distributed Representations of Code
Alon, Uri, Zilberstein, Meital, Levy, Omer, Yahav, Eran
We present a neural model for representing snippets of code as continuous distributed vectors. The main idea is to represent code as a collection of paths in its abstract syntax tree, and aggregate these paths, in a smart and scalable way, into a single fixed-length code vector, which can be used to predict semantic properties of the snippet. We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 14M methods. We show that code vectors trained on this dataset can predict method names from files that were completely unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies. Compared with previous techniques over the same data set, our approach obtains a relative improvement of over 75%, being the first to successfully predict method names based on a large, cross-project corpus.
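The idea of representing code as a collection of abstract syntax tree paths can be sketched as follows. This is a minimal illustration only, assuming Python's standard `ast` module as the parser (the abstract does not say which language or parser the authors use); the helper names, the `^`/`_` path notation, and the `max_length` cutoff are hypothetical, and the neural aggregation of paths into a single code vector is omitted.

```python
import ast
from itertools import combinations


def terminal_token(node):
    """Return the token of a terminal node, or None for non-terminals.
    Treating Name, arg, and Constant as terminals is an assumption."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.arg):
        return node.arg
    if isinstance(node, ast.Constant):
        return repr(node.value)
    return None


def collect_terminals(tree):
    """Collect (token, root-to-terminal node chain) pairs in source order."""
    found = []

    def visit(node, ancestors):
        chain = ancestors + [node]
        token = terminal_token(node)
        if token is not None:
            found.append((token, chain))
            return  # do not descend below a terminal
        for child in ast.iter_child_nodes(node):
            visit(child, chain)

    visit(tree, [])
    return found


def path_contexts(source, max_length=8):
    """Return (token_a, path, token_b) triples for every pair of terminals.
    A path climbs from token_a to the lowest common ancestor ('^' steps)
    and then descends to token_b ('_' steps)."""
    terminals = collect_terminals(ast.parse(source))
    contexts = []
    for (tok_a, chain_a), (tok_b, chain_b) in combinations(terminals, 2):
        shared = 0  # number of ancestors the two chains have in common
        while (shared < min(len(chain_a), len(chain_b))
               and chain_a[shared] is chain_b[shared]):
            shared += 1
        up = [type(n).__name__ for n in reversed(chain_a[shared:])]
        top = type(chain_a[shared - 1]).__name__
        down = [type(n).__name__ for n in chain_b[shared:]]
        if len(up) + 1 + len(down) <= max_length:
            contexts.append((tok_a, "^".join(up + [top]) + "_" + "_".join(down), tok_b))
    return contexts


if __name__ == "__main__":
    for context in path_contexts("def add(a, b):\n    return a + b"):
        print(context)
```

Each triple corresponds to one path between two terminals; a model in the spirit of the abstract would embed such triples and combine them into one fixed-length vector per snippet.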
Emma Watson jokes she needs proofreader after she debuts feminist tattoo with grammatical error
Emma Watson jokes she needs a proofreader after a tattoo she debuted had a glaring grammatical error. Emma Watson joked she needed a proofreader after she debuted some new ink at the Vanity Fair Oscar Party that contained a glaring grammatical error. The "Harry Potter" star showed off a tattoo that read "Times Up" on her arm -- clearly missing the apostrophe for the organization Time's Up. Watson's tattoo was apparently a sign of support for the movement but social media users quickly pointed out the phrase was missing an apostrophe. At first, it was not immediately clear if Watson's tattoo was real or temporary.
Emma Watson displays Times Up tattoo at Vanity Fair Oscar party but social media users point out grammatical error
Emma Watson displayed some new ink at the Vanity Fair Oscar Party but social media users pointed out the tattoo's glaring grammatical error. The "Harry Potter" star showed off a tattoo that read "Times Up" on her arm -- clearly missing the apostrophe for the organization Time's Up. It was not immediately clear if the tattoo was real. People reported the tattoo could be temporary. The Brown University graduate has been an outspoken proponent of the Time's Up movement, which began after bombshell exposés revealed decades of alleged sexual misconduct by Hollywood producer Harvey Weinstein.
Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner
During their first years of life, infants learn the language(s) of their environment at an amazing speed despite large cross-cultural variations in the amount and complexity of the available language input. Understanding this simple fact still escapes current cognitive and linguistic theories. Recently, spectacular progress in the engineering sciences, notably machine learning and wearable technology, offers the promise of revolutionizing the study of cognitive development. Machine learning offers powerful learning algorithms that can achieve human-like performance on many linguistic tasks. Wearable sensors can capture vast amounts of data, which enable the reconstruction of the sensory experience of infants in their natural environment. The project of 'reverse engineering' language development, i.e., of building an effective system that mimics infants' achievements, therefore appears to be within reach. Here, we analyze the conditions under which such a project can contribute to our scientific understanding of early language development. We argue that instead of defining a sub-problem or simplifying the data, computational models should address the full complexity of the learning situation, and take as input the raw sensory signals available to infants. This implies that (1) accessible but privacy-preserving repositories of home data be set up and widely shared, (2) models be evaluated at different linguistic levels through a benchmark of psycholinguistic tests that can be passed by machines and humans alike, and (3) linguistically and psychologically plausible learning architectures be scaled up to real data using probabilistic/optimization principles from machine learning. We discuss the feasibility of this approach and present preliminary results.
Progressive Cognitive Human Parsing
Zhu, Bingke (Institute of Automation, Chinese Academy of Sciences) | Chen, Yingying (Institute of Automation, Chinese Academy of Sciences) | Tang, Ming (Institute of Automation, Chinese Academy of Sciences) | Wang, Jinqiao (Institute of Automation, Chinese Academy of Sciences)
Human parsing is an important task for human-centric understanding. Generally, two mainstream approaches are used to deal with this challenging and fundamental problem. The first employs extra human pose information to generate a hierarchical parse graph for the human parsing task. The other trains an end-to-end network with semantic information at the image level. In this paper, we develop an end-to-end progressive cognitive network to segment human parts. In order to establish a hierarchical relationship, a novel component-aware region convolution structure is proposed. With this structure, later layers inherit prior component information from earlier layers and pay attention to a finer component. In this way, we treat human parsing as a progressive recognition task; that is, we first locate the whole human and then segment the hierarchical components gradually. The experiments indicate that our method has a better location capacity for small objects and a better classification capacity for large objects. Moreover, our framework can be embedded into any fully convolutional network to enhance the performance significantly.