A Language for Function Signature Representations

Richardson, Kyle

arXiv.org Artificial Intelligence 

Recent work in natural language processing has looked at learning text to code translation models using parallel pairs of text and code samples from example source code libraries (for a review, see Neubig (2016)). In particular, Richardson and Kuhn (2017a,b); Richardson et al. (2018) look at learning to translate short text descriptions to function signature representations as a first step towards modeling the semantics of function documentation. Examples pairs of docstring and function signature representations are shown in Figure 1; using such pairs, the goal is to learn a general model that can robustly translate a given description of a function to a formal representation of that function. Initially, these datasets were proposed as a synthetic resource for studying semantic parser induction (Mooney, 2007), or for building models that learn to translate text to formal meaning representations from parallel data (see Richardson et al. (2017) for a proposal on using these datasets for the inverse problem of data-to-text generation). To date, we have built around 45 API datasets across 11 popular programming languages (e.g., Python, Java, C, Scheme, Haskell, PHP) and 7 natural languages (see Richardson (2017)), each using an ad hoc rendering of the target function signature representations. In this brief note, we define a unified syntax for expressing these representations, as well as a systematic mapping into first-order logic and a small subject domain model. In doing this, we aim to answer the following question: what do these function signatures that are being learned actually mean, and how can they be used for solving more complex natural language understanding problems (for a similar idea, see Bos (2016))? By recasting the learned representations in terms of classical logic, the hope is that our datasets will in particular be made more accessible to studies on natural language based program synthesis (Raza et al., 2015) and natural language programming more generally. In what follows, we first define a general syntax for these representations, then discuss the mapping into logic and the various applications that motivate our particular approach and subject domain model.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found