Goto

Collaborating Authors

 Matricon, Théo


Programmatic Reinforcement Learning: Navigating Gridworlds

arXiv.org Artificial Intelligence

The field of reinforcement learning (RL) is concerned with algorithms for learning optimal policies in unknown stochastic environments. Programmatic RL studies representations of policies as programs, meaning involving higher order constructs such as control loops. Despite attracting a lot of attention at the intersection of the machine learning and formal methods communities, very little is known on the theoretical front about programmatic RL: what are good classes of programmatic policies? How large are optimal programmatic policies? How can we learn them? The goal of this paper is to give first answers to these questions, initiating a theoretical study of programmatic RL. Considering a class of gridworld environments, we define a class of programmatic policies. Our main contributions are to place upper bounds on the size of optimal programmatic policies, and to construct an algorithm for synthesizing them. These theoretical findings are complemented by a prototype implementation of the algorithm.


EcoSearch: A Constant-Delay Best-First Search Algorithm for Program Synthesis

arXiv.org Artificial Intelligence

Many approaches to program synthesis perform a combinatorial search within a large space of programs to find one that satisfies a given specification. To tame the search space blowup, previous works introduced probabilistic and neural approaches to guide this combinatorial search by inducing heuristic cost functions. Best-first search algorithms ensure to search in the exact order induced by the cost function, significantly reducing the portion of the program space to be explored. We present a new best-first search algorithm called EcoSearch, which is the first constant-delay algorithm for pre-generation cost function: the amount of compute required between outputting two programs is constant, and in particular does not increase over time. This key property yields important speedups: we observe that EcoSearch outperforms its predecessors on two classic domains.


WikiCoder: Learning to Write Knowledge-Powered Code

arXiv.org Artificial Intelligence

We tackle the problem of automatic generation of computer programs from a few pairs of input-output examples. The starting point of this work is the observation that in many applications a solution program must use external knowledge not present in the examples: we call such programs knowledge-powered since they can refer to information collected from a knowledge graph such as Wikipedia. This paper makes a first step towards knowledge-powered program synthesis. We present WikiCoder, a system building upon state of the art machine-learned program synthesizers and integrating knowledge graphs. We evaluate it to show its wide applicability over different domains and discuss its limitations. WikiCoder solves tasks that no program synthesizers were able to solve before thanks to the use of knowledge graphs, while integrating with recent developments in the field to operate at scale.


Scaling Neural Program Synthesis with Distribution-based Search

arXiv.org Artificial Intelligence

Writing software is tedious, error-prone, and accessible only to a small share of the population - yet coding grows increasingly important as the digital world plays larger and larger roles in peoples' lives. Program synthesis seeks to make coding more reliable and accessible by developing methods for automatically constructing programs Gulwani et al. (2017). For example, the FlashFill system Gulwani et al. (2017) in Microsoft Excel makes coding more accessible by allowing nontechnical users to synthesize spreadsheet programs by giving inputoutput examples, while the TF-coder system Shi et al. (2020a) seeks to make coding neural networks more reliable by synthesizing TensorFlow code from input-outputs. Where these systems have been most successful is when they pair a specialized domain-specific language (DSL) to a domain-specific search algorithm for synthesizing programs in that DSL. A recent trend - both in industry Kalyan et al. (2018) and academia Devlin et al. (2017) - is to employ machine learning methods to learn to quickly search for a program in the DSL Balog et al. (2017); Devlin et al. (2017); Lee et al. (2018); Zhang et al.