
Collaborating Authors

Zinkov, Robert


VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search

arXiv.org Artificial Intelligence

Large Language Models (LLMs) can generate useful code, but often the code they generate cannot be trusted to be sound. In this paper, we present VerMCTS, an approach that begins to address this issue by generating verified programs in Dafny and Coq. VerMCTS uses a logical verifier in concert with an LLM to guide a modified Monte Carlo Tree Search (MCTS). This approach leverages the verifier to gain intermediate feedback inside the search algorithm by checking partial programs at each step to estimate an upper bound on the value function. To measure the performance of VerMCTS, we develop a new suite of multi-step verified programming problems in Dafny and Coq. In terms of pass@T, a new metric that computes the pass rate given a budget of T tokens sampled from the LLM, VerMCTS leads to more than a 30% absolute increase in average pass@5000 across the suite over repeated sampling from the base language model.
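The core mechanism described above — a verifier checking partial programs to prune the search — can be sketched in miniature. This is not the paper's implementation: the "verifier" below is a toy balanced-parentheses checker, and `propose_extensions` stands in for LLM sampling.

```python
import heapq

def verifier_upper_bound(partial):
    # Toy verifier: a partial "program" is feasible only if no closing
    # paren ever precedes its opener. Infeasible prefixes get value 0;
    # feasible ones get an optimistic upper bound of 1.
    depth = 0
    for ch in partial:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return 0.0
    return 1.0

def propose_extensions(partial):
    # Stand-in for sampling candidate next tokens from an LLM.
    return [partial + ch for ch in "()"]

def verifier_guided_search(goal_len=6, budget=100):
    # Best-first search over partial programs, pruned by the verifier's
    # upper bound, in the spirit of verifier-guided tree search.
    frontier = [(-1.0, "")]
    expansions = 0
    while frontier and expansions < budget:
        _, partial = heapq.heappop(frontier)
        expansions += 1
        if len(partial) == goal_len:
            depth = sum(1 if c == "(" else -1 for c in partial)
            if depth == 0:
                return partial  # complete program that fully verifies
            continue
        for child in propose_extensions(partial):
            ub = verifier_upper_bound(child)
            if ub > 0:  # discard children the verifier already rejects
                heapq.heappush(frontier, (-ub, child))
    return None
```

Pruning at every step is what distinguishes this from generate-then-check: infeasible prefixes never spawn descendants, so the budget concentrates on verifiable continuations.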


Amortized Rejection Sampling in Universal Probabilistic Programming

arXiv.org Artificial Intelligence

Existing approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. An instance of this is importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove finite variance of our estimator and empirically demonstrate our method's correctness and efficiency compared to existing alternatives on generative programs containing rejection sampling loops, and we discuss how to implement our method in a generic probabilistic programming framework.
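A minimal illustration of the setting, not the paper's estimator: a generative program with an explicit rejection loop (a truncated normal), and an importance sampler that weights the accepted draw under the collapsed truncated density rather than weighting every loop iteration. The proposal distribution and target here are our own toy choices.

```python
import math
import random

def truncated_normal_rejection():
    # Generative program with an explicit rejection loop:
    # sample x ~ N(0, 1) until x > 0.
    while True:
        x = random.gauss(0.0, 1.0)
        if x > 0:
            return x

def importance_estimate(n=50000):
    # Importance sampling against the *collapsed* density of the loop,
    # p(x) = 2 * N(x; 0, 1) for x > 0, using proposal q = Exp(1).
    # Weighting the accepted sample directly sidesteps the variance
    # blow-up that comes from weighting each rejected iteration.
    total_w, total_wx = 0.0, 0.0
    for _ in range(n):
        x = random.expovariate(1.0)  # q(x) = exp(-x) on x > 0
        log_p = math.log(2.0) - 0.5 * x * x - 0.5 * math.log(2 * math.pi)
        log_q = -x
        w = math.exp(log_p - log_q)
        total_w += w
        total_wx += w * x
    return total_wx / total_w  # self-normalized estimate of E[x | x > 0]
```

For the standard normal truncated to x > 0, the true mean is sqrt(2/pi) ≈ 0.798; the importance weights here are bounded, so the estimator has finite variance.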


Minimally Faithful Inversion of Graphical Models

arXiv.org Machine Learning

Inference amortization methods allow the sharing of statistical strength across related observations when learning to perform posterior inference. Generally this requires the inversion of the dependency structure in the generative model, as the modeller must design and learn a distribution to approximate the posterior. Previous methods invert the dependency structure in a heuristic way and fail to capture the dependencies in the model, thereby limiting the performance of the eventual inference algorithm. We introduce an algorithm for faithfully and minimally inverting the graphical model structure of any generative model. Such an inversion has two crucial properties: a) it does not encode any independence assertions absent from the model, and b) for a given inversion, it encodes as many true independence assertions as possible. Our algorithm works by simulating variable elimination on the generative model to reparametrize the distribution. We show with experiments how such minimal inversions can assist in performing better inference.
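The elimination-based construction can be sketched on a toy scale. This is our simplified reading of the idea, not the paper's algorithm: moralize the generative DAG, eliminate the latent variables in a given order, and let each latent's live neighbours at elimination time become its parents in the inverse network.

```python
from itertools import combinations

def faithful_inverse(parents, latents):
    # parents: dict mapping each node to its list of parents in the
    # generative DAG. latents: elimination order over latent nodes
    # (observed nodes are never eliminated). Simplified sketch only.
    nbrs = {v: set() for v in parents}
    for v, ps in parents.items():
        for p in ps:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for a, b in combinations(ps, 2):  # moralize: marry co-parents
            nbrs[a].add(b)
            nbrs[b].add(a)
    inv_parents = {}
    eliminated = set()
    for v in latents:
        live = nbrs[v] - eliminated
        inv_parents[v] = set(live)  # live neighbours become inverse parents
        for a, b in combinations(live, 2):  # fill-in edges
            nbrs[a].add(b)
            nbrs[b].add(a)
        eliminated.add(v)
    return inv_parents
```

On the v-structure x -> z <- y with z observed, moralization marries x and y, so the inverse correctly makes x depend on both y and z — the dependency a naive edge-reversal heuristic misses.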


Composing inference algorithms as program transformations

arXiv.org Artificial Intelligence

Probabilistic inference procedures are usually coded painstakingly from scratch, for each target model and each inference algorithm. We reduce this effort by generating inference procedures from models automatically. We make this code generation modular by decomposing inference algorithms into reusable program-to-program transformations. These transformations perform exact inference and generate probabilistic programs that compute expectations, densities, and MCMC samples. The resulting inference procedures are about as accurate and fast as other probabilistic programming systems on real-world problems.
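A miniature example of one such transformation, under our own toy representation (this is not the paper's system): a generative program is data — a list of sample statements — and a density transformation turns it into a function computing the joint log-density.

```python
import math

def normal_logpdf(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def density_transform(prog):
    # Program-to-program transformation in miniature: map a generative
    # program (list of (name, dist, (loc, scale)) statements, where loc
    # may name an earlier variable) to its joint log-density function.
    def log_density(env):
        total = 0.0
        for name, dist, (loc, scale) in prog:
            assert dist == "normal"  # only normals in this sketch
            mean = env[loc] if isinstance(loc, str) else loc
            total += normal_logpdf(env[name], mean, scale)
        return total
    return log_density

# Toy program: mu ~ N(0, 10); x ~ N(mu, 1)
program = [
    ("mu", "normal", (0.0, 10.0)),
    ("x", "normal", ("mu", 1.0)),
]
```

Because the transformation's output is itself a program-level object, further transformations (expectation, MCMC kernel construction) can be composed on top of it — the modularity the abstract describes.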


Using Synthetic Data to Train Neural Networks is Model-Based Reasoning

arXiv.org Machine Learning

We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating both state-of-the-art performance and a way of computing task-specific posterior uncertainty. Using a neural network trained this way, we also demonstrate successful breaking of real-world Captchas currently used by Facebook and Wikipedia. Reasoning from these empirical results and drawing connections with Bayesian modeling, we discuss the robustness of synthetic data results and suggest important considerations for ensuring good neural network generalization when training with synthetic data.
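The connection can be made concrete with a deliberately tiny stand-in for the Captcha task (our toy construction, not the paper's architecture): a generative model samples a latent label and renders a noisy observation, and a classifier trained on those synthetic pairs is exactly an amortized proposal q(y | x) for that model.

```python
import math
import random

def generative_model():
    # Synthetic-data generator: latent class y, noisy "rendering" x.
    y = random.randint(0, 1)
    x = y + random.gauss(0.0, 0.3)
    return x, y

def train_proposal(n=2000, epochs=10, lr=0.1):
    # Fitting a logistic regressor on synthetic (x, y) pairs is
    # learning an amortized proposal q(y | x) for the model above.
    data = [generative_model() for _ in range(n)]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w += lr * (y - p) * x  # SGD on the log-likelihood
            b += lr * (y - p)
    return w, b

def accuracy(w, b, n=2000):
    # Evaluate the learned proposal on fresh draws from the model.
    correct = 0
    for _ in range(n):
        x, y = generative_model()
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        correct += int((p > 0.5) == (y == 1))
    return correct / n
```

Because test data comes from the same generative model as training data, good held-out accuracy here is exactly the claim that the network approximates the model's posterior over the latent.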