snis
Amortized Sampling with Transferable Normalizing Flows
Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches such as molecular dynamics or Markov chain Monte Carlo inherently lack amortization; the computational cost of sampling must be paid in full for each system of interest. The widespread success of generative models has inspired interest in overcoming this limitation by learning sampling algorithms. Despite performing competitively with conventional methods when trained on a single system, learned samplers have so far demonstrated limited ability to transfer across systems. We demonstrate that deep learning enables the design of scalable and transferable samplers by introducing PROSE, a 285 million parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. PROSE draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving previously intractable transferability across sequence length while retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation we demonstrate the efficacy of PROSE as a proposal for a variety of sampling algorithms, finding that a simple importance-sampling-based fine-tuning procedure achieves performance competitive with established methods such as sequential Monte Carlo. We open-source the PROSE codebase, model weights, and training dataset to further stimulate research into amortized sampling methods and objectives.
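The role of a flow as an importance sampling proposal can be illustrated with a minimal sketch. This is not the PROSE code: a fixed 1D Gaussian stands in for the flow (it only needs to provide samples and exact log-densities), the target is a toy unnormalized Gaussian, and the function name `snis_reweight` is hypothetical.

```python
import numpy as np

def snis_reweight(log_target_unnorm, log_proposal):
    """Return self-normalized importance weights and the effective sample size.

    Both arguments are arrays of log-densities evaluated at the proposal samples;
    the target may be unnormalized because the weights are renormalized.
    """
    log_w = log_target_unnorm - log_proposal
    log_w = log_w - log_w.max()          # stabilize the exponentiation
    w = np.exp(log_w)
    w = w / w.sum()                      # weights now sum to 1
    ess = 1.0 / np.sum(w ** 2)           # Kish effective sample size
    return w, ess

rng = np.random.default_rng(0)
n = 20_000
sigma_q = 1.5                            # assumed proposal scale (stand-in for a flow)

# Proposal samples and their exact log-density, as a flow would provide.
x = rng.normal(0.0, sigma_q, size=n)
log_q = -0.5 * (x / sigma_q) ** 2 - np.log(sigma_q * np.sqrt(2.0 * np.pi))

# Toy unnormalized target: Gaussian with mean 1 and standard deviation 0.5.
log_p_unnorm = -0.5 * ((x - 1.0) / 0.5) ** 2

weights, ess = snis_reweight(log_p_unnorm, log_q)
mean_estimate = np.sum(weights * x)      # SNIS estimate of the target mean
print(f"mean ~ {mean_estimate:.3f}, ESS = {ess:.0f} of {n}")
```

The effective sample size gives a rough indication of how well the proposal covers the target; a low value relative to the number of samples suggests the proposal would benefit from further tuning.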
Energy-Inspired Models: Learning with Sampler-Induced Distributions
This yields a class of energy-inspired models (EIMs) that incorporate learned energy functions while still providing exact samples and tractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling.
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. The problem's importance has attracted many proposed solutions, including importance sampling (IS), self-normalized IS (SNIS), and doubly robust (DR) estimates. DR and its variants ensure semiparametric local efficiency if Q-functions are well-specified, but if they are not, DR can be worse than both IS and SNIS. DR also does not enjoy SNIS's inherent stability and boundedness. We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS. On the way, we categorize various properties and classify existing estimators by them. Besides the theoretical guarantees, empirical studies suggest the new estimators provide advantages.
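The difference between the IS and SNIS estimators mentioned above can be sketched on a toy problem. This illustrates only the two standard baselines, not the empirical-likelihood estimators the abstract proposes; for brevity the example is a context-free bandit, and the policies, reward means, and function names are made-up assumptions.

```python
import numpy as np

def is_estimate(rewards, target_probs, behavior_probs):
    """Plain importance sampling: average of weight * reward."""
    w = target_probs / behavior_probs
    return np.mean(w * rewards)

def snis_estimate(rewards, target_probs, behavior_probs):
    """Self-normalized IS: weighted average, so it stays within the reward range."""
    w = target_probs / behavior_probs
    return np.sum(w * rewards) / np.sum(w)

rng = np.random.default_rng(0)
behavior = np.array([0.6, 0.3, 0.1])      # logging policy over 3 actions (assumed)
target = np.array([0.1, 0.2, 0.7])        # policy to evaluate (assumed)
mean_reward = np.array([0.2, 0.5, 0.9])   # Bernoulli reward mean per action (assumed)

# Logged data collected under the behavior policy.
actions = rng.choice(3, size=5000, p=behavior)
rewards = rng.binomial(1, mean_reward[actions]).astype(float)

true_value = float(target @ mean_reward)  # value of the target policy
v_is = is_estimate(rewards, target[actions], behavior[actions])
v_snis = snis_estimate(rewards, target[actions], behavior[actions])
print(f"true = {true_value:.3f}, IS = {v_is:.3f}, SNIS = {v_snis:.3f}")
```

Because the SNIS estimate is a convex combination of observed rewards, it cannot leave the interval spanned by them, which is the boundedness property the abstract refers to; the plain IS estimate has no such guarantee.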
Optimality in importance sampling: a gentle survey
Llorente, Fernando, Martino, Luca
Monte Carlo (MC) methods are powerful tools for numerical inference and optimization, widely employed in statistics, signal processing, and machine learning (Liu, 2004; Robert and Casella, 2004). They are mainly used to approximate the solution of definite integrals and, by extension, of differential equations (for this reason, MC schemes can be considered stochastic quadrature rules). Although exact analytical solutions to integrals are always desirable, such unicorns are rarely available, especially in real-world systems. Many applications inevitably require the approximation of intractable integrals. Specifically, Bayesian methods need to compute expectations with respect to a posterior probability density function (pdf), which are generally analytically intractable (Gelman et al., 2013). MC methods can be divided into four main families: direct methods (based on transformations of random variables), accept-reject techniques, Markov chain Monte Carlo (MCMC) algorithms, and importance sampling (IS) schemes (Luengo et al., 2020; Martino et al., 2018). The last two families are the most popular because of their ease and universality of application (Liang et al., 2010; Liu, 2004; Robert and Casella, 2004). All MC methods require the choice of a suitable proposal density, which is crucial for their performance (Luengo et al., 2020; Robert and Casella, 2004).
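Why the proposal density matters can be seen in a small sketch that estimates a Gaussian tail probability, first by naive Monte Carlo and then by importance sampling with a proposal shifted into the tail. The threshold, sample size, and shifted-Gaussian proposal are illustrative assumptions, not choices taken from the survey.

```python
import math
import numpy as np

def log_normal_pdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
n = 100_000
threshold = 4.0

# Reference value of P(X > threshold) for X ~ N(0, 1), via the complementary error function.
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))

# Naive Monte Carlo: sample from the target itself; almost no sample reaches the tail.
x_naive = rng.normal(0.0, 1.0, size=n)
naive_estimate = np.mean(x_naive > threshold)

# Importance sampling: sample from N(threshold, 1) and correct with the density ratio.
x_is = rng.normal(threshold, 1.0, size=n)
weights = np.exp(log_normal_pdf(x_is, 0.0, 1.0) - log_normal_pdf(x_is, threshold, 1.0))
is_estimate = np.mean(weights * (x_is > threshold))

print(f"exact = {exact:.3e}, naive = {naive_estimate:.3e}, IS = {is_estimate:.3e}")
```

With the same number of samples, the shifted proposal places about half of its draws in the region of interest, whereas the naive sampler sees only a handful of tail events, if any.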
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Kallus, Nathan, Uehara, Masatoshi
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. The problem's importance has attracted many proposed solutions, including importance sampling (IS), self-normalized IS (SNIS), and doubly robust (DR) estimates. DR and its variants ensure semiparametric local efficiency if Q-functions are well-specified, but if they are not, DR can be worse than both IS and SNIS. DR also does not enjoy SNIS's inherent stability and boundedness. We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS. On the way, we categorize various properties and classify existing estimators by them.
Energy-Inspired Models: Learning with Sampler-Induced Distributions
Lawson, Dieterich, Tucker, George, Dai, Bo, Ranganath, Rajesh
Energy-based models (EBMs) are powerful probabilistic models, but suffer from intractable sampling and density evaluation due to the partition function. As a result, inference in EBMs relies on approximate sampling algorithms, leading to a mismatch between the model and inference. Motivated by this, we consider the sampler-induced distribution as the model of interest and maximize the likelihood of this model. This yields a class of energy-inspired models (EIMs) that incorporate learned energy functions while still providing exact samples and tractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling. These models outperform or perform comparably to the recently proposed Learned Accept/Reject Sampling algorithm and provide new insights on ranking Noise Contrastive Estimation and Contrastive Predictive Coding. Moreover, EIMs allow us to generalize a recent connection between multi-sample variational lower bounds and auxiliary variable variational inference. We show how recent variational bounds can be unified with EIMs as the variational family.
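The idea of treating a sampler's output distribution as the model can be sketched for the self-normalized importance sampling case: draw several candidates from a proposal, weight them by a learned energy, and return one candidate chosen in proportion to its weight. This is a toy illustration under stated assumptions, not the authors' implementation; the fixed linear energy, the standard normal proposal, and the name `snis_sample` are all hypothetical.

```python
import numpy as np

def snis_sample(energy, proposal_sampler, k, rng):
    """Draw one exact sample from the SNIS sampler-induced distribution.

    k candidates are drawn from the proposal, weighted by exp(-energy), and one
    is selected with probability proportional to its weight.
    """
    candidates = proposal_sampler(k, rng)
    log_w = -energy(candidates)
    log_w = log_w - log_w.max()           # stabilize the exponentiation
    probs = np.exp(log_w)
    probs = probs / probs.sum()
    return candidates[rng.choice(k, p=probs)]

def toy_energy(x):
    # Stand-in for a learned energy: lower energy for larger x.
    return -x

def standard_normal_proposal(k, rng):
    return rng.normal(0.0, 1.0, size=k)

rng = np.random.default_rng(0)
draws = np.array([
    snis_sample(toy_energy, standard_normal_proposal, k=64, rng=rng)
    for _ in range(2000)
])
sample_mean = draws.mean()
print(f"sample mean ~ {sample_mean:.3f}")
```

As the number of candidates grows, the induced distribution approaches the density proportional to proposal(x) * exp(-energy(x)), which for this toy choice is a unit-variance Gaussian centered at 1; with a finite number of candidates each draw is still an exact sample, but from the sampler-induced distribution rather than from that limit.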