snis
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
Energy-InspiredModels: Learningwith Sampler-InducedDistributions
This yields a class ofenergy-inspired models(EIMs) that incorporate learned energyfunctions while stillproviding exactsamples andtractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based ontruncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. The problem's importance has attracted many proposed solutions, including importance sampling (IS), self-normalized IS (SNIS), and doubly robust (DR) estimates. DR and its variants ensure semiparametric local efficiency if Q-functions are well-specified, but if they are not they can be worse than both IS and SNIS. It also does not enjoy SNIS's inherent stability and boundedness. We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS. On the way, we categorize various properties and classify existing estimators by them. Besides the theoretical guarantees, empirical studies suggest the new estimators provide advantages.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- North America > United States > Colorado (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Optimality in importance sampling: a gentle survey
Llorente, Fernando, Martino, Luca
Monte Carlo (MC) methods are powerful tools for numerical inference and optimization widely employed in statistics, signal processing and machine learning Liu (2004); Robert and Casella (2004). They are mainly used for computing approximately the solution of definite integrals, and by extension, of differential equations (for this reason, MC schemes can be considered stochastic quadrature rules). Although exact analytical solutions to integrals are always desirable, such unicorns are rarely available, specially in real-world systems. Many applications inevitably require the approximation of intractable integrals. Specifically, Bayesian methods need the computation of expectations with respect to posterior probability density function (pdf) which, generally, are analytically intractable Gelman et al. (2013). The MC methods can be divided in four main families: direct methods (based on transformations or random variables), accept-reject techniques, Markov chain Monte Carlo (MCMC) algorithms, and importance sampling (IS) schemes Luengo et al. (2020); Martino et al. (2018). The last two families are the most popular for the facility and universality of their possible application Liang et al. (2010); Liu (2004); Robert and Casella (2004). All the MC methods require the choice of a suitable proposal density that is crucial for their performance Luengo et al. (2020); Robert and Casella (2004).
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Europe > United Kingdom > England (0.04)
- Research Report (0.40)
- Instructional Material (0.40)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. The problem's importance has attracted many proposed solutions, including importance sampling (IS), self-normalized IS (SNIS), and doubly robust (DR) estimates. DR and its variants ensure semiparametric local efficiency if Q-functions are well-specified, but if they are not they can be worse than both IS and SNIS. It also does not enjoy SNIS's inherent stability and boundedness. We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS.
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Kallus, Nathan, Uehara, Masatoshi
Off-policy evaluation (OPE) in both contextual bandits and reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. The problem's importance has attracted many proposed solutions, including importance sampling (IS), self-normalized IS (SNIS), and doubly robust (DR) estimates. DR and its variants ensure semiparametric local efficiency if Q-functions are well-specified, but if they are not they can be worse than both IS and SNIS. It also does not enjoy SNIS's inherent stability and boundedness. We propose new estimators for OPE based on empirical likelihood that are always more efficient than IS, SNIS, and DR and satisfy the same stability and boundedness properties as SNIS. On the way, we categorize various properties and classify existing estimators by them.
Energy-Inspired Models: Learning with Sampler-Induced Distributions
Lawson, Dieterich, Tucker, George, Dai, Bo, Ranganath, Rajesh
Energy-based models (EBMs) are powerful probabilistic models, but suffer from intractable sampling and density evaluation due to the partition function. As a result, inference in EBMs relies on approximate sampling algorithms, leading to a mismatch between the model and inference. Motivated by this, we consider the sampler-induced distribution as the model of interest and maximize the likelihood of this model. This yields a class of energy-inspired models (EIMs) that incorporate learned energy functions while still providing exact samples and tractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling. These models outperform or perform comparably to the recently proposed Learned Accept/Reject Sampling algorithm and provide new insights on ranking Noise Contrastive Estimation and Contrastive Predictive Coding. Moreover, EIMs allow us to generalize a recent connection between multi-sample variational lower bounds and auxiliary variable variational inference. We show how recent variational bounds can be unified with EIMs as the variational family.
- North America > United States > Colorado (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Global Autoregressive Models for Data-Efficient Sequence Learning
Parshakova, Tetiana, Andreoli, Jean-Marc, Dymetman, Marc
Standard autoregressive seq2seq models are easily trained by max-likelihood, but tend to show poor results under small-data conditions. We introduce a class of seq2seq models, GAMs (Global Autoregressive Models), which combine an autoregressive component with a log-linear component, allowing the use of global \textit{a priori} features to compensate for lack of data. We train these models in two steps. In the first step, we obtain an \emph{unnormalized} GAM that maximizes the likelihood of the data, but is improper for fast inference or evaluation. In the second step, we use this GAM to train (by distillation) a second autoregressive model that approximates the \emph{normalized} distribution associated with the GAM, and can be used for fast inference and evaluation. Our experiments focus on language modelling under synthetic conditions and show a strong perplexity reduction of using the second autoregressive model over the standard one.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe (0.04)
- Asia > Middle East > Jordan (0.04)
- (3 more...)