to

### An Introduction to Variational Autoencoders

Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions.

### Implementing Variational Autoencoders in Keras: Beyond the Quickstart

It is a very well-designed library that clearly abides by its guiding principles of modularity and extensibility, enabling us to easily assemble powerful, complex models from primitive building blocks. This has been demonstrated in numerous blog posts and tutorials, in particular, the excellent tutorial on Building Autoencoders in Keras. As the name suggests, that tutorial provides examples of how to implement various kinds of autoencoders in Keras, including the variational autoencoder (VAE) [1]. Visualization of 2D manifold of MNIST digits (left) and the representation of digits in latent space colored according to their digit labels (right). Like all autoencoders, the variational autoencoder is primarily used for unsupervised learning of hidden representations.

### Variational Bayes on Monte Carlo Steroids

Variational approaches are often used to approximate intractable posteriors or normalization constants in hierarchical latent variable models. While often effective in practice, it is known that the approximation error can be arbitrarily large. We propose a new class of bounds on the marginal log-likelihood of directed latent variable models. Our approach relies on random projections to simplify the posterior. In contrast to standard variational methods, our bounds are guaranteed to be tight with high probability. We provide a new approach for learning latent variable models based on optimizing our new bounds on the log-likelihood. We demonstrate empirical improvements on benchmark datasets in vision and language for sigmoid belief networks, where a neural network is used to approximate the posterior.

### Least Square Variational Bayesian Autoencoder with Regularization

In recent years Variation Autoencoders have become one of the most popular unsupervised learning of complicated distributions.Variational Autoencoder (VAE) provides more efficient reconstructive performance over a traditional autoencoder. Variational auto enocders make better approximaiton than MCMC. The VAE defines a generative process in terms of ancestral sampling through a cascade of hidden stochastic layers. They are a directed graphic models. Variational autoencoder is trained to maximise the variational lower bound. Here we are trying maximise the likelihood and also at the same time we are trying to make a good approximation of the data. Its basically trading of the data log-likelihood and the KL divergence from the true posterior. This paper describes the scenario in which we wish to find a point-estimate to the parameters $\theta$ of some parametric model in which we generate each observations by first sampling a local latent variable and then sampling the associated observation. Here we use least square loss function with regularization in the the reconstruction of the image, the least square loss function was found to give better reconstructed images and had a faster training time.

### Variational Dropout and the Local Reparameterization Trick

We explore an as yet unexploited opportunity for drastically improving the efficiency of stochastic gradient variational Bayes (SGVB) with global model parameters. Regular SGVB estimators rely on sampling of parameters once per minibatch of data, and have variance that is constant w.r.t. the minibatch size. The efficiency of such estimators can be drastically improved upon by translating uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such reparameterizations with local noise can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence.We find an important connection with regularization by dropout: the original Gaussian dropout objective corresponds to SGVB with local noise, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose \emph{variational dropout}, a generalization of Gaussian dropout, but with a more flexibly parameterized posterior, often leading to better generalization. The method is demonstrated through several experiments.