nade-k
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.
Summary: The paper introduces an iterative extension of NADE (neural autoregressive distribution estimator), a generative model that uses a neural network with a variable number of inputs to model each conditional in an autoregressive factorization of a joint distribution. The paper builds on an order-agnostic version of NADE in which all dimensions not present in the input are modelled independently by the network at each autoregressive step. The main idea is to feed the network a prediction of the missing inputs at each iteration: the first iteration uses the marginal probabilities estimated from the training data, and each following iteration uses the factorial approximation obtained from NADE in the previous iteration (each dimension is predicted independently, conditioned on the input). The authors hypothesise that prediction in several steps is easier than in one step.
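The iteration described in the summary can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the single tanh hidden layer, the choice to concatenate the observation mask to the input, and every name (`iterative_fill`, `W`, `V`, `b`, `c`, `marginals`) are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def iterative_fill(x, mask, marginals, W, c, V, b, k=3):
    """Refine predictions for the missing entries of x over k steps.

    x         : (D,) binary vector; entries where mask == 0 are ignored
    mask      : (D,) 1.0 for observed dimensions, 0.0 for missing ones
    marginals : (D,) per-dimension training-set means (initial guess)
    W, c      : hypothetical encoder weights (H, 2D) and bias (H,)
    V, b      : hypothetical decoder weights (D, H) and bias (D,)
    """
    # Step 0: missing inputs start at the marginal probabilities.
    v = mask * x + (1.0 - mask) * marginals
    history = []
    for _ in range(k):
        # Assumption: the mask is fed to the network alongside the input.
        h = np.tanh(W @ np.concatenate([v, mask]) + c)
        # Factorial prediction: one independent Bernoulli mean per dimension.
        p = sigmoid(V @ h + b)
        # Observed values stay clamped; only missing ones are replaced.
        v = mask * x + (1.0 - mask) * p
        history.append(p)
    return v, history

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
D, H = 6, 4
x = rng.integers(0, 2, size=D).astype(float)
mask = np.array([1.0, 1.0, 0.0, 0.0, 1.0, 0.0])
marginals = np.full(D, 0.5)
W, c = rng.normal(size=(H, 2 * D)), np.zeros(H)
V, b = rng.normal(size=(D, H)), np.zeros(D)
v, history = iterative_fill(x, mask, marginals, W, c, V, b, k=3)
```

With trained parameters, each entry of `history` would be a successively refined reconstruction; here the weights are random, so the sketch only shows the data flow.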
Iterative Neural Autoregressive Distribution Estimator NADE-k
Tapani Raiko, Yao Li, Kyunghyun Cho, Yoshua Bengio
Training of the neural autoregressive density estimator (NADE) can be viewed as doing one step of probabilistic inference on missing values in data. We propose a new model that extends this inference scheme to multiple steps, arguing that it is easier to learn to improve a reconstruction in k steps rather than to learn to reconstruct in a single inference step. The proposed model is an unsupervised building block for deep learning that combines the desirable properties of NADE and multi-prediction training: (1) Its test likelihood can be computed analytically, (2) it is easy to generate independent samples from it, and (3) it uses an inference engine that is a superset of variational inference for Boltzmann machines. The proposed NADE-k is competitive with the state-of-the-art in density estimation on the two datasets tested.
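Property (1) follows from the autoregressive factorization that NADE-type models rely on. As a worked form, assuming a $D$-dimensional input $\mathbf{x}$ and an ordering $o$ of its dimensions (both symbols are introduced here for illustration and do not appear in the abstract), the chain rule gives:

```latex
\log p(\mathbf{x}) = \sum_{d=1}^{D} \log p\left(x_{o_d} \mid \mathbf{x}_{o_{<d}}\right)
```

Each term is one conditional produced by the network given the dimensions that come earlier in the ordering, so the sum is exact and needs no partition function. The same factorization explains property (2): sampling each dimension in turn from its conditional yields an independent sample from the model.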