Goto

Collaborating Authors

 rbm network


On the Representational Efficiency of Restricted Boltzmann Machines

Neural Information Processing Systems

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM's unnormalized log-likelihood function as a type of neural network (called an RBM network), and through a series of simulation results relate these networks to types that are better understood. We show the surprising result that RBM networks can efficiently compute any function that depends on the number of 1's in the input, such as parity. We also provide the first known example of a particular type of distribution which provably cannot be efficiently represented by an RBM (or equivalently, cannot be efficiently computed by an RBM network), assuming a realistic exponential upper bound on the size of the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones.


7bb060764a818184ebb1cc0d43d382aa-Reviews.html

Neural Information Processing Systems

The paper provides a theoretical analysis of the types of distributions that can efficiently be represented by restricted Boltzmann machines (RBMs). The analysis is based on a representation of the unnormalized log probability (free energy) of RBMs as a special form of neural network (NN). The paper relates these RBM networks to more common types of NNs whose properties have been studied in the literature. This approach allows the authors to identify two non-trivial examples of functions that can and cannot be represented efficiently by RBM networks - and hence related distributions can / cannot be modeled efficiently by RBMs. Specifically they show that RBM networks can efficiently represent any function that only depends on the number of non-zero visible units, such as parity, but that they are unable to represent the only somewhat more difficult example of inner product parity.


On the Representational Efficiency of Restricted Boltzmann Machines James Martens Richard Zemel Department of Computer Science

Neural Information Processing Systems

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM's unnormalized log-likelihood function as a type of neural network, and through a series of simulation results relate these networks to ones whose representational properties are better understood. We show the surprising result that RBMs can efficiently capture any distribution whose density depends on the number of 1's in their input. We also provide the first known example of a particular type of distribution that provably cannot be efficiently represented by an RBM, assuming a realistic exponential upper bound on the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones.


On the Representational Efficiency of Restricted Boltzmann Machines

Martens, James, Chattopadhya, Arkadev, Pitassi, Toni, Zemel, Richard

Neural Information Processing Systems

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM's unnormalized log-likelihood function as a type of neural network (called an RBM network), and through a series of simulation results relate these networks to types that are better understood. We show the surprising result that RBM networks can efficiently compute any function that depends on the number of 1's in the input, such as parity. We also provide the first known example of a particular type of distribution which provably cannot be efficiently represented by an RBM (or equivalently, cannot be efficiently computed by an RBM network), assuming a realistic exponential upper bound on the size of the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones.


Short sighted deep learning

Koch, Ellen de Melllo, Koch, Anita de Mello, Kastanos, Nicholas, Cheng, Ling

arXiv.org Machine Learning

A theory explaining how deep learning works is yet to be developed. Previous work suggests that deep learning performs a coarse graining, similar in spirit to the renormalization group (RG). This idea has been explored in the setting of a local (nearest neighbor interactions) Ising spin lattice. We extend the discussion to the setting of a long range spin lattice. Markov Chain Monte Carlo (MCMC) simulations determine both the critical temperature and scaling dimensions of the system. The model is used to train both a single RBM (restricted Boltzmann machine) network, as well as a stacked RBM network. Following earlier Ising model studies, the trained weights of a single layer RBM network define a flow of lattice models. In contrast to results for nearest neighbor Ising, the RBM flow for the long ranged model does not converge to the correct values for the spin and energy scaling dimension. Further, correlation functions between visible and hidden nodes exhibit key differences between the stacked RBM and RG flows. The stacked RBM flow appears to move towards low temperatures whereas the RG flow moves towards high temperature. This again differs from results obtained for nearest neighbor Ising.


On the Representational Efficiency of Restricted Boltzmann Machines

Martens, James, Chattopadhya, Arkadev, Pitassi, Toni, Zemel, Richard

Neural Information Processing Systems

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)? We characterize the RBM's unnormalized log-likelihood function as a type of neural network (called an RBM network), and through a series of simulation results relate these networks to types that are better understood. We show the surprising result that RBM networks can efficiently compute any function that depends on the number of 1's in the input, such as parity. We also provide the first known example of a particular type of distribution which provably cannot be efficiently represented by an RBM (or equivalently, cannot be efficiently computed by an RBM network), assuming a realistic exponential upper bound on the size of the weights. By formally demonstrating that a relatively simple distribution cannot be represented efficiently by an RBM our results provide a new rigorous justification for the use of potentially more expressive generative models, such as deeper ones.