Constructing Deep Neural Networks by Bayesian Network Structure Learning

Neural Information Processing Systems

We introduce a principled approach for unsupervised structure learning of deep neural networks. We propose a new interpretation for depth and inter-layer connectivity where conditional independencies in the input distribution are encoded hierarchically in the network structure. Thus, the depth of the network is determined inherently. The proposed method casts the problem of neural network structure learning as a problem of Bayesian network structure learning. Then, instead of directly learning the discriminative structure, it learns a generative graph, constructs its stochastic inverse, and then constructs a discriminative graph. We prove that conditional-dependency relations among the latent variables in the generative graph are preserved in the class-conditional discriminative graph. We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures while maintaining classification accuracy, which is state-of-the-art on the tested benchmarks. Our structure learning algorithm has a small computational cost and runs efficiently on a standard desktop CPU.


Reviews: Constructing Deep Neural Networks by Bayesian Network Structure Learning

Neural Information Processing Systems

The presented method learns the structure of a deep artificial neural network (ANN) by first learning a Bayesian network (BN) and then constructing the ANN from this BN. The authors state that they "propose a new interpretation for depth and inter-layer connectivity in deep neural networks". Neurons in deep layers represent low-order conditional independencies (i.e., small conditioning sets), and those in 'early' (non-deep) layers represent high-order conditional independence (CI) relationships. These are all CI relations in "X", i.e., the input vector of (observed) random variables. Perhaps I am missing something here, but I could not find an argument as to why this is a principled way to build deep ANNs with good performance.
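To make the notion of the "order" of a CI relation concrete, here is a minimal sketch, not the authors' algorithm. It assumes Gaussian data and uses partial correlation as the independence measure; the paper may use a different test. The function names (`corr`, `partial_corr`) and the toy chain X -> Z -> Y are illustrative assumptions. The order is simply the size of the conditioning set.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(cols, i, j, cond=()):
    """Partial correlation of cols[i] and cols[j] given the columns in cond.

    len(cond) is the order of the test. Uses the standard recursion that
    removes one conditioning variable at a time.
    """
    if not cond:
        return corr(cols[i], cols[j])
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(cols, i, j, rest)
    r_ik = partial_corr(cols, i, k, rest)
    r_jk = partial_corr(cols, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


# Toy data (assumption): a chain X -> Z -> Y with Gaussian noise.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + 0.5 * rng.gauss(0, 1) for xi in x]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
cols = [x, y, z]

r_order0 = partial_corr(cols, 0, 1)        # X vs Y, empty conditioning set
r_order1 = partial_corr(cols, 0, 1, (2,))  # X vs Y given Z

print(f"order 0: {r_order0:.3f}")
print(f"order 1: {r_order1:.3f}")
```

In this toy chain, X and Y are strongly correlated at order 0 but become nearly uncorrelated once Z is conditioned on, so the independence is only detectable at order 1.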


Constructing Deep Neural Networks by Bayesian Network Structure Learning

arXiv.org Artificial Intelligence

We introduce a principled approach for unsupervised structure learning of deep neural networks. We propose a new interpretation for depth and inter-layer connectivity where conditional independencies in the input distribution are encoded hierarchically in the network structure. Thus, the depth of the network is determined inherently (equal to the maximal order of independence in the input distribution). The proposed method casts the problem of neural network structure learning as a problem of Bayesian network structure learning. Then, instead of directly learning the discriminative structure, it learns a generative graph, constructs its stochastic inverse, and then constructs a discriminative graph. We prove that conditional-dependency relations among the latent variables in the generative graph are preserved in the class-conditional discriminative graph. We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures while maintaining classification accuracy, which is state-of-the-art on the tested benchmarks. Our structure learning algorithm has a small computational cost and runs efficiently on a standard desktop CPU.