binary activated deep neural network
Reviews: Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
One contribution is a new approach for training neural networks with binary activations. The second contribution is a set of PAC-Bayesian generalization bounds for binary activated neural networks that, when used as the training objective, come very close to the test accuracy. The gap between training and test performance is also much smaller. I think this is very promising for training more robust networks. The method actually recovers variational Bayesian learning when the coefficient C is fixed; unlike variational Bayes, however, this coefficient is learned in a principled way.
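To make the role of the coefficient C concrete, here is a minimal sketch of the classical Catoni-style PAC-Bayes bound for a fixed C > 0. It assumes the bound under review belongs to this family; the exact constants in the paper may differ. The function name, the default `delta`, and the numbers in the usage example are illustrative assumptions, not values from the paper. For fixed C the bound is an increasing function of C * emp_risk + kl / n, so minimizing it amounts to minimizing C * n * emp_risk + kl, which is the variational-Bayes-like objective the review alludes to.

```python
import math


def catoni_bound(emp_risk, kl, n, C, delta=0.05):
    """Catoni-style PAC-Bayes upper bound on the true risk for a fixed C > 0.

    emp_risk: empirical risk of the posterior Q on n samples, in [0, 1]
    kl:       KL(Q || P) between posterior Q and prior P
    delta:    confidence parameter (the bound holds with prob. >= 1 - delta)
    """
    complexity = (kl + math.log(1.0 / delta)) / n
    return (1.0 - math.exp(-C * emp_risk - complexity)) / (1.0 - math.exp(-C))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show how the bound moves with C.
    for C in (0.5, 1.0, 2.0, 4.0):
        print(C, catoni_bound(emp_risk=0.02, kl=500.0, n=60000, C=C))
```

Note that choosing C after seeing the data requires either a union bound over the candidate values or a bound that holds uniformly in C; this sketch evaluates each fixed C separately and ignores that correction.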
Reviews: Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
This work studies PAC-Bayes bound optimization in the setting of deep neural networks with binary activations. One of the stated contributions of the paper, showing how to optimize despite the binary activations providing no naive derivative, is in fact a known technique in the literature on variational inference. This somewhat undermines the impact of the work, though importing these ideas into the PAC-Bayes community is nice. The other contribution is obtaining nonvacuous bounds, and here it is impressive to see such tight bounds. I have a few issues to raise with the introduction, which I would like addressed in revisions. First, the authors write: "Although informative, these results upper bound the prediction error of a (stochastic) neural network with perturbed weights, which is not the one used to predict in practice".
Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
We present a comprehensive study of multilayer neural networks with binary activation, relying on the PAC-Bayesian theory. Our contributions are twofold: (i) we develop an end-to-end framework to train a binary activated deep neural network, (ii) we provide nonvacuous PAC-Bayesian generalization bounds for binary activated deep neural networks. Our results are obtained by minimizing the expected loss of an architecture-dependent aggregation of binary activated deep neural networks. Our analysis inherently overcomes the fact that the binary activation function is non-differentiable. The performance of our approach is assessed on a thorough numerical experiment protocol on real-life datasets.
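To illustrate how aggregating binary activated units can sidestep non-differentiability, here is a minimal single-neuron sketch. It assumes, for illustration only, an isotropic Gaussian N(w, I) over the neuron's weights; the abstract does not specify the distribution. Under that assumption the expected sign output has the closed form erf(w.x / (sqrt(2) * ||x||)), which is smooth in w. The function names, sample count, and test vectors are hypothetical.

```python
import math
import random


def expected_sign(w, x):
    """Closed form of E[sgn(v . x)] for v ~ N(w, I); requires x != 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(xi * xi for xi in x))
    return math.erf(dot / (math.sqrt(2.0) * norm))


def monte_carlo_sign(w, x, samples=20000, seed=0):
    """Estimate the same expectation by sampling weight vectors v ~ N(w, I)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        v = [rng.gauss(wi, 1.0) for wi in w]
        total += 1 if sum(vi * xi for vi, xi in zip(v, x)) > 0 else -1
    return total / samples


if __name__ == "__main__":
    w = [0.5, -1.0, 2.0]  # illustrative mean weights
    x = [1.0, 0.3, 0.8]   # illustrative input
    print(expected_sign(w, x), monte_carlo_sign(w, x))
```

The two printed values should agree up to sampling noise. The closed form is differentiable in w, so gradient-based training can act on the aggregated output even though each individual sign unit has zero gradient almost everywhere.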
Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks
Letarte, Gaël, Germain, Pascal, Guedj, Benjamin, Laviolette, Francois