 neural processing system



Bayesian vs. PAC-Bayesian Deep Neural Network Ensembles

arXiv.org Artificial Intelligence

Bayesian neural networks address epistemic uncertainty by learning a posterior distribution over model parameters. Sampling and weighting networks according to this posterior yields an ensemble model referred to as a Bayes ensemble. Ensembles of neural networks (deep ensembles) can profit from the cancellation-of-errors effect: errors made by individual ensemble members may average out, so that the deep ensemble achieves better predictive performance than each individual network. We argue that neither the sampling nor the weighting in a Bayes ensemble is particularly well suited for increasing generalization performance, as neither supports the cancellation-of-errors effect; in the limit, this follows from the Bernstein-von Mises theorem for misspecified models. In contrast, a weighted average of models where the weights are optimized by minimizing a PAC-Bayesian generalization bound can improve generalization performance. This requires that the optimization take correlations between models into account, which can be achieved by minimizing the tandem loss, at the cost that hold-out data for estimating error correlations must be available. The PAC-Bayesian weighting increases robustness against correlated models and against models with lower performance in an ensemble. This allows us to safely add several models from the same learning process to an ensemble, instead of using early stopping to select a single weight configuration. Our study presents empirical results supporting these conceptual considerations on four different classification datasets. We show that state-of-the-art Bayes ensembles from the literature, despite being computationally demanding, do not improve over simple uniformly weighted deep ensembles and cannot match the performance of deep ensembles weighted by optimizing the tandem loss, which additionally come with non-vacuous generalization guarantees.
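To make the role of error correlations concrete, the following is a minimal sketch, not the authors' implementation, of how an empirical tandem loss could be estimated from hold-out predictions. It assumes the usual definition of the tandem loss of two models as the fraction of hold-out points that both misclassify. The function names, the toy predictions, and the two weight vectors are hypothetical, and the complexity term of the full PAC-Bayesian bound is omitted.

```python
import numpy as np

def tandem_loss_matrix(preds, labels):
    """Empirical tandem losses on hold-out data.

    preds:  (n_models, n_samples) array of predicted class labels.
    labels: (n_samples,) array of true class labels.
    Entry (i, j) is the fraction of samples that models i and j both get wrong.
    """
    errors = (preds != labels[None, :]).astype(float)  # 1.0 where a model errs
    return errors @ errors.T / labels.shape[0]

def weighted_tandem_loss(rho, tandem):
    """Expected tandem loss when two models are drawn independently from rho."""
    return float(rho @ tandem @ rho)

# Toy hold-out set (hypothetical): models 0 and 1 make identical errors,
# model 2 errs on a different sample.
labels = np.array([0, 1, 1, 0])
preds = np.array([
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
])

tandem = tandem_loss_matrix(preds, labels)
uniform = np.full(3, 1.0 / 3.0)
decorrelated = np.array([0.25, 0.25, 0.5])  # shifts weight away from the correlated pair

loss_uniform = weighted_tandem_loss(uniform, tandem)
loss_decorrelated = weighted_tandem_loss(decorrelated, tandem)
print(loss_uniform, loss_decorrelated)
```

In this toy case all three models have the same individual error rate, yet moving weight away from the two perfectly correlated models lowers the weighted tandem loss. An optimizer over the weight simplex would exploit the same effect.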


Provably Convergent Algorithms for Solving Inverse Problems Using Generative Models

arXiv.org Machine Learning

The traditional approach of hand-crafting priors (such as sparsity) for solving inverse problems is slowly being replaced by the use of richer learned priors (such as those modeled by deep generative networks). In this work, we study the algorithmic aspects of such a learning-based approach from a theoretical perspective. For certain generative network architectures, we establish a simple non-convex algorithmic approach that (a) theoretically enjoys linear convergence guarantees for certain linear and nonlinear inverse problems, and (b) empirically improves upon conventional techniques such as back-propagation. We support our claims with experimental results on various inverse problems. We also propose an extension of our approach that can handle model mismatch (i.e., situations where the generative network prior is not exactly applicable). Together, our contributions serve as building blocks towards a principled use of generative models in inverse problems, with a more complete algorithmic understanding.
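The abstract does not name the algorithm, so the sketch below is an assumption rather than the authors' method: it uses projected gradient descent, a common scheme for inverse problems with a generative prior, in which a gradient step on the measurement loss is followed by a projection onto the range of the generator. To keep the example runnable and the projection exact, the generator is a hypothetical linear map G(z) = Wz, which makes the range a subspace and the problem convex. For a deep network the projection would itself have to be approximated and the problem would be non-convex. All names and dimensions are illustrative.

```python
import numpy as np

def project_onto_range(W, x):
    """Project x onto the range of the linear generator G(z) = W @ z."""
    z, *_ = np.linalg.lstsq(W, x, rcond=None)  # least-squares latent code
    return W @ z

def projected_gradient_descent(A, y, W, step, n_iters=500):
    """Recover x in range(W) from linear measurements y = A @ x."""
    x = np.zeros(W.shape[0])  # the zero vector lies in range(W)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)                    # gradient of 0.5 * ||A x - y||^2
        x = project_onto_range(W, x - step * grad)  # step, then project
    return x

rng = np.random.default_rng(0)
n, k, m = 50, 5, 40                        # signal, latent, measurement dimensions
W = rng.standard_normal((n, k))            # hypothetical linear "generator"
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = W @ rng.standard_normal(k)        # ground truth inside the generator range
y = A @ x_true                             # noiseless measurements, m < n

step = 1.0 / np.linalg.norm(A, 2) ** 2     # conservative step from the spectral norm
x_hat = projected_gradient_descent(A, y, W, step)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(rel_err)
```

Although there are fewer measurements than unknowns, the iterate stays in the five-dimensional generator range, which is what makes recovery possible in this toy setting.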


Recipe for neuromorphic processing systems?

#artificialintelligence

IMAGE: Like any recipe, an ideal memristive neuromorphic computing system requires a special blend of CMOS circuits and memristive devices, as well as spatial resources and temporal dynamics that must be...

WASHINGTON, March 24, 2020 -- During the 1990s, Carver Mead and colleagues combined basic research in neuroscience with elegant analog circuit design in electronic engineering. This pioneering work on neuromorphic electronic circuits inspired researchers in Germany and Switzerland to explore the possibility of reproducing the physics of real neural circuits by using the physics of silicon. The field of "brain-mimicking" neuromorphic electronics shows great potential not only for basic research but also for commercial exploitation of always-on edge computing and "internet of things" applications. In Applied Physics Letters, from AIP Publishing, Elisabetta Chicca, from Bielefeld University, and Giacomo Indiveri, from the University of Zurich and ETH Zurich, present their work to understand how neural processing systems in biology carry out computation, as well as a recipe to reproduce these computing principles in mixed-signal analog/digital electronics and novel materials. One of the most distinctive computational features of neural networks is learning, so Chicca and Indiveri are particularly interested in reproducing the adaptive and plastic properties of real synapses.