Goto

Collaborating Authors

 Europe


A Tighter Bound for Graphical Models

Neural Information Processing Systems

The neurons in these networks are the random variables, whereas the connections between them model the causal dependencies. Usually, some of the nodes have a direct relation with the random variables in the problem and are called'visibles'. The other nodes, known as'hiddens', are used to model more complex probability distributions. Learning in graphical models can be done as long as the likelihood that the visibles correspond to a pattern in the data set, can be computed. In general the time it takes, scales exponentially with the number of hidden neurons.


Foundations for a Circuit Complexity Theory of Sensory Processing

Neural Information Processing Systems

We introduce total wire length as salient complexity measure for an analysis of the circuit complexity of sensory processing in biological neural systems and neuromorphic engineering. This new complexity measure is applied to a set of basic computational problems that apparently need to be solved by circuits for translation-and scale-invariant sensory processing. We exhibit new circuit design strategies for these new benchmark functions that can be implemented within realistic complexity bounds, in particular with linear or almost linear total wire length. 1 Introduction Circuit complexity theory is a classical area of theoretical computer science, that provides estimates for the complexity of circuits for computing specific benchmark functions, such as binary addition, multiplication and sorting (see, e.g.


Second Order Approximations for Probability Models

Neural Information Processing Systems

In this paper, we derive a second order mean field theory for directed graphical probability models. By using an information theoretic argument it is shown how this can be done in the absense of a partition function. This method is a direct generalisation of the well-known TAP approximation for Boltzmann Machines. In a numerical example, it is shown that the method greatly improves the first order mean field approximation. For a restricted class of graphical models, so-called single overlap graphs, the second order method has comparable complexity to the first order method. For sigmoid belief networks, the method is shown to be particularly fast and effective.


Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks

Neural Information Processing Systems

Ascribing computational principles to neural feedback circuits is an important problem in theoretical neuroscience. We study symmetric threshold-linear networks and derive stability results that go beyond the insights that can be gained from Lyapunov theory or energy functions. By applying linear analysis to subnetworks composed of coactive neurons, we determine the stability of potential steady states. We find that stability depends on two types of eigenmodes. One type determines global stability and the other type determines whether or not multistability is possible.


From Margin to Sparsity

Neural Information Processing Systems

We present an improvement of Novikoff's perceptron convergence theorem. Reinterpreting this mistake bound as a margin dependent sparsity guarantee allows us to give a PACstyle generalisation error bound for the classifier learned by the perceptron learning algorithm. The bound value crucially depends on the margin a support vector machine would achieve on the same data set using the same kernel. Ironically, the bound yields better guarantees than are currently available for the support vector solution itself.


Competition and Arbors in Ocular Dominance

Neural Information Processing Systems

Hebbian and competitive Hebbian algorithms are almost ubiquitous in modeling pattern formation in cortical development. We analyse in theoretical detail a particular model (adapted from Piepenbrock & Obermayer, 1999) for the development of Id stripe-like patterns, which places competitive and interactive cortical influences, and free and restricted initial arborisation onto a common footing. 1 Introduction Cats, many species of monkeys, and humans exibit ocular dominance stripes, which are alternating areas of primary visual cortex devoted to input from (the thalamic relay associated with) just one or the other eye (see Erwin et aI, 1995; Miller, 1996; Swindale, 1996 for reviews of theory and data). These well-known fingerprint patterns have been a seductive target for models of cortical pattern formation because of the mix of competition and cooperation they suggest. A wealth of synaptic adaptation algorithms has been suggested to account for them (and also the concomitant refinement of the topography of the map between the eyes and the cortex), many of which are based on forms of Hebbian learning. Critical issues for the models are the degree of correlation between inputs from the eyes, the nature of the initial arborisation of the axonal inputs, the degree and form of cortical competition, and the nature of synaptic saturation (preventing weights from changing sign or getting too large) and normalisation (allowing cortical and/or thalamic cells to support only a certain total synaptic weight).


Algorithmic Stability and Generalization Performance

Neural Information Processing Systems

Until recently, most of the research in that area has focused on uniform a-priori bounds giving a guarantee that the difference between the training error and the test error is uniformly small for any hypothesis in a given class. These bounds are usually expressed in terms of combinatorial quantities such as VCdimension. In the last few years, researchers have tried to use more refined quantities to either estimate the complexity of the search space (e.g.


Efficient Learning of Linear Perceptrons

Neural Information Processing Systems

The resulting combinatorial problem - finding the best agreement half-space over an input sample - is NP hard to approximate to within some constant factor. We suggest a way to circumvent this theoretical bound by introducing a new measure of success for such algorithms. An algorithm is ILmargin successful if the agreement ratio of the half-space it outputs is as good as that of any half-space once training points that are inside the ILmargins of its separating hyper-plane are disregarded. We prove crisp computational complexity results with respect to this success measure: On one hand, for every positive IL, there exist efficient (poly-time) ILmargin successful learning algorithms. On the other hand, we prove that unless P NP, there is no algorithm that runs in time polynomial in the sample size and in 1/ IL that is ILmargin successful for all IL O. 1 Introduction We consider the computational complexity of learning linear perceptrons for arbitrary (Le.


Whence Sparseness?

Neural Information Processing Systems

It has been shown that the receptive fields of simple cells in VI can be explained by assuming optimal encoding, provided that an extra constraint of sparseness is added. This finding suggests that there is a reason, independent of optimal representation, for sparseness. However this work used an ad hoc model for the noise. Here I show that, if a biologically more plausible noise model, describing neurons as Poisson processes, is used sparseness does not have to be added as a constraint. Thus I conclude that sparseness is not a feature that evolution has striven for, but is simply the result of the evolutionary pressure towards an optimal representation.


Natural Sound Statistics and Divisive Normalization in the Auditory System

Neural Information Processing Systems

We explore the statistical properties of natural sound stimuli preprocessed with a bank of linear filters. The responses of such filters exhibit a striking form of statistical dependency, in which the response variance of each filter grows with the response amplitude of filters tuned for nearby frequencies. These dependencies may be substantially reduced using an operation known as divisive normalization, in which the response of each filter is divided by a weighted sum of the rectified responses of other filters. The weights may be chosen to maximize the independence of the normalized responses for an ensemble of natural sounds. We demonstrate that the resulting model accounts for nonlinearities in the response characteristics of the auditory nerve, by comparing model simulations to electrophysiological recordings.