Collaborating Authors

Bialek, William


An Information Theoretic Approach to the Functional Classification of Neurons

Neural Information Processing Systems

A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured by the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells.
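As a rough illustration of this clustering criterion (a sketch of my own, not the authors' code; the `responses` dictionary and `cluster_of` mapping are hypothetical names), one can ask what fraction of the mutual information between cell identity and response survives when cells are replaced by their cluster labels:

```python
# Minimal sketch of the information-theoretic clustering intuition (not the paper's code).
# responses[c] is assumed to be a list of discrete response "words" for cell c,
# recorded under a common stimulus ensemble; cluster_of[c] is a candidate cluster label.
import numpy as np
from collections import Counter

def mutual_information(labels, words):
    """I(label; word) in bits, estimated from paired discrete samples."""
    joint = Counter(zip(labels, words))
    n = sum(joint.values())
    p_l, p_w = Counter(labels), Counter(words)
    mi = 0.0
    for (l, w), c in joint.items():
        p_lw = c / n
        mi += p_lw * np.log2(p_lw / ((p_l[l] / n) * (p_w[w] / n)))
    return mi

def fraction_captured(responses, cluster_of):
    """Fraction of I(cell; response) that survives when cells are replaced by clusters."""
    cells, words = [], []
    for cell, ws in responses.items():
        cells.extend([cell] * len(ws))
        words.extend(ws)
    clusters = [cluster_of[c] for c in cells]
    return mutual_information(clusters, words) / mutual_information(cells, words)
```

A good functional classification is then one whose cluster labels keep this fraction close to one with far fewer labels than cells.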


Entropy and Inference, Revisited

Neural Information Processing Systems

We study properties of popular near-uniform (Dirichlet) priors for learning undersampled probability distributions on discrete nonmetric spaces and show that they lead to disastrous results. However, an Occam-style phase space argument expands the priors into their infinite mixture and resolves most of the observed problems. This leads to a surprisingly good estimator of entropies of discrete distributions. Learning a probability distribution from examples is one of the basic problems in data analysis. Common practical approaches introduce a family of parametric models, leading to questions about model selection. In Bayesian inference, computing the total probability of the data arising from a model involves an integration over parameter space, and the resulting "phase space volume" automatically discriminates against models with larger numbers of parameters--hence the description of these volume terms as Occam factors [1, 2]. As we move from finite parameterizations to models that are described by smooth functions, the integrals over parameter space become functional integrals and methods from quantum field theory allow us to do these integrals asymptotically; again the volume in model space consistent with the data is larger for models that are smoother and hence less complex [3]. Further, at least under some conditions the relevant degree of smoothness can be determined self-consistently from the data, so that we approach something like a model independent method for learning a distribution [4]. The results emphasizing the importance of phase space factors in learning prompt us to look back at a seemingly much simpler problem, namely learning a distribution on a discrete, nonmetric space.
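To make the failure mode concrete, here is a small numerical sketch (my own construction, in nats, using the standard posterior-mean entropy formula for a symmetric Dirichlet prior; the bin and sample counts are illustrative): with many bins and few samples, the estimate stays pinned near the entropy preferred by the prior, which is what motivates averaging over the Dirichlet parameter.

```python
# Sketch of why a fixed symmetric Dirichlet(beta) prior is disastrous for entropy
# estimation in the undersampled regime (not code from the paper).
import numpy as np
from scipy.special import psi  # digamma

def dirichlet_posterior_mean_entropy(counts, beta):
    """Posterior-mean entropy (nats) under a symmetric Dirichlet(beta) prior."""
    counts = np.asarray(counts, dtype=float)
    K, N = counts.size, counts.sum()
    kappa = K * beta
    a = counts + beta
    return psi(N + kappa + 1.0) - np.sum(a / (N + kappa) * psi(a + 1.0))

# Example: K = 1000 bins, only N = 30 samples drawn from a random distribution.
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(1000))
counts = np.bincount(rng.choice(1000, size=30, p=p), minlength=1000)
for beta in (0.001, 0.02, 1.0):
    prior_value = dirichlet_posterior_mean_entropy(np.zeros(1000), beta)
    data_value = dirichlet_posterior_mean_entropy(counts, beta)
    print(f"beta={beta}: prior mean {prior_value:.2f}, posterior mean {data_value:.2f}")
# The estimate tracks the entropy the prior prefers rather than the data;
# the fix described above is to mix over beta so the prior on the entropy is nearly flat.
```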



Universality and Individuality in a Neural Code

Neural Information Processing Systems

This basic question in the theory of knowledge seems to be beyond the scope of experimental investigation. An accessible version of this question is whether different observers of the same sense data have the same neural representation of these data: how much of the neural code is universal, and how much is individual? Differences in the neural codes of different individuals may arise from various sources: First, different individuals may use different 'vocabularies' of coding symbols. Second, they may use the same symbols to encode different stimulus features. Third, they may have different latencies, so they 'say' the same things at slightly different times.


What Can a Single Neuron Compute?

Neural Information Processing Systems

In this paper we formulate a description of the computation performed by a neuron as a combination of dimensional reduction and nonlinearity. We implement this description for the Hodgkin-Huxley model, identify the most relevant dimensions, and find the nonlinearity. A two-dimensional description already captures a significant fraction of the information that spikes carry about dynamic inputs. This description also shows that computation in the Hodgkin-Huxley model is more complex than a simple integrate-and-fire or perceptron model. Classical neural network models approximate neurons as devices that sum their inputs and generate a nonzero output if the sum exceeds a threshold.


Learning Continuous Distributions: Simulations With Field Theoretic Priors

Neural Information Processing Systems

Learning of a smooth but nonparametric probability density can be regularized using methods of Quantum Field Theory. We implement a field theoretic prior numerically, test its efficacy, and show that the free parameter of the theory ('smoothness scale') can be determined self-consistently by the data; this forms an infinite dimensional generalization of the MDL principle. Finally, we study the implications of one's choice of the prior and the parameterization and conclude that the smoothness scale determination makes density estimation very weakly sensitive to the choice of the prior, and that even wrong choices can be advantageous for small data sets. One of the central problems in learning is to balance 'goodness of fit' criteria against the complexity of models. An important development in the Bayesian approach was thus the realization that there does not need to be any extra penalty for model complexity: if we compute the total probability that data are generated by a model, there is a factor from the volume in parameter space, the 'Occam factor', that discriminates against models with more parameters [1, 2].
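Schematically (this is a generic form of such a smoothness prior, reconstructed from the general literature rather than quoted from the paper, with normalization constants and boundary terms omitted), the density is written in terms of a field phi(x) whose roughness is penalized on a length scale ell:

```latex
% Schematic field-theoretic prior for nonparametric density estimation
% (a generic form, not quoted from the paper; constants and boundary terms omitted).
\[
  Q(x) \;=\; \frac{e^{-\phi(x)}}{\int dx'\, e^{-\phi(x')}},
  \qquad
  P[\phi] \;\propto\; \exp\!\left[ -\frac{\ell}{2} \int dx
    \left( \frac{\partial \phi}{\partial x} \right)^{2} \right],
\]
% The posterior balances the data term \prod_i Q(x_i) against the gradient penalty,
% and the smoothness scale \ell is itself selected by the data, as described above.
```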


What Can a Single Neuron Compute?

Neural Information Processing Systems

We implement this description for the Hodgkin-Huxley model, identify the most relevant dimensions, and find the nonlinearity. A two-dimensional description already captures a significant fraction of the information that spikes carry about dynamic inputs. This description also shows that computation in the Hodgkin-Huxley model is more complex than a simple integrate-and-fire or perceptron model. Classical neural network models approximate neurons as devices that sum their inputs and generate a nonzero output if the sum exceeds a threshold. From our current state of knowledge in neurobiology it is easy to criticize these models as oversimplified: where is the complex geometry of neurons, or the many different kinds of ion channel, each with its own intricate multistate kinetics? Indeed, progress at this more microscopic level of description has led us to the point where we can write (almost) exact models for the electrical dynamics of neurons, at least on short time scales. These nearly exact models are complicated by any measure, including tens if not hundreds of differential equations to describe the states of different channels in different spatial compartments of the cell. Faced with this detailed microscopic description, we need to answer a question which goes well beyond the biological context: given a continuous dynamical system, what does it compute? Our goal in this paper is to make this question about what a neuron computes somewhat more precise, and then to explore what we take to be the simplest example, namely the Hodgkin-Huxley model [1], [2] (and refs therein).
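The dimensional-reduction step described above can be approximated with standard spike-triggered analysis; the sketch below (my own, with assumed array shapes, not the paper's code) pulls out a low-dimensional stimulus subspace from the difference between the spike-triggered and prior covariances and returns the projections from which a nonlinearity P(spike | projection) can be read off:

```python
# Sketch of dimensional reduction via spike-triggered covariance (not the paper's code).
# stimulus_windows: (T, D) array, each row a stimulus segment preceding one time bin;
# spikes: (T,) array of 0/1 spike indicators for the same bins.
import numpy as np

def spike_triggered_analysis(stimulus_windows, spikes, n_dims=2):
    prior_cov = np.cov(stimulus_windows, rowvar=False)
    triggered = stimulus_windows[spikes == 1]
    sta = triggered.mean(axis=0)                      # spike-triggered average
    stc = np.cov(triggered, rowvar=False)             # spike-triggered covariance
    # Directions along which the spike-triggered ensemble differs most from the prior
    eigvals, eigvecs = np.linalg.eigh(stc - prior_cov)
    order = np.argsort(np.abs(eigvals))[::-1]
    modes = eigvecs[:, order[:n_dims]]                # low-dimensional relevant subspace
    proj = stimulus_windows @ modes                    # project every stimulus segment
    # The nonlinearity can then be estimated as P(spike | proj) via histograms of proj
    # for spiking versus all bins.
    return sta, modes, proj
```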


Statistical Reliability of a Blowfly Movement-Sensitive Neuron

Neural Information Processing Systems

We develop a model-independent method for characterizing the reliability of neural responses to brief stimuli. This approach allows us to measure the discriminability of similar stimuli, based on the real-time response of a single neuron. Neurophysiological data were obtained from a movement-sensitive neuron (H1) in the visual system of the blowfly Calliphora erythrocephala. Furthermore, recordings were made from blowfly photoreceptor cells to quantify the signal-to-noise ratios in the peripheral visual system. As photoreceptors form the input to the visual system, the reliability of their signals ultimately determines the reliability of any visual discrimination task. For the case of movement detection, this limit can be computed and compared to the H1 neuron's reliability. Under favorable conditions, the performance of the H1 neuron closely approaches the theoretical limit, which means that under these conditions the nervous system adds little noise in the process of computing movement from the correlations of signals in the photoreceptor array.
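For concreteness, here is a minimal sketch of measuring how well two similar stimuli can be told apart from single-trial responses of one neuron (my own construction, not the paper's analysis; `responses_a` and `responses_b` are hypothetical arrays of a scalar single-trial response measure, e.g. spike counts in a fixed window):

```python
# Sketch of two-alternative discrimination from single-trial responses (illustrative only).
import numpy as np

def discrimination_probability(responses_a, responses_b, bins=20):
    """Probability of correctly identifying stimulus A vs. B from one response,
    using histogram estimates of P(r|A), P(r|B) and a likelihood-ratio decision rule."""
    edges = np.histogram_bin_edges(np.concatenate([responses_a, responses_b]), bins=bins)
    p_a, _ = np.histogram(responses_a, bins=edges)
    p_b, _ = np.histogram(responses_b, bins=edges)
    p_a = (p_a + 1e-12) / p_a.sum()
    p_b = (p_b + 1e-12) / p_b.sum()
    # With equal priors, the ideal observer picks the stimulus with the larger likelihood,
    # and its probability of being correct is 0.5 * sum over bins of max(p_a, p_b).
    return 0.5 * np.sum(np.maximum(p_a, p_b))
```

The same quantity computed from photoreceptor signal-to-noise ratios would give the theoretical limit against which the H1 neuron's performance is compared.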