An Information Theoretic Approach to the Functional Classification of Neurons

Neural Information Processing Systems

A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured by the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells.
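The clustering criterion can be sketched numerically: compare the information that cell identity carries about the response with the information retained when cells are merged into clusters. The response distributions and the two-cluster grouping below are toy assumptions, not the paper's data:

```python
import numpy as np

def mutual_info(joint):
    """Mutual information (bits) of a joint probability table."""
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Toy p(response | cell) for 4 cells and 3 response bins.
# Cells 0,1 behave alike; cells 2,3 behave alike.
p_r_given_cell = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
])
joint_cells = p_r_given_cell / p_r_given_cell.shape[0]  # uniform prior over cells

# Merge cells into two clusters and form the cluster/response joint table.
clusters = [[0, 1], [2, 3]]
joint_clusters = np.array([joint_cells[c].sum(axis=0) for c in clusters])

i_cells = mutual_info(joint_cells)
i_clusters = mutual_info(joint_clusters)
print(f"I(cell; response)    = {i_cells:.3f} bits")
print(f"I(cluster; response) = {i_clusters:.3f} bits "
      f"({100 * i_clusters / i_cells:.0f}% of the diversity captured)")
```

A good clustering keeps I(cluster; response) close to I(cell; response); in this toy case the two-cluster grouping captures most of the diversity.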


Discriminative Densities from Maximum Contrast Estimation

Neural Information Processing Systems

We propose a framework for classifier design based on discriminative densities for representation of the differences of the class-conditional distributions in a way that is optimal for classification. The densities are selected from a parametrized set by constrained maximization of some objective function which measures the average (bounded) difference, i.e. the contrast between discriminative densities. We show that maximization of the contrast is equivalent to minimization of an approximation of the Bayes risk.


Adaptive Scaling for Feature Selection in SVMs

Neural Information Processing Systems

This paper introduces an algorithm for the automatic relevance determination of input variables in kernelized Support Vector Machines. Relevance is measured by scale factors defining the input space metric, and feature selection is performed by assigning zero weights to irrelevant variables. The metric is automatically tuned by the minimization of the standard SVM empirical risk, where scale factors are added to the usual set of parameters defining the classifier. Feature selection is achieved by constraints encouraging the sparsity of scale factors. The resulting algorithm compares favorably to state-of-the-art feature selection procedures and demonstrates its effectiveness on a demanding facial expression recognition problem.
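The core mechanism can be sketched with an anisotropic RBF kernel: per-feature scale factors define the input metric, and driving a scale factor to zero removes that feature from the classifier. The data points and scale vectors below are illustrative assumptions, not the paper's tuning procedure:

```python
import numpy as np

def scaled_rbf(x, z, sigma):
    """Anisotropic RBF kernel: per-feature scale factors define the metric."""
    d = sigma * (x - z)            # a zero scale factor removes that feature
    return np.exp(-np.dot(d, d))

x = np.array([1.0, 5.0])
z = np.array([1.2, -3.0])          # feature 1 differs wildly between x and z

sigma_dense  = np.array([1.0, 1.0])  # both features enter the metric
sigma_sparse = np.array([1.0, 0.0])  # feature 1 pruned as irrelevant

print(scaled_rbf(x, z, sigma_dense))   # tiny: feature 1 dominates the distance
print(scaled_rbf(x, z, sigma_sparse))  # depends on feature 0 only
```

In the paper the scale factors are learned by minimizing the SVM empirical risk under sparsity-encouraging constraints; here they are fixed by hand purely to show how a zero weight makes the kernel invariant to a feature.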


Support Vector Machines for Multiple-Instance Learning

Neural Information Processing Systems

This paper presents two new formulations of multiple-instance learning as a maximum margin problem. The proposed extensions of the Support Vector Machine (SVM) learning approach lead to mixed integer quadratic programs that can be solved heuristically. Our generalization of SVMs makes a state-of-the-art classification technique, including nonlinear classification via kernels, available to an area that up to now has been largely dominated by special purpose methods. We present experimental results on a pharmaceutical dataset and on applications in automated image indexing and document categorization. Multiple-instance learning (MIL) [4] is a generalization of supervised classification in which training class labels are associated with sets of patterns, or bags, instead of individual patterns. While every pattern may possess an associated true label, it is assumed that pattern labels are only indirectly accessible through labels attached to bags.
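The standard MIL assumption is that a bag is positive iff at least one of its instances is positive, so a bag's score is the maximum instance score under the separating hyperplane. The hyperplane and bags below are toy assumptions used only to illustrate this decision rule:

```python
import numpy as np

def bag_score(instances, w, b):
    """MIL decision rule: a bag is positive iff its most positive
    instance lies on the positive side of the hyperplane (w, b)."""
    return max(np.dot(w, x) + b for x in instances)

w, b = np.array([1.0, -1.0]), -0.5

pos_bag = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]  # one "witness" instance
neg_bag = [np.array([0.0, 1.0]), np.array([0.2, 0.3])]  # all instances negative

print(bag_score(pos_bag, w, b))   # > 0 -> bag labeled positive
print(bag_score(neg_bag, w, b))   # < 0 -> bag labeled negative
```

The paper's formulations optimize the margin over which instances act as witnesses, which is what makes the resulting programs mixed integer.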


Information Diffusion Kernels

Neural Information Processing Systems

A new family of kernels for statistical learning is introduced that exploits the geometric structure of statistical models. Based on the heat equation on the Riemannian manifold defined by the Fisher information metric, information diffusion kernels generalize the Gaussian kernel of Euclidean space, and provide a natural way of combining generative statistical modeling with nonparametric discriminative learning. As a special case, the kernels give a new approach to applying kernel-based learning algorithms to discrete data. Bounds on covering numbers for the new kernels are proved using spectral theory in differential geometry, and experimental results are presented for text classification.
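For the multinomial family, the Fisher metric maps the probability simplex onto a sphere, so the geodesic distance has a closed form and the heat kernel can be approximated in Gaussian form. The sketch below uses that first-order approximation; the diffusion time t and the example distributions are assumptions for illustration:

```python
import numpy as np

def diffusion_kernel(p, q, t=0.1):
    """Approximate information diffusion kernel on the multinomial simplex.
    Under the Fisher metric the geodesic distance between distributions is
    d = 2*arccos(sum_i sqrt(p_i * q_i)), and the first-order heat-kernel
    approximation is exp(-d^2 / (4t))."""
    cos_half = np.clip(np.sqrt(p * q).sum(), 0.0, 1.0)
    d = 2.0 * np.arccos(cos_half)
    return np.exp(-d * d / (4.0 * t))

p = np.array([0.7, 0.2, 0.1])   # e.g. normalized term frequencies of a document
q = np.array([0.6, 0.3, 0.1])

print(diffusion_kernel(p, p))   # identical distributions: zero distance
print(diffusion_kernel(p, q))   # decays with geodesic distance
```

For text classification this replaces the Euclidean distance of a Gaussian kernel with the information geometry of the multinomial model fitted to each document.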


The Decision List Machine

Neural Information Processing Systems

We introduce a new learning algorithm for decision lists to allow features that are constructed from the data and to allow a tradeoff between accuracy and complexity. We bound its generalization error in terms of the number of errors and the size of the classifier it finds on the training data. We also compare its performance on some natural data sets with the set covering machine and the support vector machine.
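A decision list is an ordered sequence of if-then rules evaluated until one fires, with a default label at the end. The rules and features below are hypothetical examples, not ones learned by the paper's algorithm:

```python
def predict(decision_list, default, x):
    """Evaluate a decision list: the first rule whose test fires gives the label."""
    for test, label in decision_list:
        if test(x):
            return label
    return default

# Hypothetical rules over feature dicts; list order encodes rule priority.
rules = [
    (lambda x: x["temp"] > 38.0,            "sick"),
    (lambda x: x["cough"] and x["fatigue"], "sick"),
]

print(predict(rules, "healthy", {"temp": 39.1, "cough": False, "fatigue": False}))
print(predict(rules, "healthy", {"temp": 36.8, "cough": True,  "fatigue": True}))
print(predict(rules, "healthy", {"temp": 36.8, "cough": False, "fatigue": True}))
```

The size of such a list (number of rules) is exactly the complexity term that appears in the generalization bound traded off against training accuracy.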


Automatic Derivation of Statistical Algorithms: The EM Family and Beyond

Neural Information Processing Systems

Machine learning has reached a point where many probabilistic methods can be understood as variations, extensions and combinations of a much smaller set of abstract themes, e.g., as different instances of the EM algorithm. This enables the systematic derivation of algorithms customized for different models. Here, we describe the AUTOBAYES system which takes a high-level statistical model specification, uses powerful symbolic techniques based on schema-based program synthesis and computer algebra to derive an efficient specialized algorithm for learning that model, and generates executable code implementing that algorithm. This capability is far beyond that of code collections such as Matlab toolboxes or even tools for model-independent optimization such as BUGS for Gibbs sampling: complex new algorithms can be generated without new programming, algorithms can be highly specialized and tightly crafted for the exact structure of the model and data, and efficient and commented code can be generated for different languages or systems.



Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch

Neural Information Processing Systems

We have implemented a real time front end for detecting voiced speech and estimating its fundamental frequency. The front end performs the signal processing for voice-driven agents that attend to the pitch contours of human speech and provide continuous audiovisual feedback. The algorithm we use for pitch tracking has several distinguishing features: it makes no use of FFTs or autocorrelation at the pitch period; it updates the pitch incrementally on a sample-by-sample basis; it avoids peak picking and does not require interpolation in time or frequency to obtain high resolution estimates; and it works reliably over a four octave range, in real time, without the need for postprocessing to produce smooth contours. The algorithm is based on two simple ideas in neural computation: the introduction of a purposeful nonlinearity, and the error signal of a least squares fit. The pitch tracker is used in two real time multimedia applications: a voice-to-MIDI player that synthesizes electronic music from vocalized melodies, and an audiovisual Karaoke machine with multimodal feedback. Both applications run on a laptop and display the user's pitch scrolling across the screen as he or she sings into the computer.
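The idea of estimating frequency from the error signal of a least squares fit, without FFTs or autocorrelation, can be illustrated with a generic textbook estimator (this is not the paper's tracker, which is incremental and robust over four octaves): a pure sinusoid satisfies x[n-1] + x[n+1] = 2 cos(w) x[n], so cos(w) is the least-squares slope relating those two sequences.

```python
import numpy as np

def sinusoid_freq(x, fs):
    """Frequency (Hz) of a near-pure tone via least squares, no FFT:
    a sinusoid obeys x[n-1] + x[n+1] = 2*cos(w)*x[n], so cos(w) is the
    least-squares regression of (x[n-1] + x[n+1]) on 2*x[n]."""
    y = x[:-2] + x[2:]              # x[n-1] + x[n+1]
    z = 2.0 * x[1:-1]               # 2 * x[n]
    cos_w = np.dot(z, y) / np.dot(z, z)
    return fs * np.arccos(np.clip(cos_w, -1.0, 1.0)) / (2.0 * np.pi)

fs = 8000.0
n = np.arange(400)
tone = np.sin(2 * np.pi * 220.0 * n / fs)   # synthetic 220 Hz test tone

print(round(sinusoid_freq(tone, fs), 1))
```

The sample rate, window length and test frequency above are arbitrary choices; real voiced speech is not a pure tone, which is where the paper's purposeful nonlinearity and sample-by-sample updates come in.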


Automatic Acquisition and Efficient Representation of Syntactic Structures

Neural Information Processing Systems

The distributional principle according to which morphemes that occur in identical contexts belong, in some sense, to the same category [1] has been advanced as a means for extracting syntactic structures from corpus data. We extend this principle by applying it recursively, and by using mutual information for estimating category coherence. The resulting model learns, in an unsupervised fashion, highly structured, distributed representations of syntactic knowledge from corpora. It also exhibits promising behavior in tasks usually thought to require representations anchored in a grammar, such as systematicity.
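The distributional principle itself can be sketched by comparing the context profiles of words in a toy corpus; the corpus and the crude overlap score below are illustrative assumptions, far simpler than the paper's recursive, mutual-information-based model:

```python
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat while a dog sat on a rug "
          "the dog ran and a cat ran").split()

# Distributional profile of each word: counts of its right-hand neighbors.
contexts = defaultdict(Counter)
for w, nxt in zip(corpus, corpus[1:]):
    contexts[w][nxt] += 1

def overlap(a, b):
    """Crude profile similarity: shared mass of the right-context counts."""
    ca, cb = contexts[a], contexts[b]
    shared = sum(min(ca[k], cb[k]) for k in ca)
    return shared / max(sum(ca.values()), sum(cb.values()))

print(overlap("the", "a"))     # determiners share contexts (cat, dog, ...)
print(overlap("the", "sat"))   # determiner vs. verb: no shared contexts
```

Words with similar context profiles ("the" and "a") end up in the same category, which is the starting point the paper extends recursively to larger structures.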