Fusion of Similarity Data in Clustering

Neural Information Processing Systems

Fusing multiple information sources can yield significant benefits to successfully accomplish learning tasks. Many studies have focused on fusing information in supervised learning contexts. We present an approach to utilize multiple information sources in the form of similarity data for unsupervised learning. Based on similarity information, the clustering task is phrased as a nonnegative matrix factorization problem of a mixture of similarity measurements. The tradeoff between the informativeness of data sources and the sparseness of their mixture is controlled by an entropy-based weighting mechanism. For the purpose of model selection, a stability-based approach is employed to ensure the selection of the most self-consistent hypothesis. The experiments demonstrate the performance of the method on toy as well as real world data sets.


Assessing Approximations for Gaussian Process Classification

Neural Information Processing Systems

Gaussian processes are attractive models for probabilistic classification but unfortunately exact inference is analytically intractable. We compare Laplace's method and Expectation Propagation (EP) focusing on marginal likelihood estimates and predictive performance. We explain theoretically and corroborate empirically that EP is superior to Laplace. We also compare to a sophisticated MCMC scheme and show that EP is surprisingly accurate. In recent years models based on Gaussian process (GP) priors have attracted much attention in the machine learning community. Whereas inference in the GP regression model with Gaussian noise can be done analytically, probabilistic classification using GPs is analytically intractable. Several approaches to approximate Bayesian inference have been suggested, including Laplace's approximation, Expectation Propagation (EP), variational approximations and Markov chain Monte Carlo (MCMC) sampling, some of these in conjunction with generalisation bounds, online learning schemes and sparse approximations. Despite the abundance of recent work on probabilistic GP classifiers, most experimental studies provide only anecdotal evidence, and no clear picture has yet emerged, as to when and why which algorithm should be preferred.


Benchmarking Non-Parametric Statistical Tests

Neural Information Processing Systems

Although nonparametric tests have already been proposed for that purpose, statistical significance tests for nonstandard measures (different from the classification error) are less often used in the literature. This paper is an attempt at empirically verifying how these tests compare with more classical tests, under various conditions. More precisely, using a very large dataset to estimate the whole "population", we analyzed the behavior of several statistical tests, varying the class unbalance, the compared models, the performance measure, and the sample size. The main result is that, provided the evaluation sets are big enough, nonparametric tests are relatively reliable in all conditions.


A matching pursuit approach to sparse Gaussian process regression

Neural Information Processing Systems

In this paper we propose a new basis selection criterion for building sparse GP regression models that provides promising gains in accuracy as well as efficiency over previous methods. Our algorithm is much faster than that of Smola and Bartlett, while in generalization it greatly outperforms the information gain approach proposed by Seeger et al., especially in the quality of predictive distributions.


Generalization Error Bounds for Aggregation by Mirror Descent with Averaging

Neural Information Processing Systems

For this purpose, we propose a stochastic procedure, the mirror descent, which performs gradient descent in the dual space. The generated estimates are additionally averaged in a recursive fashion with specific weights. Mirror descent algorithms have been developed in different contexts and they are known to be particularly efficient in high dimensional problems. Moreover, their implementation is adapted to the online setting. The main result of the paper is the upper bound on the convergence rate for the generalization error.
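
As a rough illustration of "gradient descent in the dual space" with recursive averaging, the following minimal sketch runs entropic mirror descent over the probability simplex. It is not the procedure analyzed in the paper: the function name `mirror_descent_averaged`, the fixed step size, the uniform averaging weights, and the toy linear loss are all assumptions made for this example. A noisy (stochastic) gradient function could be passed in place of the deterministic one.

```python
import numpy as np

def mirror_descent_averaged(grad, dim, steps, step_size=0.5):
    """Entropic mirror descent on the simplex with a running average.

    grad: callable returning a (possibly stochastic) gradient at theta.
    The step size and uniform averaging weights are illustrative choices.
    """
    zeta = np.zeros(dim)                 # dual variable
    theta = np.full(dim, 1.0 / dim)      # primal iterate on the simplex
    average = np.zeros(dim)
    for t in range(1, steps + 1):
        zeta -= step_size * grad(theta)  # gradient step in the dual space
        weights = np.exp(zeta - zeta.max())
        theta = weights / weights.sum()  # map back to the simplex (softmax)
        average += (theta - average) / t # recursive average of the iterates
    return average

# Toy problem (assumed): linear loss <costs, theta>, minimized at the
# simplex vertex with the smallest cost.
costs = np.array([1.0, 0.2, 0.7])
estimate = mirror_descent_averaged(lambda theta: costs, dim=3, steps=500)
print(estimate)
```

The averaged estimate stays on the simplex and concentrates on the lowest-cost coordinate; the early, less concentrated iterates keep it slightly away from the exact vertex.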


Integrate-and-Fire models with adaptation are good enough

Neural Information Processing Systems

Integrate-and-Fire-type models are usually criticized because of their simplicity. On the other hand, the Integrate-and-Fire model is the basis of most of the theoretical studies on spiking neuron models. Here, we develop a sequential procedure to quantitatively evaluate an equivalent Integrate-and-Fire-type model based on intracellular recordings of cortical pyramidal neurons. We find that the resulting effective model is sufficient to predict the spike train of the real pyramidal neuron with high accuracy. In in vivo-like regimes, predicted and recorded traces are almost indistinguishable and a significant part of the spikes can be predicted at the correct timing. Slow processes like spike-frequency adaptation are shown to be a key feature in this context since they are necessary for the model to connect between different driving regimes.
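
To make the role of spike-frequency adaptation concrete, here is a minimal sketch of a generic leaky Integrate-and-Fire neuron with a spike-triggered adaptation variable, integrated with the Euler method. It is not the model fitted in the paper: the function name `simulate_adaptive_lif`, the dimensionless units, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_adaptive_lif(current, dt=0.1, tau_m=10.0, tau_w=100.0,
                          b=0.2, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron with an adaptation variable w.

    All parameters are illustrative (dimensionless voltage, time in ms).
    Returns the spike times in ms.
    """
    v, w = v_reset, 0.0
    spike_times = []
    for step, i_in in enumerate(current):
        v += dt * (-v + i_in - w) / tau_m  # leaky integration of the input
        w += dt * (-w) / tau_w             # adaptation decays slowly
        if v >= v_thresh:
            spike_times.append(step * dt)
            v = v_reset                    # reset the membrane potential
            w += b                         # each spike increases adaptation
    return np.array(spike_times)

# Constant drive for 500 ms: the inter-spike intervals lengthen over time.
spikes = simulate_adaptive_lif(np.full(5000, 2.0))
isis = np.diff(spikes)
print(isis[0], isis[-1])
```

Under a constant input the adaptation variable builds up with every spike, so later inter-spike intervals are longer than the first ones. This is the slow process the abstract refers to.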


Efficient Estimation of OOMs

Neural Information Processing Systems

A standard method to obtain stochastic models for symbolic time series is to train state-emitting hidden Markov models (SE-HMMs) with the Baum-Welch algorithm. Based on observable operator models (OOMs), in the last few months a number of novel learning algorithms for similar purposes have been developed: (1,2) two versions of an "efficiency sharpening" (ES) algorithm, which iteratively improves the statistical efficiency of a sequence of OOM estimators, (3) a constrained gradient descent ML estimator for transition-emitting HMMs (TE-HMMs). We give an overview of these algorithms and compare them with SE-HMM/EM learning on synthetic and real-life data.


Learning Cue-Invariant Visual Responses

Neural Information Processing Systems

Multiple visual cues are used by the visual system to analyze a scene; achromatic cues include luminance, texture, contrast and motion. Single-cell recordings have shown that the mammalian visual cortex contains neurons that respond similarly to scene structure (e.g., orientation of a boundary), regardless of the cue type conveying this information. This paper shows that cue-invariant response properties of simple- and complex-type cells can be learned from natural image data in an unsupervised manner. In order to do this, we also extend a previous conceptual model of cue invariance so that it can be applied to model simple- and complex-cell responses. Our results relate cue-invariant response properties to natural image statistics, thereby showing how the statistical modeling approach can be used to model processing beyond the elemental response properties of visual neurons. This work also demonstrates how to learn, from natural image data, more sophisticated feature detectors than those based on changes in mean luminance, thereby paving the way for new data-driven approaches to image processing and computer vision.


A Probabilistic Interpretation of SVMs with an Application to Unbalanced Classification

Neural Information Processing Systems

In this paper, we show that the hinge loss can be interpreted as the neg-log-likelihood of a semi-parametric model of posterior probabilities. From this point of view, SVMs represent the parametric component of a semi-parametric model fitted by a maximum a posteriori estimation procedure. This connection makes it possible to derive a mapping from SVM scores to estimated posterior probabilities. Unlike previous proposals, the suggested mapping is interval-valued, providing a set of posterior probabilities compatible with each SVM score. This framework offers a new way to adapt the SVM optimization problem to unbalanced classification, when decisions result in unequal (asymmetric) losses. Experiments show improvements over state-of-the-art procedures.
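
For readers unfamiliar with the hinge loss, the sketch below computes it with per-class weights, a common generic way to express asymmetric losses. It is not the adaptation proposed in the paper, and it does not reproduce the interval-valued probability mapping. The function name `weighted_hinge_loss`, the weights, and the example scores are assumptions.

```python
import numpy as np

def weighted_hinge_loss(scores, labels, pos_weight=1.0, neg_weight=1.0):
    """Mean hinge loss max(0, 1 - y * f(x)) with class-dependent weights.

    labels must be +1 or -1; the weights are illustrative and stand in
    for unequal misclassification costs.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    losses = np.maximum(0.0, 1.0 - labels * scores)
    weights = np.where(labels > 0, pos_weight, neg_weight)
    return float(np.mean(weights * losses))

scores = [2.0, 0.5, -0.3, -1.5]   # hypothetical SVM scores f(x)
labels = [1, 1, -1, -1]
plain = weighted_hinge_loss(scores, labels)
costly_positives = weighted_hinge_loss(scores, labels, pos_weight=3.0)
print(plain, costly_positives)
```

Raising `pos_weight` penalizes margin violations on the positive class more heavily, which is the kind of asymmetry that unbalanced classification calls for.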


Pattern Recognition from One Example by Chopping

Neural Information Processing Systems

We investigate the learning of the appearance of an object from a single image of it. Instead of using a large number of pictures of the object to recognize, we use a labeled reference database of pictures of other objects to learn invariance to noise and variations in pose and illumination. This acquired knowledge is then used to predict if two pictures of new objects, which do not appear in the training pictures, actually display the same object. We propose a generic scheme called chopping to address this task. It relies on hundreds of random binary splits of the training set chosen to keep together the images of any given object. Those splits are extended to the complete image space with a simple learning algorithm. Given two images, the responses of the split predictors are combined with a Bayesian rule into a posterior probability of similarity.
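
The steps described above can be sketched on synthetic data as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: "images" are noisy random vectors produced by a hypothetical `render` helper, the simple learning algorithm is replaced by linear least squares, and the Bayesian combination rule is replaced by the plain fraction of split predictors that agree on the two images.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_train_objects, images_per_object, n_splits = 20, 40, 10, 200
noise = 0.1

def render(prototype, count):
    """Hypothetical image generator: noisy copies of an object prototype."""
    return prototype + noise * rng.standard_normal((count, prototype.size))

# Labeled reference database of objects other than the test objects.
prototypes = rng.standard_normal((n_train_objects, dim))
X = np.vstack([render(p, images_per_object) for p in prototypes])
object_id = np.repeat(np.arange(n_train_objects), images_per_object)

# Each random binary split puts every object on one side (+1 or -1),
# so all images of a given object stay together.
split_sides = rng.choice([-1.0, 1.0], size=(n_splits, n_train_objects))
Y = split_sides[:, object_id].T               # shape (n_images, n_splits)

# Extend every split to the whole image space with a linear predictor
# (least squares here, as a stand-in for the simple learning algorithm).
W = np.linalg.lstsq(X, Y, rcond=None)[0]      # shape (dim, n_splits)

def agreement(image_a, image_b):
    """Fraction of split predictors placing both images on the same side."""
    return float(np.mean(np.sign(image_a @ W) == np.sign(image_b @ W)))

# Two new objects that never appeared in the reference database.
new_a, new_b = rng.standard_normal(dim), rng.standard_normal(dim)
a1, a2 = render(new_a, 2)
b1 = render(new_b, 1)[0]
same_score = agreement(a1, a2)
diff_score = agreement(a1, b1)
print(same_score, diff_score)
```

Two images of the same new object fall on the same side of most splits, while images of different objects agree only about as often as chance. A probabilistic combination rule, as in the abstract, would turn these responses into a posterior probability of similarity.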