Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery

Neural Information Processing Systems

In this paper we consider the problem of finding sets of points that conform to a given underlying model from within a dense, noisy set of observations. This problem is motivated by the task of efficiently linking faint asteroid detections, but is applicable to a range of spatial queries. We survey current tree-based approaches, showing that a tradeoff exists between single-tree and multiple-tree algorithms. To this end, we present a new type of multiple-tree algorithm that uses a variable number of trees to exploit the advantages of both approaches. We empirically show that this algorithm performs well using both simulated and astronomical data.


Generalization in Clustering with Unobserved Features

Neural Information Processing Systems

We argue that when objects are characterized by many attributes, clustering them on the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove finite-sample generalization theorems for this novel learning scheme that extend analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users with movie ratings as attributes.


Measuring Shared Information and Coordinated Activity in Neuronal Networks

Neural Information Processing Systems

This activity often manifests itself as dynamically coordinated sequences of action potentials. Since multiple electrode recordings are now a standard tool in neuroscience research, it is important to have a measure of such network-wide behavioral coordination and information sharing, applicable to multiple neural spike train data. We propose a new statistic, informational coherence, which measures how much better one unit can be predicted by knowing the dynamical state of another. We argue informational coherence is a measure of association and shared information which is superior to traditional pairwise measures of synchronization and correlation. To find the dynamical states, we use a recently introduced algorithm which reconstructs effective state spaces from stochastic time series.
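As a rough illustration of "how much better one unit can be predicted by knowing the dynamical state of another", the sketch below computes the mutual information between two discrete state sequences and normalizes it by the smaller of the two entropies. This normalization is an assumption made for the example, not something stated in the abstract, and the function names (`entropy`, `mutual_information`, `normalized_mi`) are hypothetical. The state sequences are taken as given; reconstructing them from spike trains is not shown.

```python
import math
from collections import Counter


def entropy(seq):
    """Empirical Shannon entropy (in bits) of a discrete sequence."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two aligned sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def normalized_mi(xs, ys):
    """Mutual information scaled to [0, 1] by the smaller entropy.

    Assumption for this sketch: the normalizer is min(H[X], H[Y]).
    Returns 0.0 when either sequence is constant (zero entropy).
    """
    h = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / h if h > 0 else 0.0
```

With identical state sequences the score is 1, and with sequences whose joint counts factorize exactly it is 0; real recordings would fall in between and would need enough samples for the empirical estimates to be meaningful.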



Benchmarking Non-Parametric Statistical Tests

Neural Information Processing Systems

Although nonparametric tests have already been proposed for that purpose, statistical significance tests for nonstandard measures (different from the classification error) are less often used in the literature. This paper is an attempt at empirically verifying how these tests compare with more classical tests, under various conditions. More precisely, using a very large dataset to estimate the whole "population", we analyzed the behavior of several statistical tests, varying the class unbalance, the compared models, the performance measure, and the sample size. The main result is that, provided the evaluation sets are big enough, nonparametric tests are relatively reliable in all conditions.


A matching pursuit approach to sparse Gaussian process regression

Neural Information Processing Systems

In this paper we propose a new basis selection criterion for building sparse GP regression models that provides promising gains in accuracy as well as efficiency over previous methods. Our algorithm is much faster than that of Smola and Bartlett, while in generalization it greatly outperforms the information gain approach proposed by Seeger et al., especially in the quality of predictive distributions.


Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification

Neural Information Processing Systems

There have been many graph-based approaches for semi-supervised classification. One problem is that of hyperparameter learning: performance depends greatly on the hyperparameters of the similarity graph, the transformation of the graph Laplacian, and the noise model. We present a Bayesian framework for learning hyperparameters for graph-based semi-supervised classification. Given some labeled data, which can contain inaccurate labels, we pose the semi-supervised classification as an inference problem over the unknown labels. Expectation Propagation is used for approximate inference and the mean of the posterior is used for classification. The hyperparameters are learned using EM for evidence maximization. We also show that the posterior mean can be written in terms of the kernel matrix, providing a Bayesian classifier to classify new points. Tests on synthetic and real datasets show cases where there are significant improvements in performance over the existing approaches.


Worst-Case Bounds for Gaussian Process Models

Neural Information Processing Systems

Dean P. Foster, University of Pennsylvania. We present a competitive analysis of some nonparametric Bayesian algorithms in a worst-case online learning setting, where no probabilistic assumptions about the generation of the data are made. We consider models which use a Gaussian process prior (over the space of all functions) and provide bounds on the regret (under the log loss) for commonly used nonparametric Bayesian algorithms, including Gaussian regression and logistic regression, which show how these algorithms can perform favorably under rather general conditions.


From Batch to Transductive Online Learning

Neural Information Processing Systems

It is well known that everything that is learnable in the difficult online setting, where an arbitrary sequence of examples must be labeled one at a time, is also learnable in the batch setting, where examples are drawn independently from a distribution. We show a result in the opposite direction. We give an efficient conversion algorithm from batch to online that is transductive: it uses future unlabeled data. This demonstrates the equivalence between what is properly and efficiently learnable in a batch model and a transductive online model.


Generalization Error Bounds for Aggregation by Mirror Descent with Averaging

Neural Information Processing Systems

For this purpose, we propose a stochastic procedure, the mirror descent, which performs gradient descent in the dual space. The generated estimates are additionally averaged in a recursive fashion with specific weights. Mirror descent algorithms have been developed in different contexts and they are known to be particularly efficient in high-dimensional problems. Moreover, their implementation is adapted to the online setting. The main result of the paper is the upper bound on the convergence rate for the generalization error.
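The idea of taking gradient steps in the dual space and averaging the resulting estimates can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure analyzed in the paper: it assumes the aggregation weights live on the probability simplex, uses an entropic (softmax) mirror map with a fixed temperature `beta`, a squared loss, and a plain uniform running average in place of the paper's specific weights. The function names are hypothetical.

```python
import math


def dual_to_simplex(z, beta):
    """Map a dual vector z to the simplex: theta_j proportional to exp(-z_j / beta)."""
    m = min(z)  # shift so every exponent is <= 0 (numerical stability)
    w = [math.exp(-(zj - m) / beta) for zj in z]
    s = sum(w)
    return [wj / s for wj in w]


def mirror_descent_averaging(samples, n_predictors, beta=1.0):
    """Online mirror descent with a uniform running average of the iterates.

    samples: iterable of (h, y), where h is the list of base-predictor
             outputs for one example and y is the target.
    Assumptions of this sketch: squared loss, fixed beta, uniform averaging.
    """
    z = [0.0] * n_predictors          # dual variable: accumulated gradients
    theta_avg = [0.0] * n_predictors  # recursively averaged estimate
    for t, (h, y) in enumerate(samples, start=1):
        theta = dual_to_simplex(z, beta)
        # Recursive average: avg_t = avg_{t-1} + (theta_t - avg_{t-1}) / t
        theta_avg = [a + (th - a) / t for a, th in zip(theta_avg, theta)]
        # Gradient of (prediction - y)^2 with respect to theta
        pred = sum(th * hj for th, hj in zip(theta, h))
        grad = [2.0 * (pred - y) * hj for hj in h]
        # Gradient step taken in the dual space
        z = [zj + gj for zj, gj in zip(z, grad)]
    return theta_avg
```

Each example is processed once and only the dual vector and the running average are stored, which is what makes a procedure of this kind a natural fit for the online setting. On a toy stream where the first base predictor always equals the target, the averaged weights concentrate on that predictor.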