 Statistical Learning



Active Learning For Identifying Function Threshold Boundaries

Neural Information Processing Systems

We present an efficient algorithm to actively select queries for learning the boundaries separating a function domain into regions where the function is above and below a given threshold. We develop experiment selection methods based on entropy, misclassification rate, variance, and their combinations, and show how they perform on a number of data sets. We then show how these algorithms are used to determine simultaneously valid 1 − α confidence intervals for seven cosmological parameters. Experimentation shows that the algorithm reduces the computation necessary for the parameter estimation problem by an order of magnitude.
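As a rough illustration of entropy-based query selection, the sketch below scores each candidate point by the entropy of the event "the function exceeds the threshold" and queries the most uncertain one. It assumes a Gaussian predictive distribution at each candidate; that model and the names `prob_above`, `binary_entropy`, and `select_query` are hypothetical and are not taken from the paper.

```python
import math


def prob_above(mean, std, threshold):
    """P(f(x) > threshold) under an assumed Gaussian predictive N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((mean - threshold) / (std * math.sqrt(2.0))))


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_query(candidates, threshold):
    """Return the x whose above/below classification is most uncertain.

    candidates: iterable of (x, predicted_mean, predicted_std) with std > 0.
    """
    best = max(
        candidates,
        key=lambda c: binary_entropy(prob_above(c[1], c[2], threshold)),
    )
    return best[0]


# Hypothetical predictions at three candidate points.
candidates = [(0.0, 2.0, 0.5), (1.0, 1.1, 0.5), (2.0, -1.0, 0.5)]
next_x = select_query(candidates, threshold=1.0)
```

With these made-up predictions, the point whose predicted mean lies closest to the threshold relative to its uncertainty is selected.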



A General and Efficient Multiple Kernel Learning Algorithm

Neural Information Processing Systems

While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm helps with automatic model selection, improves the interpretability of the learning result, and works for hundreds of thousands of examples or hundreds of kernels to be combined.
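A conic combination of kernel matrices is a weighted sum with non-negative weights. The sketch below only builds such a combination for fixed, hand-picked weights; it does not learn the weights, which is the task the algorithm above addresses. The kernels, the toy points, and all function names are illustrative assumptions.

```python
import math


def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b))


def rbf_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def gram_matrix(kernel, points):
    """Kernel matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def combine_kernels(gram_matrices, weights):
    """Conic combination: sum_k weights[k] * K_k with all weights >= 0."""
    if len(gram_matrices) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("conic combinations require non-negative weights")
    n = len(gram_matrices[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]


# Toy one-dimensional points and hand-picked weights (assumptions).
points = [[0.0], [1.0]]
combined = combine_kernels(
    [gram_matrix(linear_kernel, points), gram_matrix(rbf_kernel, points)],
    weights=[0.5, 2.0],
)
```

Because each weight is non-negative and each input matrix is positive semidefinite, the combined matrix is again a valid kernel matrix.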


Correcting sample selection bias in maximum entropy density estimation

Neural Information Processing Systems

We study the problem of maximum entropy density estimation in the presence of known sample selection bias. We propose three bias correction approaches. The first one takes advantage of unbiased sufficient statistics which can be obtained from biased samples. The second one estimates the biased distribution and then factors the bias out. The third one approximates the second by only using samples from the sampling distribution. We provide guarantees for the first two approaches and evaluate the performance of all three approaches in synthetic experiments and on real data from species habitat modeling, where maxent has been successfully applied and where sample selection bias is a significant problem.


Noise and the two-thirds power law

Neural Information Processing Systems

The two-thirds power law, an empirical law stating an inverse nonlinear relationship between the tangential hand speed and the curvature of its trajectory during curved motion, is widely acknowledged to be an invariant of upper-limb movement. It has also been shown to exist in eye motion and locomotion, and was even demonstrated in motion perception and prediction. This ubiquity has fostered various attempts to uncover the origins of this empirical relationship. In these it was generally attributed either to smoothness in hand- or joint-space or to the result of mechanisms that damp noise inherent in the motor system to produce the smooth trajectories evident in healthy human motion. We show here that white Gaussian noise also obeys this power-law. Analysis of signal and noise combinations shows that trajectories that were synthetically created not to comply with the power-law are transformed to power-law compliant ones after combination with low levels of noise. Furthermore, there exist colored noise types that drive non-power-law trajectories to power-law compliance and are not affected by smoothing. These results suggest caution when running experiments aimed at verifying the power-law or assuming its underlying existence without proper analysis of the noise. Our results could also suggest that the power-law might be derived not from smoothness or smoothness-inducing mechanisms operating on the noise inherent in our motor system but rather from the correlated noise which is inherent in this motor system.
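The law is commonly written as v = K κ^(-1/3), where v is tangential speed and κ is curvature; equivalently, angular velocity is proportional to curvature raised to the power 2/3. The sketch below shows one way such an exponent might be estimated from a sampled planar trajectory, using finite differences and a log-log least-squares fit. The function name and the elliptical test trajectory are illustrative assumptions, not the analysis used in the paper.

```python
import math


def fit_power_law_exponent(xs, ys, dt):
    """Slope of log(speed) against log(curvature) for a sampled planar path."""
    log_curv, log_speed = [], []
    for i in range(1, len(xs) - 1):
        # Central finite differences for velocity and acceleration.
        dx = (xs[i + 1] - xs[i - 1]) / (2.0 * dt)
        dy = (ys[i + 1] - ys[i - 1]) / (2.0 * dt)
        ddx = (xs[i + 1] - 2.0 * xs[i] + xs[i - 1]) / dt ** 2
        ddy = (ys[i + 1] - 2.0 * ys[i] + ys[i - 1]) / dt ** 2
        speed = math.hypot(dx, dy)
        if speed == 0.0:
            continue
        curvature = abs(dx * ddy - dy * ddx) / speed ** 3
        if curvature == 0.0:
            continue
        log_curv.append(math.log(curvature))
        log_speed.append(math.log(speed))
    mean_c = sum(log_curv) / len(log_curv)
    mean_s = sum(log_speed) / len(log_speed)
    cov = sum((c - mean_c) * (s - mean_s) for c, s in zip(log_curv, log_speed))
    var = sum((c - mean_c) ** 2 for c in log_curv)
    return cov / var


# An ellipse traced as (a cos t, b sin t) satisfies the law exactly,
# because its speed equals (a * b / curvature) ** (1 / 3).
n = 2000
dt = 2.0 * math.pi / n
xs = [2.0 * math.cos(i * dt) for i in range(n)]
ys = [1.0 * math.sin(i * dt) for i in range(n)]
exponent = fit_power_law_exponent(xs, ys, dt)  # close to -1/3
```

Running the same fit on recorded trajectories with and without added noise is one way to probe the caution raised above, since a fitted exponent near -1/3 does not by itself identify its origin.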



Nested sampling for Potts models

Neural Information Processing Systems

Nested sampling is a new Monte Carlo method by Skilling [1] intended for general Bayesian computation. Nested sampling provides a robust alternative to annealing-based methods for computing normalizing constants. It can also generate estimates of other quantities such as posterior expectations. The key technical requirement is an ability to draw samples uniformly from the prior subject to a constraint on the likelihood. We provide a demonstration with the Potts model, an undirected graphical model.
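To make the key requirement concrete, here is a minimal sketch of nested sampling on a toy one-dimensional problem, not on a Potts model. It assumes a uniform prior on [0, 1] and a Gaussian-shaped likelihood, and it meets the constrained-sampling requirement by plain rejection from the prior, which is only practical for toy problems. All names and settings are illustrative.

```python
import math
import random


def nested_sampling_evidence(log_likelihood, sample_prior,
                             num_live=200, num_iter=1000, seed=0):
    """Estimate Z = integral of likelihood * prior with basic nested sampling."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(num_live)]
    live_logl = [log_likelihood(p) for p in live]
    evidence = 0.0
    prev_mass = 1.0  # prior mass still enclosed by the likelihood contour
    for i in range(1, num_iter + 1):
        worst = min(range(num_live), key=live_logl.__getitem__)
        threshold = live_logl[worst]
        mass = math.exp(-i / num_live)  # expected shrinkage per iteration
        evidence += math.exp(threshold) * (prev_mass - mass)
        prev_mass = mass
        # Draw from the prior subject to likelihood > threshold (rejection).
        while True:
            candidate = sample_prior(rng)
            cand_logl = log_likelihood(candidate)
            if cand_logl > threshold:
                break
        live[worst] = candidate
        live_logl[worst] = cand_logl
    # Add the contribution of the remaining live points.
    evidence += prev_mass * sum(math.exp(l) for l in live_logl) / num_live
    return evidence


def toy_log_likelihood(theta, sigma=0.1):
    return -((theta - 0.5) ** 2) / (2.0 * sigma ** 2)


def uniform_prior(rng):
    return rng.random()


estimate = nested_sampling_evidence(toy_log_likelihood, uniform_prior)
exact = 0.1 * math.sqrt(2.0 * math.pi)  # analytic value, tails negligible
```

The estimate is stochastic and should land near the analytic value; for models such as the Potts model, the rejection step would have to be replaced by a sampler that moves within the likelihood constraint.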



Infinite latent feature models and the Indian buffet process

Neural Information Processing Systems

We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution is suitable for use as a prior in probabilistic models that represent objects using a potentially infinite array of features. We identify a simple generative process that results in the same distribution over equivalence classes, which we call the Indian buffet process. We illustrate the use of this distribution as a prior in an infinite latent feature model, deriving a Markov chain Monte Carlo algorithm for inference in this model and applying the algorithm to an image dataset.
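The generative process is usually described with customers and dishes: the first customer samples Poisson(α) dishes, and customer i takes each previously sampled dish k with probability m_k / i, where m_k is the number of earlier customers who took it, and then samples Poisson(α / i) new dishes. The sketch below follows that standard description; the function names, the simple Poisson sampler, and the parameter values are illustrative assumptions.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small rates."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_objects, alpha, seed=0):
    """Draw a binary feature matrix (list of rows) from the buffet process."""
    rng = random.Random(seed)
    dish_counts = []  # m_k: how many customers have taken dish k so far
    rows = []
    for i in range(1, num_objects + 1):
        # Revisit existing dishes in proportion to their popularity.
        row = {k for k, m_k in enumerate(dish_counts) if rng.random() < m_k / i}
        for k in row:
            dish_counts[k] += 1
        # Then try a Poisson(alpha / i) number of brand-new dishes.
        for _ in range(sample_poisson(alpha / i, rng)):
            dish_counts.append(1)
            row.add(len(dish_counts) - 1)
        rows.append(row)
    num_features = len(dish_counts)
    return [[1 if k in row else 0 for k in range(num_features)] for row in rows]


matrix = sample_ibp(num_objects=10, alpha=2.0, seed=1)
```

Each row is an object and each column a feature; the number of columns is random and grows slowly with the number of objects, which is what makes the distribution usable as a prior over a potentially unbounded feature set.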