Goto

Collaborating Authors

 Country


Generalization error bounds for classifiers trained with interdependent data

Neural Information Processing Systems

In this paper we propose a general framework to study the generalization properties of binary classifiers trained with data which may be dependent, butare deterministically generated upon a sample of independent examples. It provides generalization bounds for binary classification and some cases of ranking problems, and clarifies the relationship between these learning tasks.




Active Learning for Misspecified Models

Neural Information Processing Systems

Active learning is the problem in supervised learning to design the locations oftraining input points so that the generalization error is minimized. Existing active learning methods often assume that the model used for learning is correctly specified, i.e., the learning target function can be expressed bythe model at hand. In many practical situations, however, this assumption may not be fulfilled. In this paper, we first show that the existing activelearning method can be theoretically justified under slightly weaker condition: the model does not have to be correctly specified, but slightly misspecified models are also allowed. However, it turns out that the weakened condition is still restrictive in practice.


Describing Visual Scenes using Transformed Dirichlet Processes

Neural Information Processing Systems

Motivated by the problem of learning to detect and recognize objects with minimal supervision, we develop a hierarchical probabilistic model for the spatial structure of visual scenes. In contrast with most existing models, our approach explicitly captures uncertainty in the number of object instances depicted in a given image. Our scene model is based on the transformed Dirichlet process (TDP), a novel extension of the hierarchical DPin which a set of stochastically transformed mixture components are shared between multiple groups of data. For visual scenes, mixture components describe the spatial structure of visual features in an object-centered coordinate frame, while transformations model the object positionsin a particular image. Learning and inference in the TDP, which has many potential applications beyond computer vision, is based on an empirically effective Gibbs sampler. Applied to a dataset of partially labeledstreet scenes, we show that the TDP's inclusion of spatial structure improves detection performance, flexibly exploiting partially labeled training images.


Sensory Adaptation within a Bayesian Framework for Perception

Neural Information Processing Systems

We extend a previously developed Bayesian framework for perception to account for sensory adaptation. We first note that the perceptual effects ofadaptation seems inconsistent with an adjustment of the internally represented prior distribution. Instead, we postulate that adaptation increases the signal-to-noise ratio of the measurements by adapting the operational range of the measurement stage to the input range. We show that this changes the likelihood function in such a way that the Bayesian estimator model can account for reported perceptual behavior. In particular, wecompare the model's predictions to human motion discrimination data and demonstrate that the model accounts for the commonly observed perceptual adaptation effects of repulsion and enhanced discriminability.


Prediction and Change Detection

Neural Information Processing Systems

We measure the ability of human observers to predict the next datum in a sequence that is generated by a simple statistical process undergoing change at random points in time. Accurate performance in this task requires the identification of changepoints. We assess individual differences between observers both empirically, and using two kinds of models: a Bayesian approach for change detection and a family of cognitively plausible fast and frugal models. Some individuals detect too many changes and hence perform sub-optimally due to excess variability. Other individuals do not detect enough changes, and perform sub-optimally because they fail to notice short-term temporal trends.



Selecting Landmark Points for Sparse Manifold Learning

Neural Information Processing Systems

There has been a surge of interest in learning nonlinear manifold models to approximate high-dimensional data. Both for computational complexity reasonsand for generalization capability, sparsity is a desired feature in such models. This usually means dimensionality reduction, which naturally implies estimating the intrinsic dimension, but it can also mean selecting a subset of the data to use as landmarks, which is especially important becausemany existing algorithms have quadratic complexity in the number of observations.


Learning Minimum Volume Sets

Neural Information Processing Systems

Given a probability measure P and a reference measure µ, one is often interested in the minimum µ-measure set with P-measure at least α. Minimum volume sets of this type summarize the regions of greatest probability mass of P, and are useful for detecting anomalies andconstructing confidence regions. This paper addresses the problem of estimating minimum volume sets based on independent samples distributed according to P. Other than these samples, no other information is available regarding P, but the reference measure µis assumed to be known. We introduce rules for estimating minimum volume sets that parallel the empirical risk minimization and structural risk minimization principles in classification. As in classification, we show that the performances of our estimators are controlled by the rate of uniform convergence of empirical to true probabilities over the class from which the estimator is drawn. Thus we obtain finite sample size performance bounds in terms of VC dimension and related quantities. We also demonstrate strong universal consistency and an oracle inequality. Estimators based on histograms and dyadic partitions illustrate the proposed rules.