Goto

Collaborating Authors

 Country


Learning Time-Intensity Profiles of Human Activity using Non-Parametric Bayesian Models

Neural Information Processing Systems

Data sets that characterize human activity over time through collections of timestamped events or counts are of increasing interest in application areas as humancomputer interaction, video surveillance, and Web data analysis. We propose a nonparametric Bayesian framework for modeling collections of such data. In particular, we use a Dirichlet process framework for learning a set of intensity functions corresponding to different categories, which form a basis set for representing individual time-periods (e.g., several days) depending on which categories the time-periods are assigned to. This allows the model to learn in a data-driven fashion what "factors" are generating the observations on a particular day, including (for example) weekday versus weekend effects or day-specific effects corresponding to unique (single-day) occurrences of unusual behavior, sharing information where appropriate to obtain improved estimates of the behavior associated with each category. Applications to real-world data sets of count data involving both vehicles and people are used to illustrate the technique.


Differential Entropic Clustering of Multivariate Gaussians

Neural Information Processing Systems

Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application.


Aggregating Classification Accuracy across Time: Application to Single Trial EEG

Neural Information Processing Systems

We present a method for binary online classification of triggered but temporally blurred events that are embedded in noisy time series in the context of online discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study.


A Nonparametric Bayesian Method for Inferring Features From Similarity Judgments

Neural Information Processing Systems

The additive clustering model is widely used to infer the features of a set of stimuli from their similarities, on the assumption that similarity is a weighted linear function of common features. This paper develops a fully Bayesian formulation of the additive clustering model, using methods from nonparametric Bayesian statistics to allow the number of features to vary. We use this to explore several approaches to parameter estimation, showing that the nonparametric Bayesian approach provides a straightforward way to obtain estimates of both the number of features used in producing similarity judgments and their importance.



Near-Uniform Sampling of Combinatorial Spaces Using XOR Constraints

Neural Information Processing Systems

We propose a new technique for sampling the solutions of combinatorial problems in a near-uniform manner. We focus on problems specified as a Boolean formula, i.e., on SAT instances. Sampling for SAT problems has been shown to have interesting connections with probabilistic reasoning, making practical sampling algorithms for SAT highly desirable. The best current approaches are based on Markov Chain Monte Carlo methods, which have some practical limitations.


A PAC-Bayes Risk Bound for General Loss Functions

Neural Information Processing Systems

We provide a PAC-Bayesian bound for the expected loss of convex combinations of classifiers under a wide class of loss functions (which includes the exponential loss and the logistic loss). Our numerical experiments with Adaboost indicate that the proposed upper bound, computed on the training set, behaves very similarly as the true loss estimated on the testing set.



Learnability and the doubling dimension

Neural Information Processing Systems

We prove bounds on the sample complexity of PAC learning in terms of the doubling dimension of this metric. These bounds imply known bounds on the sample complexity of learning halfspaces with respect to the uniform distribution that are optimal up to a constant factor.


Manifold Denoising

Neural Information Processing Systems

The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with nontrivial high-dimensional noise. Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm.