Goto

Collaborating Authors

 Asia


Learning Supervised Topic Models from Crowds

AAAI Conferences

The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this paper, we propose a supervised topic model that accounts for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state of the art approaches.


Identifying and Accounting for Task-Dependent Bias in Crowdsourcing

AAAI Conferences

Models for aggregating contributions by crowd workers have been shown to be challenged by the rise of task-specific biases or errors. Task-dependent errors in assessment may shift the majority opinion of even large numbers of workers to an incorrect answer. We introduce and evaluate probabilistic models that can detect and correct task-dependent bias automatically. First, we show how to build and use probabilistic graphical models for jointly modeling task features, workers' biases, worker contributions and ground truth answers of tasks so that task-dependent bias can be corrected. Second, we show how the approach can perform a type of transfer learning among workers to address the issue of annotation sparsity. We evaluate the models with varying complexity on a large data set collected from a citizen science project and show that the models are effective at correcting the task-dependent worker bias. Finally, we investigate the use of active learning to guide the acquisition of expert assessments to enable automatic detection and correction of worker bias.


PISCES: Participatory Incentive Strategies for Effective Community Engagement in Smart Cities

AAAI Conferences

A key challenge in participatory sensing systems has been the design of incentive mechanisms that motivate individuals to contribute data to consuming applications. Emerging trends in urban development and smart city planning indicate the use of citizen reports to gather insights and identify areas for transformation. Consumers of these reports (e.g. city agencies) typically associate non-uniform utility (or values) to different reports based on the spatio-temporal context of the reports. For example, a report indicating traffic congestion near an airport, in early morning hours, would tend to have much higher utility than a similar report from a sparse residential area. In such cases, the design of an incentive mechanism must motivate participants, via appropriate rewards (or payments), to provide higher utility reports when compared to less valued ones. The main challenge in designing such an incentive scheme is two-fold: (i) lack of prior knowledge of participants in terms of their availability (i.e. who are in the vicinity) and reporting behaviour (i.e. what are the rewards expected); and (ii) minimizing payments to the reporters while ensuring that the desired number of reports are collected. In this paper, we propose STOC-PISCES, an algorithm that guarantees a stochastic optimal solution in the generalized setting of an unknown set of participants, with non-deterministic availabilities and stochastically rational reporting behaviour. The superior performance of STOC-PISCES in experimental settings, based on real-world data, endorses its adoption as an incentive strategy in participatory sensing applications like smart city management.


Sparse Variational Bayesian Approximations for Nonlinear Inverse Problems: applications in nonlinear elastography

arXiv.org Machine Learning

This paper presents an efficient Bayesian framework for solving nonlinear, high-dimensional model calibration problems. It is based on a Variational Bayesian formulation that aims at approximating the exact posterior by means of solving an optimization problem over an appropriately selected family of distributions. The goal is two-fold. Firstly, to find lower-dimensional representations of the unknown parameter vector that capture as much as possible of the associated posterior density, and secondly to enable the computation of the approximate posterior density with as few forward calls as possible. We discuss how these objectives can be achieved by using a fully Bayesian argumentation and employing the marginal likelihood or evidence as the ultimate model validation metric for any proposed dimensionality reduction. We demonstrate the performance of the proposed methodology for problems in nonlinear elastography where the identification of the mechanical properties of biological materials can inform non-invasive, medical diagnosis. An Importance Sampling scheme is finally employed in order to validate the results and assess the efficacy of the approximations provided.


A Unified Framework for Representation-based Subspace Clustering of Out-of-sample and Large-scale Data

arXiv.org Machine Learning

Under the framework of spectral clustering, the key of subspace clustering is building a similarity graph which describes the neighborhood relations among data points. Some recent works build the graph using sparse, low-rank, and $\ell_2$-norm-based representation, and have achieved state-of-the-art performance. However, these methods have suffered from the following two limitations. First, the time complexities of these methods are at least proportional to the cube of the data size, which make those methods inefficient for solving large-scale problems. Second, they cannot cope with out-of-sample data that are not used to construct the similarity graph. To cluster each out-of-sample datum, the methods have to recalculate the similarity graph and the cluster membership of the whole data set. In this paper, we propose a unified framework which makes representation-based subspace clustering algorithms feasible to cluster both out-of-sample and large-scale data. Under our framework, the large-scale problem is tackled by converting it as out-of-sample problem in the manner of "sampling, clustering, coding, and classifying". Furthermore, we give an estimation for the error bounds by treating each subspace as a point in a hyperspace. Extensive experimental results on various benchmark data sets show that our methods outperform several recently-proposed scalable methods in clustering large-scale data set.


Filtering with State-Observation Examples via Kernel Monte Carlo Filter

arXiv.org Machine Learning

This paper addresses the problem of filtering with a state-space model. Standard approaches for filtering assume that a probabilistic model for observations (i.e. the observation model) is given explicitly or at least parametrically. We consider a setting where this assumption is not satisfied; we assume that the knowledge of the observation model is only provided by examples of state-observation pairs. This setting is important and appears when state variables are defined as quantities that are very different from the observations. We propose Kernel Monte Carlo Filter, a novel filtering method that is focused on this setting. Our approach is based on the framework of kernel mean embeddings, which enables nonparametric posterior inference using the state-observation examples. The proposed method represents state distributions as weighted samples, propagates these samples by sampling, estimates the state posteriors by Kernel Bayes' Rule, and resamples by Kernel Herding. In particular, the sampling and resampling procedures are novel in being expressed using kernel mean embeddings, so we theoretically analyze their behaviors. We reveal the following properties, which are similar to those of corresponding procedures in particle methods: (1) the performance of sampling can degrade if the effective sample size of a weighted sample is small; (2) resampling improves the sampling performance by increasing the effective sample size. We first demonstrate these theoretical findings by synthetic experiments. Then we show the effectiveness of the proposed filter by artificial and real data experiments, which include vision-based mobile robot localization.


Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

arXiv.org Machine Learning

We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a nonparametric Bayesian approach in which the number of views and the number of feature-/subject clusters are inferred in a data-driven manner. We simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block. This makes our method applicable to datasets consisting of both numerical and categorical variables, which biomedical data typically do. Clustering solutions are based on variational inference with mean field approximation. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.


Fast and Guaranteed Tensor Decomposition via Sketching

arXiv.org Machine Learning

Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method for topic modeling and obtain competitive results.


Accelerometer based Activity Classification with Variational Inference on Sticky HDP-SLDS

arXiv.org Machine Learning

As part of daily monitoring of human activities, wearable sensors and devices are becoming increasingly popular sources of data. With the advent of smartphones equipped with acceloremeter, gyroscope and camera; it is now possible to develop activity classification platforms everyone can use conveniently. In this paper, we propose a fast inference method for an unsupervised non-parametric time series model namely variational inference for sticky HDP-SLDS(Hierarchical Dirichlet Process Switching Linear Dynamical System). We show that the proposed algorithm can differentiate various indoor activities such as sitting, walking, turning, going up/down the stairs and taking the elevator using only the acceloremeter of an Android smartphone Samsung Galaxy S4. We used the front camera of the smartphone to annotate activity types precisely. We compared the proposed method with Hidden Markov Models with Gaussian emission probabilities on a dataset of 10 subjects. We showed that the efficacy of the stickiness property. We further compared the variational inference to the Gibbs sampler on the same model and show that variational inference is faster in one order of magnitude.


Estimating Posterior Ratio for Classification: Transfer Learning from Probabilistic Perspective

arXiv.org Machine Learning

Transfer learning assumes classifiers of similar tasks share certain parameter structures. Unfortunately, modern classifiers uses sophisticated feature representations with huge parameter spaces which lead to costly transfer. Under the impression that changes from one classifier to another should be ``simple'', an efficient transfer learning criteria that only learns the ``differences'' is proposed in this paper. We train a \emph{posterior ratio} which turns out to minimizes the upper-bound of the target learning risk. The model of posterior ratio does not have to share the same parameter space with the source classifier at all so it can be easily modelled and efficiently trained. The resulting classifier therefore is obtained by simply multiplying the existing probabilistic-classifier with the learned posterior ratio.