 Education


Proposal of Grade Training Method in Private Crowdsourcing System

AAAI Conferences

Current crowdsourcing platforms such as Amazon Mechanical Turk provide an attractive solution for processing high-volume tasks at low cost. However, quality control remains a major concern. We developed a private crowdsourcing system (PCSS) running on an intranet, which allows us to devise quality control methods. In the present work, we designed a novel task allocation method to improve the accuracy of task results in PCSS. PCSS analyzed relations between tasks from workers' behavior using a Bayesian network, then created learning tasks according to the analyzed relations. PCSS increased the quality of task results by allocating learning tasks to workers before they processed difficult tasks. PCSS automatically created 8 learning tasks for 2 target task categories and increased the accuracy of task results by 10.77 points on average. We found that creating learning tasks according to analyzed relations is a practical method to improve the quality of workers.


Tropel: Crowdsourcing Detectors with Minimal Training

AAAI Conferences

This paper introduces the Tropel system which enables non-technical users to create arbitrary visual detectors without first annotating a training set. Our primary contribution is a crowd active learning pipeline that is seeded with only a single positive example and an unlabeled set of training images. We examine the crowd's ability to train visual detectors given severely limited training themselves. This paper presents a series of experiments that reveal the relationship between worker training, worker consensus and the average precision of detectors trained by crowd-in-the-loop active learning. In order to verify the efficacy of our system, we train detectors for bird species that work nearly as well as those trained on the exhaustively labeled CUB 200 dataset at significantly lower cost and with little effort from the end user. To further illustrate the usefulness of our pipeline, we demonstrate qualitative results on unlabeled datasets containing fashion images and street-level photographs of Paris.


Crowdlines: Supporting Synthesis of Diverse Information Sources through Crowdsourced Outlines

AAAI Conferences

Learning about a new area of knowledge is challenging for novices partly because they are not yet aware of which topics are most important. The Internet contains a wealth of information for learning the underlying structure of a domain, but relevant sources often have diverse structures and emphases, making it hard to discern what is widely considered essential knowledge vs. what is idiosyncratic. Crowdsourcing offers a potential solution because humans are skilled at evaluating high-level structure, but most crowd micro-tasks provide limited context and time. To address these challenges, we present Crowdlines, a system that uses crowdsourcing to help people synthesize diverse online information. Crowdworkers make connections across sources to produce a rich outline that surfaces diverse perspectives within important topics. We evaluate Crowdlines with two experiments. The first experiment shows that a high context, low structure interface helps crowdworkers perform faster, higher quality synthesis, while the second experiment shows that a tournament-style (parallelized) crowd workflow produces faster, higher quality, more diverse outlines than a linear (serial/iterative) workflow.  


From "In" to "Over": Behavioral Experiments on Whole-Network Computation

AAAI Conferences

We report on a series of behavioral experiments in human computation on three different tasks over networks: graph coloring, community detection (or graph clustering), and competitive contagion. While these tasks share similar action spaces and interfaces, they capture a diversity of computational challenges: graph coloring is a search problem, clustering is an optimization problem, and competitive contagion is a game-theoretic problem. In contrast with most of the prior literature on human-subject experiments in networks, in which collectives of subjects are embedded "in" the network, and have only local information and interactions, here individual subjects have a global (or "over") view and must solve "whole network" problems alone. Our primary findings are that subject performance is impressive across all three problem types; that subjects find diverse and novel strategies for solving each task; and that collective performance can often be strongly correlated with known algorithms.


ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

arXiv.org Machine Learning

Stochastic gradient algorithms have been the main focus of large-scale learning problems and have led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of noise in the stochastic estimates of the gradients. In this paper, we propose a new adaptive learning rate algorithm, which utilizes curvature information for automatically tuning the learning rates. The information about the element-wise curvature of the loss function is estimated from the local statistics of the stochastic first-order gradients. We further propose a new variance reduction technique to speed up the convergence. In our preliminary experiments with deep neural networks, we obtained better performance compared to the popular stochastic gradient algorithms.
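As a rough illustration of estimating element-wise curvature from first-order gradients, the sketch below applies a plain secant (finite-difference) estimate to a deterministic toy quadratic. It is not the ADASECANT algorithm itself: the names `grad`, `secant_step`, and `minimize`, the fallback learning rate, and the toy loss are all assumptions made for this example, and neither gradient noise nor the variance reduction technique is modeled.

```python
def grad(theta):
    """Gradient of the toy loss 0.5 * sum(a_i * theta_i**2) (assumed for illustration)."""
    a = [1.0, 10.0]
    return [a_i * t for a_i, t in zip(a, theta)]


def secant_step(theta, prev_theta, g, prev_g, fallback_lr=0.01, eps=1e-12):
    """One update with a per-coordinate learning rate of 1 / |curvature|.

    The curvature of each coordinate is estimated by the secant ratio
    (g - prev_g) / (theta - prev_theta).
    """
    new_theta = []
    for t, pt, gi, pgi in zip(theta, prev_theta, g, prev_g):
        d_theta = t - pt
        d_grad = gi - pgi
        if abs(d_theta) > eps and abs(d_grad) > eps:
            lr = abs(d_theta / d_grad)  # inverse of the secant curvature estimate
        else:
            lr = fallback_lr  # no usable curvature information for this coordinate
        new_theta.append(t - lr * gi)
    return new_theta


def minimize(theta0, steps=5, first_lr=0.01):
    """Take one fixed-rate step to obtain two iterates, then use secant steps."""
    prev_theta = list(theta0)
    prev_g = grad(prev_theta)
    theta = [t - first_lr * gi for t, gi in zip(prev_theta, prev_g)]
    for _ in range(steps):
        g = grad(theta)
        theta, prev_theta, prev_g = secant_step(theta, prev_theta, g, prev_g), theta, g
    return theta


result = minimize([1.0, 1.0])
```

On a quadratic the secant ratio recovers the exact per-coordinate curvature, so each coordinate gets its own step size even though the two curvatures differ by a factor of ten. With noisy gradients the raw ratio would be unreliable, which is the setting the abstract addresses.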


A Unified Framework for Representation-based Subspace Clustering of Out-of-sample and Large-scale Data

arXiv.org Machine Learning

Under the framework of spectral clustering, the key to subspace clustering is building a similarity graph which describes the neighborhood relations among data points. Some recent works build the graph using sparse, low-rank, and $\ell_2$-norm-based representation, and have achieved state-of-the-art performance. However, these methods suffer from the following two limitations. First, the time complexities of these methods are at least proportional to the cube of the data size, which makes them inefficient for solving large-scale problems. Second, they cannot cope with out-of-sample data that are not used to construct the similarity graph. To cluster each out-of-sample datum, the methods have to recalculate the similarity graph and the cluster membership of the whole data set. In this paper, we propose a unified framework which makes representation-based subspace clustering algorithms feasible for clustering both out-of-sample and large-scale data. Under our framework, the large-scale problem is tackled by converting it into an out-of-sample problem in the manner of "sampling, clustering, coding, and classifying". Furthermore, we give an estimation for the error bounds by treating each subspace as a point in a hyperspace. Extensive experimental results on various benchmark data sets show that our methods outperform several recently proposed scalable methods in clustering large-scale data sets.


Multimodal Task-Driven Dictionary Learning for Image Classification

arXiv.org Machine Learning

Dictionary learning algorithms have been successfully used for both reconstructive and discriminative tasks, where an input signal is represented with a sparse linear combination of dictionary atoms. While these methods are mostly developed for single-modality scenarios, recent studies have demonstrated the advantages of feature-level fusion based on the joint sparse representation of the multimodal inputs. In this paper, we propose a multimodal task-driven dictionary learning algorithm under the joint sparsity constraint (prior) to enforce collaborations among multiple homogeneous/heterogeneous sources of information. In this task-driven formulation, the multimodal dictionaries are learned simultaneously with their corresponding classifiers. The resulting multimodal dictionaries can generate discriminative latent features (sparse codes) from the data that are optimized for a given task such as binary or multiclass classification. Moreover, we present an extension of the proposed formulation using a mixed joint and independent sparsity prior which facilitates more flexible fusion of the modalities at feature level. The efficacy of the proposed algorithms for multimodal classification is illustrated on four different applications -- multimodal face recognition, multi-view face recognition, multi-view action recognition, and multimodal biometric recognition. It is also shown that, compared to the counterpart reconstructive-based dictionary learning algorithms, the task-driven formulations are more computationally efficient in the sense that they can be equipped with more compact dictionaries and still achieve superior performance.


Spectral Convergence Rate of Graph Laplacian

arXiv.org Machine Learning

Laplacian eigenvectors of the graph constructed from a data set are used in many spectral manifold learning algorithms such as diffusion maps and spectral clustering. Given a graph constructed from a random sample of a $d$-dimensional compact submanifold $M$ in $\mathbb{R}^D$, we establish the spectral convergence rate of the graph Laplacian. It implies the consistency of the spectral clustering algorithm via a standard perturbation argument. A simple numerical study indicates the necessity of a denoising step before applying spectral algorithms.

1. Introduction

High-dimensional data appears naturally in real-world applications. A common assumption is that the data resides on a low-dimensional manifold.
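To make the object of study concrete, the following sketch builds an unnormalized graph Laplacian L = D - W from a random sample of the unit circle, a 1-dimensional compact submanifold of the plane. The Gaussian kernel, the bandwidth value, the unnormalized form, and all function names are assumptions chosen for illustration; the paper may use a different graph construction or normalization.

```python
import math
import random


def sample_circle(n, seed=0):
    """Draw n points uniformly at random from the unit circle."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((math.cos(angle), math.sin(angle)))
    return pts


def graph_laplacian(points, bandwidth=0.5):
    """Return (W, L): Gaussian-kernel weights and the Laplacian L = D - W."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
            weights[i][j] = weights[j][i] = w
    laplacian = [[-weights[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        laplacian[i][i] = sum(weights[i])  # degree of node i on the diagonal
    return weights, laplacian


def quadratic_form(laplacian, x):
    """Compute x^T L x, which measures how much x varies across graph edges."""
    n = len(x)
    return sum(x[i] * laplacian[i][j] * x[j] for i in range(n) for j in range(n))


points = sample_circle(30)
W, L = graph_laplacian(points)
```

Each row of `L` sums to zero, so the constant vector is an eigenvector with eigenvalue 0. Spectral methods such as spectral clustering work with the remaining low eigenvectors, which this standard-library sketch does not compute.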


Online Learning with Gaussian Payoffs and Side Observations

arXiv.org Machine Learning

We consider a sequential learning problem with Gaussian payoffs and side information: after selecting an action $i$, the learner receives information about the payoff of every action $j$ in the form of Gaussian observations whose mean is the same as the mean payoff, but the variance depends on the pair $(i,j)$ (and may be infinite). The setup allows a more refined information transfer from one action to another than previous partial monitoring setups, including the recently introduced graph-structured feedback case. For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors).
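The feedback model in the abstract can be simulated in a few lines. The sketch below is only an illustration of the observation structure, not one of the paper's algorithms: the matrix `sigma` of standard deviations, the example payoffs, the fixed play schedule, and the inverse-variance weighted estimator are all assumptions introduced here, and every finite entry of `sigma` is assumed to be strictly positive.

```python
import math
import random


def observe(action, means, sigma, rng):
    """Return one observation per action j after playing `action`.

    sigma[action][j] is the standard deviation of the observation of action j.
    math.inf means action j is not observed at all (returned as None).
    """
    obs = []
    for j, mu in enumerate(means):
        s = sigma[action][j]
        obs.append(None if math.isinf(s) else rng.gauss(mu, s))
    return obs


class WeightedMeanEstimator:
    """Inverse-variance weighted estimate of each action's mean payoff."""

    def __init__(self, n_actions):
        self.weight = [0.0] * n_actions
        self.weighted_sum = [0.0] * n_actions

    def update(self, action, obs, sigma):
        for j, y in enumerate(obs):
            if y is None:
                continue  # infinite variance carries no information
            w = 1.0 / sigma[action][j] ** 2
            self.weight[j] += w
            self.weighted_sum[j] += w * y

    def estimate(self, j):
        if self.weight[j] == 0.0:
            return None
        return self.weighted_sum[j] / self.weight[j]


means = [0.2, 0.8]                      # hypothetical mean payoffs
sigma = [[1.0, 2.0], [math.inf, 0.5]]   # hypothetical noise levels for each pair (i, j)
rng = random.Random(0)
estimator = WeightedMeanEstimator(len(means))
for t in range(4000):
    action = t % 2                      # fixed alternating schedule, not a learning policy
    estimator.update(action, observe(action, means, sigma, rng), sigma)
```

Here playing action 0 yields noisy information about both actions, while playing action 1 reveals nothing about action 0. This asymmetry is the kind of pair-dependent information transfer the setup describes.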


Efficient Learning by Directed Acyclic Graph For Resource Constrained Prediction

arXiv.org Machine Learning

We study the problem of reducing test-time acquisition costs in classification systems. Our goal is to learn decision rules that adaptively select sensors for each example as necessary to make a confident prediction. We model our system as a directed acyclic graph (DAG) where internal nodes correspond to sensor subsets and decision functions at each node choose whether to acquire a new sensor or classify using the available measurements. This problem can be naturally posed as an empirical risk minimization over training data. Rather than jointly optimizing such a highly coupled and non-convex problem over all decision nodes, we propose an efficient algorithm motivated by dynamic programming. We learn node policies in the DAG by reducing the global objective to a series of cost-sensitive learning problems. Our approach is computationally efficient and has proven guarantees of convergence to the optimal system for a fixed architecture. In addition, we present an extension to map other budgeted learning problems with a large number of sensors to our DAG architecture and demonstrate empirical performance exceeding state-of-the-art algorithms for data composed of both few and many sensors.
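The dynamic-programming intuition behind the DAG of sensor subsets can be sketched for a single example with known costs. This is a simplified illustration rather than the paper's learning algorithm: the sensor names, acquisition costs, and per-subset classification losses are hypothetical, and the policy is computed by backward induction from these given numbers instead of being learned through cost-sensitive classification.

```python
from functools import lru_cache

SENSOR_COST = {"a": 0.1, "b": 0.3}  # hypothetical acquisition costs

# Hypothetical misclassification loss for one example when classifying
# with each subset of acquired sensors (one entry per DAG node).
CLASSIFY_LOSS = {
    frozenset(): 1.0,
    frozenset({"a"}): 0.15,
    frozenset({"b"}): 0.2,
    frozenset({"a", "b"}): 0.0,
}


@lru_cache(maxsize=None)
def cost_to_go(acquired):
    """Return (minimal total cost, best action) from the node `acquired`.

    At each node the choice is between classifying now and paying for one
    more sensor, then continuing optimally from the resulting child node.
    """
    best = (CLASSIFY_LOSS[acquired], "classify")
    for sensor, cost in SENSOR_COST.items():
        if sensor in acquired:
            continue
        future_cost, _ = cost_to_go(acquired | {sensor})
        candidate = (cost + future_cost, "acquire " + sensor)
        if candidate[0] < best[0]:
            best = candidate
    return best


root_cost, root_action = cost_to_go(frozenset())
```

With these numbers the cheapest plan from the empty node is to acquire sensor `a` and then classify, since paying for `b` as well costs more than the loss it removes. Memoizing on the acquired subset means each DAG node is solved once, however many paths lead to it.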