### Bayesian Bias Mitigation for Crowdsourcing

Biased labelers are a systemic problem in crowdsourcing, and a comprehensive toolbox for handling their responses is still being developed. A typical crowdsourcing application can be divided into three steps: data collection, data curation, and learning. At present these steps are often treated separately. We present Bayesian Bias Mitigation for Crowdsourcing (BBMC), a Bayesian model to unify all three. Most data curation methods account for the {\it effects} of labeler bias by modeling all labels as coming from a single latent truth.

### Active Learning for Crowdsourcing Using Knowledge Transfer

This paper studies the active learning problem in crowdsourcing settings, where multiple imperfect annotators with varying levels of expertise are available for labeling the data in a given task. Annotations collected from these labelers may be noisy and unreliable, and the quality of labeled data needs to be maintained for data mining tasks. Previous solutions have attempted to estimate individual users' reliability based on existing knowledge in each task, but for this to be effective each task requires a large quantity of labeled data to provide accurate estimates. In practice, annotation budgets for a given task are limited, so each instance can be presented to only a few users, each of whom can only label a few examples. To overcome data scarcity we propose a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise. Based on this model we present a novel active learning algorithm that: a) simultaneously selects the most informative example and b) queries its label from the labeler with the best expertise. Experiments on both text and image datasets demonstrate that our proposed method outperforms other state-of-the-art active learning methods.

### Selective Sampling of Labelers for Approximating the Crowd

In this paper, we present CrowdSense, an algorithm for estimating the crowd’s majority opinion by querying only a subset of it. CrowdSense works in an online fashion where examples come one at a time and it dynamically samples subsets of labelers based on an exploration/exploitation criterion. The algorithm produces a weighted combination of a subset of the labelers’ votes that approximates the crowd’s opinion. We also present two probabilistic variants of CrowdSense that are based on different assumptions on the joint probability distribution between the labelers’ votes and the majority vote. Our experiments demonstrate that we can reliably approximate the entire crowd’s vote by collecting opinions from a representative subset of the crowd.

### An Interactive Multi-Label Consensus Labeling Model for Multiple Labeler Judgments

Multi-label classification is crucial to several practical applications including document categorization, video tagging, targeted advertising etc. Training a multi-label classifier requires a large amount of labeled data which is often unavailable or scarce. Labeled data is then acquired by consulting multiple labelers---both human and machine. Inspired by ensemble methods, our premise is that labels inferred with high consensus among labelers, might be closer to the ground truth. We propose strategies based on interaction and active learning to obtain higher quality labels that potentially lead to greater consensus. We propose a novel formulation that aims to collectively optimize the cost of labeling, labeler reliability, label-label correlation and inter-labeler consensus. Evaluation on data labeled by multiple labelers (both human and machine) shows that our consensus output is closer to the ground truth when compared to the "majority" baseline. We present illustrative cases where it even improves over the existing ground truth. We also present active learning strategies to leverage our consensus model in interactive learning settings. Experiments on several real-world datasets (publicly available) demonstrate the efficacy of our approach in achieving promising classification results with fewer labeled data.

### Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise

Modern machine learning-based approaches to computer vision require very large databases of labeled images. Some contemporary vision systems already require on the order of millions of images for training (e.g., Omron face detector). While the collection of these large databases is becoming a bottleneck, new Internet-based services that allow labelers from around the world to be easily hired and managed provide a promising solution. However, using these services to label large databases brings with it new theoretical and practical challenges: (1) The labelers may have wide ranging levels of expertise which are unknown a priori, and in some cases may be adversarial; (2) images may vary in their level of difficulty; and (3) multiple labels for the same image must be combined to provide an estimate of the actual label of the image. Probabilistic approaches provide a principled way to approach these problems. In this paper we present a probabilistic model and use it to simultaneously infer the label of each image, the expertise of each labeler, and the difficulty of each image. On both simulated and real data, we demonstrate that the model outperforms the commonly used Majority Vote heuristic for inferring image labels, and is robust to both adversarial and noisy labelers.