Collaborating Authors

Rethinking Noisy Label Models: Labeler-Dependent Noise with Adversarial Awareness Machine Learning

Most studies on learning from noisy labels rely on unrealistic models of i.i.d. label noise, such as class-conditional transition matrices. More recent work on instance-dependent noise models are more realistic, but assume a single generative process for label noise across the entire dataset. We propose a more principled model of label noise that generalizes instance-dependent noise to multiple labelers, based on the observation that modern datasets are typically annotated using distributed crowdsourcing methods. Under our labeler-dependent model, label noise manifests itself under two modalities: natural error of good-faith labelers, and adversarial labels provided by malicious actors. We present two adversarial attack vectors that more accurately reflect the label noise that may be encountered in real-world settings, and demonstrate that under our multimodal noisy labels model, state-of-the-art approaches for learning from noisy labels are defeated by adversarial label attacks. Finally, we propose a multi-stage, labeler-aware, model-agnostic framework that reliably filters noisy labels by leveraging knowledge about which data partitions were labeled by which labeler, and show that our proposed framework remains robust even in the presence of extreme adversarial label noise.

Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise

Neural Information Processing Systems

Modern machine learning-based approaches to computer vision require very large databases of labeled images. Some contemporary vision systems already require on the order of millions of images for training (e.g., Omron face detector). While the collection of these large databases is becoming a bottleneck, new Internet-based services that allow labelers from around the world to be easily hired and managed provide a promising solution. However, using these services to label large databases brings with it new theoretical and practical challenges: (1) The labelers may have wide ranging levels of expertise which are unknown a priori, and in some cases may be adversarial; (2) images may vary in their level of difficulty; and (3) multiple labels for the same image must be combined to provide an estimate of the actual label of the image. Probabilistic approaches provide a principled way to approach these problems. In this paper we present a probabilistic model and use it to simultaneously infer the label of each image, the expertise of each labeler, and the difficulty of each image. On both simulated and real data, we demonstrate that the model outperforms the commonly used ``Majority Vote heuristic for inferring image labels, and is robust to both adversarial and noisy labelers.

Selective Sampling of Labelers for Approximating the Crowd

AAAI Conferences

In this paper, we present CrowdSense, an algorithm for estimating the crowd’s majority opinion by querying only a subset of it. CrowdSense works in an online fashion where examples come one at a time and it dynamically samples subsets of labelers based on an exploration/exploitation criterion. The algorithm produces a weighted combination of a subset of the labelers’ votes that approximates the crowd’s opinion. We also present two probabilistic variants of CrowdSense that are based on different assumptions on the joint probability distribution between the labelers’ votes and the majority vote. Our experiments demonstrate that we can reliably approximate the entire crowd’s vote by collecting opinions from a representative subset of the crowd.

Active Learning for Crowdsourcing Using Knowledge Transfer

AAAI Conferences

This paper studies the active learning problem in crowdsourcing settings, where multiple imperfect annotators with varying levels of expertise are available for labeling the data in a given task. Annotations collected from these labelers may be noisy and unreliable, and the quality of labeled data needs to be maintained for data mining tasks. Previous solutions have attempted to estimate individual users' reliability based on existing knowledge in each task, but for this to be effective each task requires a large quantity of labeled data to provide accurate estimates. In practice, annotation budgets for a given task are limited, so each instance can be presented to only a few users, each of whom can only label a few examples. To overcome data scarcity we propose a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise. Based on this model we present a novel active learning algorithm that: a) simultaneously selects the most informative example and b) queries its label from the labeler with the best expertise. Experiments on both text and image datasets demonstrate that our proposed method outperforms other state-of-the-art active learning methods.

Deep Learning from Crowds

AAAI Conferences

Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of data in a scalable manner, but it often requires aggregating labels from multiple noisy contributors with different levels of expertise. In this paper, we address the problem of learning deep neural networks from crowds. We begin by describing an EM algorithm for jointly learning the parameters of the network and the reliabilities of the annotators. Then, a novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation. We empirically show that the proposed approach is able to internally capture the reliability and biases of different annotators and achieve new state-of-the-art results for various crowdsourced datasets across different settings, namely classification, regression and sequence labeling.