AITopics | crowdsourced annotation

Collaborating Authors

crowdsourced annotation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Soft-Label Integration for Robust Toxicity Classification

Neural Information Processing SystemsMar-22-2026, 01:02:36 GMT

Toxicity classification in textual content remains a significant problem.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Add feedback

Soft-Label Integration for Robust Toxicity Classification

Neural Information Processing SystemsMay-27-2025, 12:21:40 GMT

Toxicity classification in textual content remains a significant problem. Therefore, there is a growing need to incorporate crowdsourced annotations for training an effective toxicity classifier. Additionally, the standard approach to training a classifier using empirical risk minimization (ERM) may fail to address the potential shifts between the training set and testing set due to exploiting spurious correlations. This work introduces a novel bi-level optimization framework that integrates crowdsourced annotations with the soft-labeling technique and optimizes the soft-label weights by Group Distributionally Robust Optimization (GroupDRO) to enhance the robustness against out-of-distribution (OOD) risk. We theoretically prove the convergence of our bi-level optimization algorithm. Experimental results demonstrate that our approach outperforms existing baseline methods in terms of both average and worst-group accuracy, confirming its effectiveness in leveraging crowdsourced annotations to achieve more effective and robust toxicity classification.

crowdsourced annotation, robust toxicity classification, soft-label integration, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.64)

Add feedback

Reviews: Semi-crowdsourced Clustering with Deep Generative Models

Neural Information Processing SystemsOct-7-2024, 08:42:28 GMT

A complex DGM is proposed that jointly models observations with crowdsourced annotations of whether or not two observations belong to the same cluster. This allows crowdsourcing non-expert annotations to help with clustering complex data. Importantly, the model is developed for the semi-supervised case, i.e., annotations are only observed for a small proportion of observation pairs. The authors propose a hierarchical VAE structure to model the observations, with a discrete latent-variable z \sim p(z \pi), a continuous latent variable x \sim p(x z), and observed data o \sim p(o x). This is paired with a two-coin David-Skene model which is conditioned on the mixture variable z for annotations: L \sim p(L z_i, z_j, \alpha, \beta), where \alpha and \beta are annotator-specific latent variables that model the "expertise" of the m_th annotator (precision and recall parameters, respectively). To the best of my understanding, through the dependence of the two-coin model on the latent mixture association, though it is not explicitly stated in the paper, z represents cluster association in the model.

annotation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Add feedback

Robust Active Learning Using Crowdsourced Annotations for Activity Recognition

Zhao, Liyue (University of Central Florida) | Sukthankar, Gita (University of Central Florida) | Sukthankar, Rahul (Carnegie Mellon University)

AAAI ConferencesAug-8-2011

Recognizing human activities from wearable sensor data is an important problem, particularly for health and eldercare applications. However, collecting sufficient labeled training data is challenging, especially since interpreting IMU traces is difficult for human annotators. Recently, crowdsourcing through services such as Amazon's Mechanical Turk has emerged as a promising alternative for annotating such data, with active learning serving as a natural method for affordably selecting an appropriate subset of instances to label. Unfortunately, since most active learning strategies are greedy methods that select the most uncertain sample, they are very sensitive to annotation errors (which corrupt a significant fraction of crowdsourced labels). This paper proposes methods for robust active learning under these conditions. Specifically, we make three contributions: 1) we obtain better initial labels by asking labelers to solve a related task; 2) we propose a new principled method for selecting instances in active learning that is more robust to annotation noise; 3) we estimate confidence scores for labels acquired from MTurk and ask workers to relabel samples that receive low scores under this metric. The proposed method is shown to significantly outperform existing techniques both under controlled noise conditions and in real active learning scenarios. The resulting method trains classifiers that are close in accuracy to those trained using ground-truth data.

artificial intelligence, machine learning, social media, (20 more...)

AAAI Conferences

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.35)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback