AITopics | high-quality label

Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing

Neural Information Processing Systems | Mar-17-2026, 02:06:45 GMT

Incentive mechanisms for crowdsourcing are designed to incentivize financially self-interested workers to generate and report high-quality labels. Existing mechanisms are often developed as one-shot static solutions, assuming a certain level of knowledge about worker models (expertise levels, costs for exerting efforts, etc.). In this paper, we propose a novel inference aided reinforcement mechanism that acquires data sequentially and requires no such prior assumptions. Specifically, we first design a Gibbs sampling augmented Bayesian inference algorithm to estimate workers' labeling strategies from the collected labels at each step. Then we propose a reinforcement incentive learning (RIL) method, building on top of the above estimates, to uncover how workers respond to different payments. RIL dynamically determines the payment without accessing any ground-truth labels. We theoretically prove that RIL is able to incentivize rational workers to provide high-quality labels both at each step and in the long run. Empirical results show that our mechanism performs consistently well under both rational and non-fully rational (adaptive learning) worker models. Besides, the payments offered by RIL are more robust and have lower variances compared to existing one-shot mechanisms.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)


Weakly Supervised Framework Considering Multi-temporal Information for Large-scale Cropland Mapping with Satellite Imagery

Wang, Yuze, Hu, Aoran, Qi, Ji, Liu, Yang, Tao, Chao

arXiv.org Artificial Intelligence | Nov-27-2024

Accurately mapping large-scale cropland is crucial for agricultural production management and planning. Currently, the combination of remote sensing data and deep learning techniques has shown outstanding performance in cropland mapping. However, those approaches require massive precise labels, which are labor-intensive. To reduce the label cost, this study presented a weakly supervised framework considering multi-temporal information for large-scale cropland mapping. Specifically, we extract high-quality labels according to their consistency among global land cover (GLC) products to construct the supervised learning signal. On the one hand, to alleviate the overfitting problem caused by the model's over-trust of remaining errors in high-quality labels, we encode the similarity/aggregation of cropland in the visual/spatial domain to construct the unsupervised learning signal, and take it as the regularization term to constrain the supervised part. On the other hand, to sufficiently leverage the plentiful information in the samples without high-quality labels, we also incorporate the unsupervised learning signal in these samples, enriching the diversity of the feature space. After that, to capture the phenological features of croplands, we introduce dense satellite image time series (SITS) to extend the proposed framework in the temporal dimension. We also visualized the high dimensional phenological features to uncover how multi-temporal information benefits cropland extraction, and assessed the method's robustness under conditions of data scarcity. The proposed framework has been experimentally validated for strong adaptability across three study areas (Hunan Province, Southeast France, and Kansas) in large-scale cropland mapping, and the internal mechanism and temporal generalizability are also investigated.

cropland, high-quality label, study area, (16 more...)

arXiv.org Artificial Intelligence

2411.18475

Country:

Asia > China > Hunan Province (0.34)
North America > United States > Kansas (0.26)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry:

Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.74)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Balancing Label Quantity and Quality for Scalable Elicitation

Mallen, Alex, Belrose, Nora

arXiv.org Artificial Intelligence | Oct-20-2024

Scalable oversight studies methods of training and evaluating AI systems in domains where human judgment is unreliable or expensive, such as scientific research and software engineering in complex codebases. Most work in this area has focused on methods of improving the quality of labels. Recent work by Burns et al. (2023) considers the complementary problem of training models with low-quality labels, finding that large pretrained models often have an inductive bias towards producing correct answers. In practice, however, neither label quantity nor quality is fixed: practitioners face a quantity-quality tradeoff. In this paper, we explore the microeconomics of the quantity-quality tradeoff on binary NLP classification tasks used in Burns et al. (2023). While sample-efficient learning has been studied extensively, little public research has focused on scalable elicitation: eliciting capabilities from pretrained models subject to labeling cost constraints. We find that this setting has novel dynamics caused by the tradeoff between label quantity and quality, as well as the model's existing latent capabilities. We observe three regimes of eliciting classification knowledge from pretrained models using supervised finetuning: quantity-dominant, quality-dominant, and a mixed regime involving the use of low- and high-quality data together to attain higher accuracy at a lower cost than using either alone. We explore sample-efficient elicitation methods that make use of two datasets of differing qualities, and establish a Pareto frontier of scalable elicitation methods that optimally trade off labeling cost and classifier performance. We find that the accuracy of supervised fine-tuning can be improved by up to 5 percentage points at a fixed labeling budget by adding a few-shot prompt to make use of the model's existing knowledge of the task.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.13215

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)


Mining Negative Temporal Contexts For False Positive Suppression In Real-Time Ultrasound Lesion Detection

Yu, Haojun, Li, Youcheng, Wu, QuanLin, Zhao, Ziwei, Chen, Dengbo, Wang, Dong, Wang, Liwei

arXiv.org Artificial Intelligence | Jul-19-2023

During ultrasonic scanning processes, real-time lesion detection can assist radiologists in accurate cancer diagnosis. However, this essential task remains challenging and underexplored. General-purpose real-time object detection models can mistakenly report obvious false positives (FPs) when applied to ultrasound videos, potentially misleading junior radiologists. One key issue is their failure to utilize negative symptoms in previous frames, denoted as negative temporal contexts (NTC) [15]. To address this issue, we propose to extract contexts from previous frames, including NTC, with the guidance of inverse optical flow. By aggregating extracted contexts, we endow the model with the ability to suppress FPs by leveraging NTC. We call the resulting model UltraDet. The proposed UltraDet demonstrates significant improvement over previous state-of-the-arts and achieves real-time inference speed.

detection, machine learning, real time system, (20 more...)

arXiv.org Artificial Intelligence

2305.1806

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > Singapore (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.88)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


HQP: A Human-Annotated Dataset for Detecting Online Propaganda

Maarouf, Abdurahman, Bär, Dominik, Geissler, Dominique, Feuerriegel, Stefan

arXiv.org Artificial Intelligence | May-1-2023

Online propaganda poses a severe threat to the integrity of societies. However, existing datasets for detecting online propaganda have a key limitation: they were annotated using weak labels that can be noisy and even incorrect. To address this limitation, our work makes the following contributions: (1) We present HQP: a novel dataset (N=30,000) for detecting online propaganda with high-quality labels. To the best of our knowledge, HQP is the first dataset for detecting online propaganda that was created through human annotation. (2) We show empirically that state-of-the-art language models fail in detecting online propaganda when trained with weak labels (AUC: 64.03). In contrast, state-of-the-art language models can accurately detect online propaganda when trained with our high-quality labels (AUC: 92.25), which is an improvement of ~44%. (3) To address the cost of labeling, we extend our work to few-shot learning. Specifically, we show that prompt-based learning using a small sample of high-quality labels can still achieve a reasonable performance (AUC: 80.27). Finally, we discuss implications for the NLP community to balance the cost and quality of labeling. Crucially, our work highlights the importance of high-quality labels for sensitive NLP tasks such as propaganda detection.

machine learning, natural language, propaganda, (18 more...)

arXiv.org Artificial Intelligence

2304.14931

Country:

Asia > Russia (1.00)
Europe > Russia (0.06)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Government > Military (1.00)
Government > Regional Government > Europe Government > Russia Government (0.94)
Government > Regional Government > Asia Government > Russia Government (0.94)
Media > News (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


The First Rule of Machine Learning: Start without Machine Learning

#artificialintelligence | Oct-23-2021, 18:23:41 GMT

Update: This made the top of Hacker News (600 points). Applying machine learning effectively is tricky. You need a robust pipeline to support your data flows. And most of all, you need high-quality labels. As a result, most of the time, my first iteration doesn't involve machine learning at all.

learning, machine learning, weak supervision, (12 more...)

#artificialintelligence

Industry: Information Technology (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing

Hu, Zehong, Liang, Yitao, Zhang, Jie, Li, Zhao, Liu, Yang

Neural Information Processing Systems | Feb-14-2020, 16:57:37 GMT

Incentive mechanisms for crowdsourcing are designed to incentivize financially self-interested workers to generate and report high-quality labels. Existing mechanisms are often developed as one-shot static solutions, assuming a certain level of knowledge about worker models (expertise levels, costs for exerting efforts, etc.). In this paper, we propose a novel inference aided reinforcement mechanism that acquires data sequentially and requires no such prior assumptions. Specifically, we first design a Gibbs sampling augmented Bayesian inference algorithm to estimate workers' labeling strategies from the collected labels at each step. Then we propose a reinforcement incentive learning (RIL) method, building on top of the above estimates, to uncover how workers respond to different payments.


Systematic Analysis of Output Agreement Games: Effects of Gaming Environment, Social Interaction, and Feedback

Huang, Shih-Wen (University of Illinois at Urbana-Champaign) | Fu, Wai-Tat (University of Illinois at Urbana-Champaign)

AAAI Conferences | Jul-21-2012

We report results from a human computation study that tests the extent to which output agreement games are better than traditional methods in terms of increasing quality of labels and motivation of voluntary workers on a task with a gold standard. We built an output agreement game that let workers recruited from Amazon's Mechanical Turks label the semantic textual similarity of 20 sentence pairs. To compare and test the effects of the major components of the game, we created interfaces that had different combinations of a gaming environment (G), social interaction (S), and feedback (F). Our results show that the main reason that an output agreement game can collect more high-quality labels is the gaming environment (scoring system, leaderboard, etc). On the other hand, a worker is much more motivated to voluntarily do the task if he or she can do it with another worker (i.e., with social interaction). Our analysis provides human computation researchers important insight on understanding how and why the method of Game with a Purpose (GWAP) can generate high-quality outcomes and motivate more voluntary workers.

gaming environment, high-quality label, interface, (14 more...)

AAAI Conferences

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.83)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.35)

