Learning Supervised Topic Models for Classification and Regression from Crowds

Rodrigues, Filipe, Lourenço, Mariana, Ribeiro, Bernardete, Pereira, Francisco

Aug-17-2018–arXiv.org Machine Learning

Hence, it is seldom the case that a single oracle labels an entire collection. Furthermore, the Web, through its social nature, also exploits the wisdom of crowds to annotate large collections of documents and images. By categorizing texts, tagging images, or rating products and places, Web users are generating large volumes of labeled content. However, when learning supervised models from crowds, the quality of labels can vary significantly due to task subjectivity and differences in annotator reliability (or bias) [9], [10]. In a sentiment analysis task, for example, the subjectivity of the exercise is likely to produce considerably different labels from different annotators. Similarly, online product reviews are known to vary considerably depending on the personal biases and volatility of the reviewers' opinions. It is therefore essential to account for these issues when learning from this increasingly common type of data. Hence, researchers' interest in building models that take the reliabilities of different annotators into consideration and mitigate the effect of their biases has spiked during the last few years (e.g.

annotator, machine learning, natural language
