Human in the loop: How to effectively create coherent topics by manually labeling only a few documents per class

Thielmann, Anton, Weisser, Christoph, Säfken, Benjamin

Dec-19-2022, arXiv.org Artificial Intelligence

Few-shot methods for accurate modeling in sparse-label settings have improved significantly. However, applications of few-shot modeling in natural language processing remain confined to document classification. With recent performance improvements, supervised few-shot methods combined with a simple topic extraction method pose a significant challenge to unsupervised topic modeling methods. Our research shows that supervised few-shot learning, combined with a simple topic extraction method, can outperform unsupervised topic modeling techniques in generating coherent topics, even when only a few labeled documents per class are used.
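The two-stage idea in the abstract (classify with a few labels per class, then extract topic words per class) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the classifier or the extraction method, so the nearest-centroid bag-of-words classifier and the class-level TF-IDF-style scoring below are stand-in assumptions, and all function names and example documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenizer; keeps purely alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fit_centroids(labeled):
    # Stand-in few-shot classifier (assumption): one summed word-count
    # vector per class, built from the few manually labeled documents.
    centroids = defaultdict(Counter)
    for text, label in labeled:
        centroids[label].update(tokenize(text))
    return dict(centroids)


def predict(text, centroids):
    # Assign the class whose centroid is most similar to the document.
    vec = Counter(tokenize(text))
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))


def extract_topics(docs, labels, top_n=3):
    # Stand-in topic extraction (assumption): pool the documents of each
    # predicted class and rank words by a class-level TF-IDF-style score.
    class_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        class_counts[label].update(tokenize(text))
    n_classes = len(class_counts)
    class_freq = Counter()  # number of classes in which each word occurs
    for counts in class_counts.values():
        class_freq.update(counts.keys())
    topics = {}
    for label, counts in class_counts.items():
        total = sum(counts.values())
        scores = {
            w: (c / total) * math.log(1 + n_classes / class_freq[w])
            for w, c in counts.items()
        }
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        topics[label] = [w for w, _ in ranked[:top_n]]
    return topics


# Hypothetical toy data: one labeled document per class, four unlabeled.
labeled = [
    ("the team won the football match", "sports"),
    ("the bank raised interest rates", "finance"),
]
unlabeled = [
    "football match tonight",
    "interest rates fall again",
    "the match was a football classic",
    "bank rates and interest",
]

centroids = fit_centroids(labeled)
predicted = [predict(doc, centroids) for doc in unlabeled]
topics = extract_topics(unlabeled, predicted)
```

Here each predicted class becomes one topic, and its highest-scoring words serve as the topic description. In practice the classifier would more plausibly be a pretrained few-shot model over sentence embeddings; the simple word-count version is used only to keep the sketch self-contained.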

artificial intelligence, machine learning, natural language, (15 more...)

Country:
- Europe > Germany
  - Lower Saxony > Gottingen (0.04)
- Asia
  - Middle East > Jordan (0.04)
  - China (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Statistical Learning
    - Clustering (0.68)
