Reviews: Semi-crowdsourced Clustering with Deep Generative Models

Oct-7-2024, 08:42:28 GMT–Neural Information Processing Systems

A complex deep generative model (DGM) is proposed that jointly models observations and crowdsourced annotations of whether or not two observations belong to the same cluster. This allows non-expert crowdsourced annotations to help with clustering complex data. Importantly, the model is developed for the semi-supervised case, i.e., annotations are observed for only a small proportion of observation pairs. The authors propose a hierarchical VAE structure to model the observations, with a discrete latent variable z \sim p(z | \pi), a continuous latent variable x \sim p(x | z), and observed data o \sim p(o | x). This is paired with a two-coin Dawid-Skene model for the annotations, conditioned on the mixture variable z: L \sim p(L | z_i, z_j, \alpha, \beta), where \alpha and \beta are annotator-specific latent variables that model the "expertise" of the m-th annotator (precision and recall parameters, respectively). Although the paper does not state it explicitly, my understanding is that z represents cluster assignment in the model, since the two-coin model depends on the latent mixture components z_i and z_j of the two observations in a pair.
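
To make the generative story above concrete, here is a minimal ancestral-sampling sketch of the model as I read it. It is not the authors' implementation. The function name `sample_semi_crowdsourced`, the one-dimensional Gaussian for p(x | z), the `tanh` stand-in for the decoder p(o | x), and the `pair_fraction` parameter are all assumptions made for illustration. I also assume the usual two-coin reading in which \alpha_m is the probability that annotator m correctly labels a same-cluster pair and \beta_m is the probability of correctly labeling a different-cluster pair.

```python
import math
import random


def sample_semi_crowdsourced(n_items, pi, cluster_means, alpha, beta,
                             pair_fraction=0.1, seed=0):
    """Ancestral sampling sketch: z -> x -> o, plus pairwise annotations L.

    All distributional choices here are illustrative assumptions, not the
    paper's actual parameterization.
    """
    rng = random.Random(seed)
    n_clusters = len(pi)

    # z_i ~ p(z | pi): discrete mixture (cluster) assignment.
    z = rng.choices(range(n_clusters), weights=pi, k=n_items)

    # x_i ~ p(x | z_i): assumed unit-variance Gaussian around a cluster mean.
    x = [rng.gauss(cluster_means[zi], 1.0) for zi in z]

    # o_i ~ p(o | x_i): assumed toy decoder (tanh) with small Gaussian noise,
    # standing in for the neural-network decoder of a VAE.
    o = [math.tanh(xi) + rng.gauss(0.0, 0.1) for xi in x]

    # L_ij^(m) ~ p(L | z_i, z_j, alpha_m, beta_m): two-coin annotation model.
    # Only a random fraction of pairs is annotated (semi-supervised setting).
    labels = {}
    for i in range(n_items):
        for j in range(i + 1, n_items):
            if rng.random() >= pair_fraction:
                continue  # this pair receives no annotation
            same_cluster = z[i] == z[j]
            for m in range(len(alpha)):
                if same_cluster:
                    # First coin: say "same" (1) with probability alpha_m.
                    label = 1 if rng.random() < alpha[m] else 0
                else:
                    # Second coin: say "different" (0) with probability beta_m.
                    label = 0 if rng.random() < beta[m] else 1
                labels[(i, j, m)] = label
    return z, x, o, labels


if __name__ == "__main__":
    z, x, o, labels = sample_semi_crowdsourced(
        n_items=10, pi=[0.5, 0.5], cluster_means=[-2.0, 2.0],
        alpha=[0.9, 0.6], beta=[0.8, 0.7], pair_fraction=0.3, seed=42)
    print(len(labels), "annotations over", len(z), "items")
```

With `alpha` and `beta` set to 1.0 every annotator is a perfect oracle and each label equals the indicator z_i = z_j, which is the sense in which the annotations carry information about z as cluster assignment.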


