On the Complexity of Learning a Class Ratio from Unlabeled Data

Dec-17-2020–Journal of Artificial Intelligence Research

In the problem of learning a class ratio from unlabeled data, which we call CR learning, the training data is unlabeled, and only the ratios, or proportions, of examples receiving each label are given. The goal is to learn a hypothesis that predicts the proportions of labels on the distribution underlying the sample. This model of learning applies to a wide variety of settings, including predicting the number of votes for candidates in political elections from polls. In this paper, we formally define this model, resolve foundational questions regarding the computational complexity of CR learning, and characterize its relationship to PAC learning. Among our results, we show, perhaps surprisingly, that for finite VC classes what can be efficiently CR learned is a strict subset of what can be learned efficiently in PAC, under standard complexity assumptions. We also show that there exist classes of functions whose CR learnability is independent of ZFC, the standard set-theoretic axioms. This implies that CR learning cannot be easily characterized in the way that PAC learning is characterized by VC dimension.
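To make the setting concrete, here is a minimal sketch of learning from label proportions only. It is an illustration under stated assumptions, not the paper's algorithm: labels are binary, the hypothesis class is a small finite set of threshold functions, and the learner simply picks the hypothesis whose predicted positive ratio best matches the given ratio on each unlabeled bag. The names `predicted_ratio`, `cr_fit`, and `make_threshold`, and the toy data, are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Callable[[float], int]          # maps an example to a label in {0, 1}
Bag = Tuple[Sequence[float], float]          # (unlabeled examples, given ratio of label 1)


def predicted_ratio(h: Hypothesis, sample: Sequence[float]) -> float:
    """Fraction of examples in `sample` that hypothesis `h` labels 1."""
    return sum(h(x) for x in sample) / len(sample)


def cr_fit(hypotheses: List[Hypothesis], bags: List[Bag]) -> int:
    """Return the index of the hypothesis with the smallest total absolute
    gap between its predicted ratio and the given ratio over all bags.
    (Brute-force search; an assumption made for illustration only.)"""
    def loss(i: int) -> float:
        return sum(abs(predicted_ratio(hypotheses[i], xs) - r) for xs, r in bags)
    return min(range(len(hypotheses)), key=loss)


def make_threshold(t: float) -> Hypothesis:
    """Hypothetical hypothesis: label 1 iff x >= t."""
    return lambda x: 1 if x >= t else 0


# Toy data: no individual labels, only the proportion of 1s per bag.
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
hypotheses = [make_threshold(t) for t in thresholds]
bags = [
    ([0.1, 0.2, 0.6, 0.9], 0.5),
    ([0.3, 0.7, 0.8, 0.55], 0.75),
]

best_index = cr_fit(hypotheses, bags)
best_threshold = thresholds[best_index]
```

On this toy data the threshold 0.5 reproduces both given ratios exactly, so it is the one selected. The exhaustive search is only feasible because the class is tiny; the abstract's point is precisely that doing this efficiently can be harder than PAC learning.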

proportion, subset, VC dimension

Country:
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America
  - United States
    - Nebraska > Douglas County
      - Omaha (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California > San Francisco County
      - San Francisco (0.14)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - United Kingdom
    - Scotland > City of Edinburgh
      - Edinburgh (0.04)
    - England > Cambridgeshire
      - Cambridge (0.04)
  - Greece > Attica
    - Athens (0.04)
- Asia
  - Singapore (0.04)
  - Middle East > Israel
    - Haifa District > Haifa (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Research Report > New Finding (0.48)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Computational Learning Theory (0.91)
  - Unsupervised or Indirectly Supervised Learning (0.71)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.67)