Online Active Model Selection for Pre-trained Classifiers

Karimi, Mohammad Reza, Gürel, Nezihe Merve, Karlaš, Bojan, Rausch, Johannes, Zhang, Ce, Krause, Andreas

Oct-21-2020–arXiv.org Machine Learning

Model selection from a set of pre-trained models is an emerging problem in machine learning with implications for several practical scenarios. Industrial examples include cases in which a telecommunication company or a flight booking company has multiple ML models trained over different sliding windows of data and hopes to pick the one that performs best on a given day. For many real-world problems, unlabeled data is abundant and can be collected inexpensively, while labels are expensive to acquire and require human expertise. Consequently, there is a need to robustly identify the best model under limited labeling resources. Similarly, one often needs reasonable predictions for the unlabeled data while keeping the labeling budget low. Depending on data availability, one can consider two settings. The pool-based setting assumes that the learner has access to a pool of unlabeled data, from which she tries to select informative samples to achieve her task. The online (streaming) setting works with a stream of data in which samples arrive one at a time, and the learner decides on the go whether to ask for a sample's label or to discard it. While offering fewer options on which data to label next, the streaming setting alleviates the scalability challenge of storing and processing a large pool of examples that arises in the pool-based setting. Another important aspect is the nature of the data: the instance/label pairs might be sampled i.i.d.
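To make the streaming setting concrete, the following is a minimal sketch of a learner that sees one sample at a time and decides on the spot whether to spend a label on it. It is not the method proposed in the paper: the disagreement-based query rule, the function name `select_model_online`, the `oracle` callable, and the threshold models are all assumptions chosen for illustration.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Model = Callable[[float], int]  # a pre-trained classifier: input -> predicted label


def select_model_online(
    models: Sequence[Model],
    stream: Iterable[float],
    oracle: Callable[[float], int],
    budget: int,
) -> Tuple[int, List[int], int]:
    """Illustrative streaming model selection (hypothetical rule, not the paper's).

    A label is requested only when the models disagree on the current sample,
    since a sample on which all models agree cannot separate them.
    """
    mistakes = [0] * len(models)
    queries = 0
    for x in stream:
        preds = [m(x) for m in models]
        if len(set(preds)) == 1:
            continue  # all models agree: discard the sample without labeling
        if queries >= budget:
            break  # labeling budget exhausted
        y = oracle(x)  # the expensive step: ask for the true label
        queries += 1
        for i, p in enumerate(preds):
            if p != y:
                mistakes[i] += 1
    # Pick the model with the fewest mistakes on the queried samples.
    best = min(range(len(models)), key=lambda i: mistakes[i])
    return best, mistakes, queries


# Toy usage: three threshold classifiers; the true label is 1 when x > 0.5.
models = [lambda x, t=t: int(x > t) for t in (0.3, 0.5, 0.7)]
stream = [0.1, 0.4, 0.6, 0.9, 0.35, 0.65, 0.55, 0.45]
best, mistakes, queries = select_model_online(
    models, stream, oracle=lambda x: int(x > 0.5), budget=4
)
print(best, mistakes, queries)
```

In this toy run, samples on which every model agrees cost no labels, so the budget is spent only where a label can actually tell the candidates apart. A pool-based learner could instead rank all samples before querying, at the cost of storing the whole pool.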

artificial intelligence, data mining, machine learning

Country:
- North America > United States
  - Wisconsin > Dane County
    - Madison (0.14)
  - Virginia > Arlington County
    - Arlington (0.04)
  - New York > New York County
    - New York City (0.04)
  - Louisiana > Orleans Parish
    - New Orleans (0.04)
  - California > San Francisco County
    - San Francisco (0.14)
- Europe > Switzerland
  - Zürich > Zürich (0.04)
- Asia > Middle East
  - Qatar > Ad-Dawhah > Doha (0.04)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Telecommunications (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.46)
  - Artificial Intelligence > Machine Learning
    - Statistical Learning (0.71)
