Online Active Model Selection for Pre-trained Classifiers

Karimi, Mohammad Reza, Gürel, Nezihe Merve, Karlaš, Bojan, Rausch, Johannes, Zhang, Ce, Krause, Andreas

arXiv.org Machine Learning 

Model selection from a set of pre-trained models is an emerging problem in machine learning with implications for several practical scenarios. Industrial examples include cases in which a telecommunication company or a flight booking company has multiple ML models trained over different sliding windows of data and hopes to pick the one that performs best on a given day. For many real-world problems, unlabeled data is abundant and can be collected inexpensively, while labels are expensive to acquire and require human expertise. Consequently, there is a need to robustly identify the best model under limited labeling resources. Similarly, one often needs reasonable predictions for the unlabeled data while keeping the labeling budget low. Depending on data availability, one can consider two settings. The pool-based setting assumes that the learner has access to a pool of unlabeled data and tries to select informative samples from the pool to achieve her task. The online (streaming) setting works with a stream of data in which samples arrive one at a time, and the learner decides on the fly whether to request the label of each sample or to discard it. While offering fewer options on which data to label next, the streaming setting alleviates the scalability challenge of storing and processing a large pool of examples that arises in the pool-based setting. Another important aspect is the nature of the data: the instance/label pairs might be sampled i.i.d.
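To make the streaming protocol concrete, the following is a minimal sketch of the interaction loop described above: samples arrive one at a time, the learner either requests a label (spending one unit of budget) or discards the sample, and at the end it reports the model with the fewest observed mistakes. The query rule used here (request a label only when the models disagree) is a simple stand-in heuristic chosen for illustration, not the method proposed in the paper. The names `select_best_model`, `models`, `stream`, and `budget` are hypothetical.

```python
def select_best_model(models, stream, budget):
    """Streaming model selection with a naive disagreement-based query rule.

    models: list of callables mapping an input x to a predicted label
    stream: iterable of (x, y) pairs; y is only read when a label is requested
    budget: maximum number of labels the learner may request

    Returns (index of the model with the fewest observed mistakes, labels used).
    NOTE: this query rule is an illustrative assumption, not the paper's algorithm.
    """
    mistakes = [0] * len(models)
    queries = 0
    for x, y in stream:
        preds = [m(x) for m in models]
        # If all models agree, the label cannot help rank them: discard the sample.
        if len(set(preds)) == 1 or queries >= budget:
            continue
        # Otherwise request the label and update each model's mistake count.
        queries += 1
        for i, p in enumerate(preds):
            if p != y:
                mistakes[i] += 1
    # Ties are broken in favor of the lowest index.
    best = min(range(len(models)), key=lambda i: mistakes[i])
    return best, queries


if __name__ == "__main__":
    # Toy example: three threshold classifiers; the true label is x > 0.
    toy_models = [lambda x: x > 0, lambda x: x > 5, lambda x: True]
    toy_stream = [(x, x > 0) for x in range(-5, 10)]
    print(select_best_model(toy_models, toy_stream, budget=100))
```

Because samples on which every model agrees are discarded, the label budget is spent only where a label can change the relative ranking of the models. A practical method would also need to handle noisy labels and non-i.i.d. streams, which this sketch ignores.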
