Convergence Rates of Active Learning for Maximum Likelihood Estimation

Chaudhuri, Kamalika, Kakade, Sham, Netrapalli, Praneeth, Sanghavi, Sujay

Jun-8-2015–arXiv.org Machine Learning

In active learning, we are given a sample space X, a label space Y, a class of models that map X to Y, and a large set U of unlabelled samples. The goal of the learner is to learn a model in the class with small target error while interactively querying the labels of as few of the unlabelled samples as possible. Most theoretical work on active learning has focussed on the PAC or the agnostic PAC model, where the goal is to learn binary classifiers that belong to a particular hypothesis class [2, 13, 9, 6, 3, 4,22], andtherehasbeenonlyahandful ofexceptions[19, 8,20]. Inthispaper, weshift ourattention to a more general setting - maximum likelihood estimation (MLE), where Pr(Y X) is described by a model θ belonging to a model class Θ. We show that when data is generated by a model in this class, we can do active learning provided the model class Θ has the following simple property: the Fisher information matrix for any model θ Θ at any (x,y) depends only on x and θ. This condition is satisfied in a number of widely applicable model classes, such as Linear Regression and Generalized Linear Models (GLMs), which in turn includes models for Multiclass Classification and Conditional 1 Random Fields. Consequently, we can provide active learning algorithms for maximum likelihood estimation in all these model classes. The standard solution to active MLE estimation in the statistics literature is to select samples for label query by optimizing a class of summary statistics of the asymptotic covariance matrix of the estimator [5]. The literature, however, does not provide any guidance towards which summary statistic should be used, or any analysis of the solution quality when a finite number of labels or samples are available.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

Jun-8-2015

arXiv.org PDF

Add feedback

Country:
- North America > United States (1.00)

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (1.00)
  - Machine Learning > Learning Graphical Models
    - Directed Networks > Bayesian Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found